I’m looking to concatenate two columns in a data frame and, where there are duplicates, append an integer at the end. The wrinkle here is that I will keep receiving feeds of data, and the increment needs to be aware of historical values that were generated and not reuse them. I’ve been trying …
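One possible way to keep the suffix aware of earlier feeds is to carry a per-value counter from batch to batch. The following is a minimal sketch, not the asker's code: the columns `a` and `b`, the helper `add_unique_keys`, the `_` separator, and the `-N` suffix format are all assumptions made for illustration, since the excerpt does not show the real data.

```python
import pandas as pd


def add_unique_keys(batch, seen_counts):
    """Return a copy of `batch` with a `key` column that is unique across feeds.

    `seen_counts` (hypothetical) maps each concatenated value to how many
    times it has been issued so far; it is updated in place.
    """
    # Concatenate the two (hypothetical) columns into a base key.
    base = batch["a"].astype(str) + "_" + batch["b"].astype(str)
    # How many times each base key was already used in earlier feeds.
    offset = base.map(lambda k: seen_counts.get(k, 0))
    # Position of each row among duplicates inside this batch, shifted by history.
    seq = base.groupby(base).cumcount() + offset
    out = batch.copy()
    # Assumed format: first occurrence has no suffix, later ones get "-N".
    out["key"] = [k if n == 0 else f"{k}-{n}" for k, n in zip(base, seq)]
    # Record what this batch consumed so the next feed continues from here.
    for k, n in base.value_counts().items():
        seen_counts[k] = seen_counts.get(k, 0) + int(n)
    return out


history = {}
out1 = add_unique_keys(pd.DataFrame({"a": ["x", "x", "y"], "b": [1, 1, 2]}), history)
out2 = add_unique_keys(pd.DataFrame({"a": ["x", "y"], "b": [1, 2]}), history)
print(out1["key"].tolist())
print(out2["key"].tolist())
```

In practice the `history` dictionary would need to be persisted between feeds (for example in a file or database), which this sketch leaves out.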
Tag: pandas
Pandas union with parent ids in the same dataframe
I have a pandas dataframe that looks like this: I want to return the folder structure for each line in a separate column. For example, line 1: A is the root, there is no parent, all ids are 0 = A; line 2: B is under A, id = 1, so the path is A/B; line 3: C is under
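The original dataframe is not shown in the excerpt, so the following is only a sketch under assumed data: hypothetical columns `id`, `parent_id` (0 meaning "no parent"), and `name`, with C placed under B purely for illustration. It walks up the parent chain for each row and joins the names into a path, assuming the hierarchy contains no cycles.

```python
import pandas as pd

# Hypothetical data: the real column layout is not visible in the excerpt.
df = pd.DataFrame(
    {"id": [1, 2, 3], "parent_id": [0, 1, 2], "name": ["A", "B", "C"]}
)

names = dict(zip(df["id"], df["name"]))
parents = dict(zip(df["id"], df["parent_id"]))


def build_path(node_id):
    """Follow parent ids up to the root and return the path root-first."""
    parts = []
    # Stop as soon as the id is unknown (0 is used here for "no parent").
    while node_id in names:
        parts.append(names[node_id])
        node_id = parents[node_id]
    return "/".join(reversed(parts))


df["path"] = df["id"].map(build_path)
print(df)
```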
Iterate through a dictionary and update dataframe values
I have a dictionary and a df column that contains country codes (“BHR”, “SAU”, “ARE”, etc.). How can I update the dataframe so that, where a row matches any of the dict keys, a new [“TIMEZONE”] column is set to the corresponding dict value? I also want to add an if statement so that if the row is not equal to …
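A common way to do this kind of lookup without an explicit loop is `Series.map` with the dictionary. This is a minimal sketch: the column name `COUNTRY` and the timezone values are assumptions, and because the excerpt cuts off before describing the fallback for unmatched rows, the sketch simply leaves them as missing values.

```python
import pandas as pd

# Hypothetical mapping and column name; the real ones are not in the excerpt.
tz_by_country = {"BHR": "Asia/Bahrain", "SAU": "Asia/Riyadh", "ARE": "Asia/Dubai"}
df = pd.DataFrame({"COUNTRY": ["BHR", "SAU", "USA", "ARE"]})

# map() looks each code up in the dict; codes without a key become NaN.
df["TIMEZONE"] = df["COUNTRY"].map(tz_by_country)
print(df)
```

Whatever default the truncated condition calls for could then be applied to the missing entries, for example with `fillna`.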
Python/Pandas time series correlation on values vs differences
I am familiar with the Pandas Series corr function to compute the correlation between two Series, for example: This will compute the correlation in the VALUES of the two series, but if I’m working with a Time Series, I might want to compute the correlation on changes (absolute changes or percentage changes …
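To make the distinction concrete, here is a small sketch with made-up numbers: `Series.corr` on the raw values, on absolute changes via `Series.diff`, and on percentage changes via `Series.pct_change`. The series `s1` and `s2` are illustrative only.

```python
import pandas as pd

# Illustrative data, not taken from the question.
s1 = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0])
s2 = pd.Series([50.0, 50.5, 51.5, 52.0, 54.0])

# Correlation of the levels (what corr does when applied directly).
corr_levels = s1.corr(s2)

# Correlation of absolute changes; the leading NaN from diff() is
# dropped automatically because corr() ignores incomplete pairs.
corr_abs_changes = s1.diff().corr(s2.diff())

# Correlation of percentage changes.
corr_pct_changes = s1.pct_change().corr(s2.pct_change())

print(corr_levels, corr_abs_changes, corr_pct_changes)
```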
Create Pandas DF by searching for multiple record values across multiple columns
I am trying to create a new dataframe that can pull rows based on multiple terms across multiple columns. I have a huge Excel file (65k rows) that I am pulling into a df so that I can pull out new priority reports. As an example, this is what I am using to search for multiple terms across 1 column
Filter column value from other columns’ values and turn the results into multiple lists Pandas
The goal is: for columns 1990-1993, if value == 1, return Country into 4 lists. I also want to give each list the name of its year and don’t know how to do that. Here is my try: I got the output as 4 lists of NaNs… The desired output would be Answer One way using dict comprehension with groupby
Converting a named aggregate prior to pandas/python3
For the below: What would be the proper way to do this before named aggregates were introduced, including the aliasing of columns? Answer As mentioned by Henry Yik, use .agg() followed by .rename(). For example:
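The answer's own code is not preserved in this excerpt. As a hedged illustration of the `.agg()` plus `.rename()` pattern it describes, the sketch below uses a made-up dataframe with hypothetical columns `group`, `price`, and `qty`, and compares the older dict-based form against the named-aggregation form.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(
    {"group": ["a", "a", "b"], "price": [1.0, 3.0, 5.0], "qty": [2, 4, 6]}
)

# Named aggregation (available from pandas 0.25 onward).
named = df.groupby("group").agg(
    avg_price=("price", "mean"),
    total_qty=("qty", "sum"),
)

# Older style: aggregate with a dict, then alias the columns with rename().
legacy = (
    df.groupby("group")
    .agg({"price": "mean", "qty": "sum"})
    .rename(columns={"price": "avg_price", "qty": "total_qty"})
)

print(legacy)
```

Note that the dict form allows only one output column per source column unless a list of functions is passed, in which case the result has MultiIndex columns that need flattening before renaming.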
Python dataframe vectorizing for loop
I would like to vectorize this piece of Python code with a for loop conditioned on the current state, for speed and efficiency. Values for df_B are computed based on the current state (state) AND the corresponding df_A value. Any ideas would be appreciated. Answer This seems overkill. Your state variable basically is the pr…
pandas countif negative using where()
Below is the code and output, what I’m trying to get is shown in the “exp” column, as you can see the “countif” column just counts 5 columns, but I want it to only count negative values. So for example: index 0, df1[0] should equal 2 What am I doing wrong? Python Output Answer Fi…
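The question's code and the full answer are cut off here, so the following is only a sketch of two common ways to count negative values per row, using a made-up dataframe. The first sums a boolean mask; the second uses `where()`, as in the question title, and relies on `count()` skipping the NaN values that `where()` leaves behind.

```python
import pandas as pd

# Illustrative data, not the asker's dataframe.
df = pd.DataFrame({"a": [-1, 2, -3], "b": [-4, 5, 6], "c": [7, 8, -9]})

# Option 1: True counts as 1, so summing the mask counts negatives per row.
neg_by_mask = (df < 0).sum(axis=1)

# Option 2: where() keeps negatives and replaces the rest with NaN;
# count() then counts only the non-missing entries in each row.
neg_by_where = df.where(df < 0).count(axis=1)

print(neg_by_mask.tolist(), neg_by_where.tolist())
```

A plain `count()` without masking counts every non-missing cell, which would explain a result equal to the number of columns.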
Sort categorical x-axis in a seaborn scatter plot
I am trying to plot the top 30 percent of values in a data frame using a seaborn scatter plot as shown below. The reproducible code for the same plot: Here, I want to sort the x-axis in order = [‘virginica’, ‘setosa’, ‘versicolor’]. When I tried to use order as one of the parameter…