Tag: pandas

Percentage append next to value counts in Dataframe

append for-loop pandas percentage python

I’m trying to create an Excel file with value counts and percentages. I’m almost finished, but when I run my for loop, the percentage is added as a new df.to_frame with two more columns, and I only want one. This is how it looks in Excel: I don’t want the blue square to appear in the Excel file or the df and…
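A minimal sketch of one way to keep counts and percentages side by side in a single frame. The question's own data and loop are not shown in this excerpt, so the series `s`, its values, and the column names `count` and `percent` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input; the question's real column is not shown in the excerpt.
s = pd.Series(["a", "b", "a", "c", "a", "b"], name="category")

# Count each distinct value once, then derive the percentage from those counts.
counts = s.value_counts()
summary = pd.DataFrame({
    "count": counts,
    "percent": (counts / counts.sum() * 100).round(2),
})

print(summary)
```

Because both columns are built from the same `counts` series, they share one index and land in one frame, which could then be written out with `summary.to_excel(...)` if an Excel engine is installed.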

Python Pandas – Update column values based on two lists

pandas python

Suppose I have a table like this: Company_Name, Product: (A, Apple), (B, Orange), (C, Pear), (D, Lemon). Given two lists, list1 = ['Pear', 'Lemon', 'Apple', 'Orange'] and list2 = [1, 2, 3, 4], how can I replace each Product name with its numerical value? The output should look like this…
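One possible approach, sketched with the table and lists from the excerpt: zip the two lists into a dictionary and apply it with `Series.map`. The variable name `mapping` is an illustrative choice, and this assumes every product appears in `list1`.

```python
import pandas as pd

df = pd.DataFrame({
    "Company_Name": ["A", "B", "C", "D"],
    "Product": ["Apple", "Orange", "Pear", "Lemon"],
})
list1 = ["Pear", "Lemon", "Apple", "Orange"]
list2 = [1, 2, 3, 4]

# Pair each product name with the number at the same position.
mapping = dict(zip(list1, list2))

# Products missing from the mapping would become NaN with map().
df["Product"] = df["Product"].map(mapping)

print(df)
```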

Pandas apply condition on a column that contains list

pandas python

I want to create a new column based on a condition that has to be applied to a list. Here’s a reproducible example: As one can see, each object in the BRAND column is a list that can contain one or more elements (the list can also be empty, as in the row where ID = 1). Now, given the

Filter and apply multiple conditions between multiple rows

pandas python

I have the following dataframe: What I’m trying to do is implement the following logic: If there’s only one record (considering the combination of location + method), I’ll do nothing. That’s the scenario for the first and last rows. If there’s more than one record (location …

Using Pandas df.loc

dataframe pandas pandas-loc python

I have a DataFrame that pandas reads from a CSV file. I am trying to use df.loc to add a new column, but only insert values into it when values from another column, called “SKU”, end with “-RF” and “-NEW”. The code I was working on is below. I…
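A hedged sketch of the `df.loc` pattern described above. The asker's code is not shown in this excerpt, so the sample SKUs, the new column name `Condition`, and the value `"flagged"` are hypothetical. It also assumes a row qualifies when its SKU ends with either suffix.

```python
import pandas as pd

# Hypothetical data standing in for the CSV contents.
df = pd.DataFrame({"SKU": ["1001-RF", "1002", "1003-NEW", "1004-OLD"]})

# Boolean mask: True where the SKU ends with one of the two suffixes.
mask = df["SKU"].str.endswith("-RF") | df["SKU"].str.endswith("-NEW")

# Assigning through .loc creates the column; unmatched rows are left as NaN.
df.loc[mask, "Condition"] = "flagged"

print(df)
```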

Substitute numbers in a list of type object pandas

dataframe pandas python

I have a dataframe df looking as follows: What I would like to do is to substitute into df['cited_ids'] 0 whenever the corresponding id has d=0 (i) and replace d=1 if there is at least one 0 in the list of df['cited_ids'] and the previous d was not 0 (ii). In other words, the first ste…

Applying function to Column AttributeError: ‘int’ object has no attribute

attributeerror pandas python

I have a pandas data frame that consists of special/vanity numbers. I would like to add a column to classify each number based on its pattern using regex. I have written a function that iterates through the column MNM_MOBILE_NUMBER, identifies the pattern of each number using regex, and then creates a new column…
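The asker's function and the full error message are not shown in this excerpt. One common cause of this kind of AttributeError is a column of integers being handed to string or regex operations, so the sketch below converts each value to text first. The regex, the labels, the output column `PATTERN`, and the sample numbers are all hypothetical.

```python
import re

import pandas as pd

# Hypothetical numbers stored as integers, as the error message suggests.
df = pd.DataFrame({"MNM_MOBILE_NUMBER": [5551111, 5551234, 5557777]})


def classify(number):
    """Label a number by a made-up pattern: the same digit four times at the end."""
    text = str(number)  # integers have no string methods, so convert first
    if re.search(r"(\d)\1{3}$", text):
        return "repeated_ending"
    return "other"


df["PATTERN"] = df["MNM_MOBILE_NUMBER"].apply(classify)

print(df)
```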

Pandas take number out string

dataframe pandas python

In my data, I have this column “price_range”. Dummy dataset: I am using pandas. What is the most efficient way to get the upper and lower bound of the price range in separate columns? Answer Alternatively, you can parse the string accordingly (if you want the limits for each row, rather than the to…
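The dummy dataset is not included in this excerpt, so the sketch below assumes each `price_range` value is a string such as `"10-20"`, with two whole numbers separated by non-digit characters. The output column names `lower` and `upper` are illustrative.

```python
import pandas as pd

# Assumed format: "<lower>-<upper>"; the real dataset is not shown.
df = pd.DataFrame({"price_range": ["10-20", "35-50", "100-250"]})

# Capture the first and second runs of digits in each string.
bounds = df["price_range"].str.extract(r"(\d+)\D+(\d+)").astype(float)

df["lower"] = bounds[0]
df["upper"] = bounds[1]

print(df)
```

Casting to float rather than int keeps the code working when a row fails to match and the extracted values come back as NaN.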

Python, comparing dataframe rows, adding new column – Truth Value Error?

dataframe if-statement pandas python valueerror

I am quite new to Python, and somewhat stuck here. I just want to compare floats with a previous or forward row in a dataframe, and mark a new column accordingly. FWIW, I have 6000 rows to compare. I need to output a string or int as the result. My Python code: I get the error: ValueError: The truth value
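This kind of ValueError typically appears when a whole Series is used inside a plain `if` statement. A minimal sketch of a vectorized alternative, assuming a single float column: compare each row with the previous one using `shift` and pick a label with `numpy.where`. The column names `value` and `trend` and the labels are hypothetical, since the asker's code is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical floats; the question mentions about 6000 rows.
df = pd.DataFrame({"value": [1.0, 2.5, 2.0, 2.0, 3.1]})

# Value from the previous row (NaN for the first row).
prev = df["value"].shift(1)

# Element-wise comparison avoids evaluating a whole Series as True or False.
df["trend"] = np.where(
    df["value"] > prev,
    "up",
    np.where(df["value"] < prev, "down", "flat"),
)

print(df)
```

The first row has no previous value, so both comparisons are False and it falls through to `"flat"`. Using `shift(-1)` instead would compare against the following row.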

Identify pairs of events then calculate the time elapsed between events

dataframe datetime pandas python

I have a dataframe with messages sent and received. I want to calculate the time it took for someone to reply to the message. The method I thought of using was identifying pairs, so if sent=A and received=B, then there should be another entry with sent=B and received=A. Then once I identify the pairs, I ca…
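A rough sketch of the pairing idea described above, using a self-merge in which the sender and receiver are swapped. The `sent` and `received` columns come from the excerpt; the timestamp column `time`, the sample rows, and the choice to keep the earliest later reply are assumptions.

```python
import pandas as pd

# Hypothetical messages; only "sent" and "received" are named in the question.
msgs = pd.DataFrame({
    "sent": ["A", "B", "C", "A"],
    "received": ["B", "A", "A", "C"],
    "time": pd.to_datetime([
        "2021-01-01 10:00",
        "2021-01-01 10:05",
        "2021-01-01 11:00",
        "2021-01-01 11:30",
    ]),
})

# Match each message with messages going in the opposite direction.
pairs = msgs.merge(
    msgs,
    left_on=["sent", "received"],
    right_on=["received", "sent"],
    suffixes=("", "_reply"),
)

# Keep only candidate replies that happened after the original message.
pairs = pairs[pairs["time_reply"] > pairs["time"]].copy()
pairs["elapsed"] = pairs["time_reply"] - pairs["time"]

# If several later replies exist, keep the shortest gap per original message.
reply_times = pairs.groupby(["sent", "received", "time"], as_index=False)["elapsed"].min()

print(reply_times)
```

A self-merge grows quickly when the same two people exchange many messages, so for large data a sorted approach such as `pandas.merge_asof` may be worth considering instead.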