Tag: dataframe

python pandas dataframe : fill nans with a conditional mean of previous and next value

I have the following dataframe: I want each NaN value to be filled with the conditional mean of the previous and next values in the same column. For example, the value 6 is the mean of 5 and 7. This is only a small part of my dataframe, so I need to replace all the NaNs. Answer EDIT: For replace
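The excerpt does not show the dataframe or state the condition, so the following is only a minimal sketch of the unconditional case: each NaN is replaced by the mean of the nearest valid value before it and the nearest valid value after it. The column name `value` and the sample numbers are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original dataframe is not shown in the excerpt.
df = pd.DataFrame({"value": [5.0, np.nan, 7.0, np.nan, np.nan, 13.0]})

# Nearest valid value before and after each position.
prev_val = df["value"].ffill()
next_val = df["value"].bfill()

# Fill only the NaN entries with the mean of those two neighbours.
df["filled"] = df["value"].fillna((prev_val + next_val) / 2)
print(df)
```

With this approach a run of consecutive NaNs all receive the same mean, and a NaN at the very start or end of the column stays NaN because it has only one neighbour.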

Concat string in column values where it is missing in Python

dataframe pandas python string

I have a dataframe, and I want to add the string chr to the values in column CHROM where it’s missing. I can do it in R with grepl and paste, but wanted to try it in Python. I came up with these two commands, but I am not sure how to index the column because pd.Series is generating NaNs. Answer String operations in pandas are not optimized,
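One possible approach, sketched here with made-up data, is a plain list comprehension over the column. Only the column name `CHROM` and the string `chr` come from the question; the sample values are assumptions, and `chr` is assumed to be wanted as a prefix.

```python
import pandas as pd

# Hypothetical sample data; only the column name CHROM comes from the question.
df = pd.DataFrame({"CHROM": ["chr1", "2", "chrX", "Y"]})

# Add the "chr" prefix only where the value does not already start with it.
df["CHROM"] = [
    c if c.startswith("chr") else "chr" + c
    for c in df["CHROM"].astype(str)
]
print(df)
```

Assigning a plain list back to the column avoids the index-alignment step that can produce NaNs when a new `pd.Series` with a different index is assigned.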

Pandas Join Two Dataframes According to Range and Date

dataframe pandas python

I have two dataframes like this: I want to bring the RATE values into the second df according to the DATE. In addition, the AMOUNT and DAY values for the relevant DATE must fall within the appropriate range (MAX_AMOUNT & MIN_AMOUNT, MAX_DAY & MIN_DAY). Desired output like this: Could you please help me with this? Answer Use merge first with
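A rough sketch of the merge-then-filter idea follows. The column names are taken from the question, but the dataframe names (`rates`, `deals`) and all values are invented for illustration, and the ranges are assumed to be inclusive.

```python
import pandas as pd

# Hypothetical rate table: one row per DATE and amount/day range.
rates = pd.DataFrame({
    "DATE": ["2021-01-01", "2021-01-01", "2021-01-02"],
    "MIN_AMOUNT": [0, 1000, 0],
    "MAX_AMOUNT": [999, 5000, 5000],
    "MIN_DAY": [1, 1, 1],
    "MAX_DAY": [30, 30, 30],
    "RATE": [0.10, 0.12, 0.15],
})

# Hypothetical second dataframe that should receive the RATE column.
deals = pd.DataFrame({
    "DATE": ["2021-01-01", "2021-01-02"],
    "AMOUNT": [1500, 200],
    "DAY": [10, 5],
})

# Step 1: merge on DATE, which pairs each deal with every range for that date.
merged = deals.merge(rates, on="DATE", how="left")

# Step 2: keep only the rows whose AMOUNT and DAY fall inside the range.
in_range = (
    merged["AMOUNT"].between(merged["MIN_AMOUNT"], merged["MAX_AMOUNT"])
    & merged["DAY"].between(merged["MIN_DAY"], merged["MAX_DAY"])
)
result = merged.loc[in_range, ["DATE", "AMOUNT", "DAY", "RATE"]].reset_index(drop=True)
print(result)
```

Rows that match no range are dropped by the filter, so a final left merge back onto the second dataframe would be needed if unmatched rows must be kept.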

replace whitespace with comma in multiline string (doc string), but keeping end-of-line

dataframe pandas python whitespace

I have a multiline string (and not a text file) like this: The whitespace between columns is uneven. I want to replace the whitespace with a comma, but keep the end-of-line. So the result would look like this: …or alternatively as a pandas dataframe. What I have tried: I can use replace(”) with different spaces, but need to count the
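As a minimal sketch, a regular expression that matches runs of spaces and tabs (but not newlines) can do the replacement, and `pd.read_csv` with a whitespace separator can parse the same string directly. The sample string below is hypothetical, since the original is not shown.

```python
import io
import re

import pandas as pd

# Hypothetical multiline string with uneven column spacing.
text = """a   b     c
1  22   333
4     5 6"""

# Replace each run of spaces/tabs with one comma; newlines are left untouched.
csv_text = re.sub(r"[ \t]+", ",", text)
print(csv_text)

# Alternatively, parse the original string straight into a dataframe.
df = pd.read_csv(io.StringIO(text), sep=r"\s+")
print(df)
```

If lines can start or end with whitespace, the regex version would produce leading or trailing commas, so stripping each line first may be necessary.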

How to check if a row of a Pandas dataframe has a cell with a specific value and if it does modify the last cell?

dataframe pandas python

I have a dataframe df:

name     age_5_9          age_10_14        age_15_19
Alice    no bones broken  no bones broken  broke 1 bone
Bob      no bones broken  broke 2 bones    no bones broken
Charles  no bones broken  no bones broken  no bones broken

I would like to create a column broke_a_bone that is 1 when any of the rows has a value ‘broke 1
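The excerpt is cut off before the exact condition is stated, so the sketch below assumes that any cell starting with "broke" in one of the age columns should set `broke_a_bone` to 1. The data is taken from the table in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charles"],
    "age_5_9": ["no bones broken", "no bones broken", "no bones broken"],
    "age_10_14": ["no bones broken", "broke 2 bones", "no bones broken"],
    "age_15_19": ["broke 1 bone", "no bones broken", "no bones broken"],
})

# Columns to inspect: everything whose name starts with "age_".
age_cols = [c for c in df.columns if c.startswith("age_")]

# Assumed condition: a cell counts if its text starts with "broke".
has_break = df[age_cols].apply(lambda col: col.str.startswith("broke"))

# 1 if any age column in the row matches, else 0.
df["broke_a_bone"] = has_break.any(axis=1).astype(int)
print(df)
```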

Vectorization assign the newest value based on datetime

dataframe pandas python vectorization

I have two dataframes. The first dataframe has only one column, email, and is a complete list of emails. The second dataframe has three columns: email, subscribe_or_unsubscribe, date. The second dataframe is a history of users subscribing to or unsubscribing from the email system. The second dataframe is sorted by date with the oldest date at index
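One vectorized way to attach the newest status to each email, sketched under assumptions: keep the last history row per email after sorting by date, then left-merge onto the full email list. The column names come from the question; the dataframe names and sample rows are made up.

```python
import pandas as pd

# Hypothetical complete list of emails.
emails = pd.DataFrame({"email": ["a@x.com", "b@x.com", "c@x.com"]})

# Hypothetical subscription history, oldest first.
history = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "a@x.com"],
    "subscribe_or_unsubscribe": ["subscribe", "subscribe", "unsubscribe"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
})

# Keep only the most recent history row for each email.
latest = history.sort_values("date").drop_duplicates("email", keep="last")

# Attach the latest status to every email; emails with no history get NaN.
result = emails.merge(latest, on="email", how="left")
print(result)
```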

efficient way to find the most recent entry in another dataframe for each entry of a dataframe indexed by datetime in pandas

dataframe pandas python

I have two dataframes, and both of them are indexed by datetime. For example, dataframe 1 looks like this: and dataframe 2 looks like: For each entry in dataframe 1, I want to find the most recent entry in dataframe 2, and create a new column in dataframe 1 to set up the relationship between the two dataframes.
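`pd.merge_asof` is one pandas function designed for this kind of "most recent earlier row" lookup. The sketch below uses invented data and column names (`a`, `b`, `df2_time`), since the original dataframes are not shown.

```python
import pandas as pd

# Hypothetical dataframes, both indexed by datetime.
df1 = pd.DataFrame(
    {"a": [1, 2, 3]},
    index=pd.to_datetime(["2021-01-01 10:00", "2021-01-01 11:00", "2021-01-01 12:00"]),
)
df2 = pd.DataFrame(
    {"b": ["x", "y"]},
    index=pd.to_datetime(["2021-01-01 09:30", "2021-01-01 11:30"]),
)

# Keep df2's timestamp as a column so the matched row can be identified.
df2 = df2.assign(df2_time=df2.index)

# For each row of df1, take the latest df2 row at or before its timestamp.
result = pd.merge_asof(
    df1.sort_index(),
    df2.sort_index(),
    left_index=True,
    right_index=True,
    direction="backward",
)
print(result)
```

Both inputs must be sorted on the join key, which is why `sort_index()` is applied before the call.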

Pandas convert dummies to a new column

dataframe pandas pandas-merge python

I have a dataframe that discretizes the customers into different Q’s, which looks like: What I want to do is add a new column, Q, to the dataframe which shows which sector each customer is in, so it looks like: The only way I can think of is using a for loop, but that would be messy. Any other
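When each row has exactly one dummy column set to 1, `idxmax(axis=1)` returns the name of that column without a loop. This is a minimal sketch; the column names `customer`, `Q1`, `Q2`, `Q3` are assumptions, as the original dataframe is not shown.

```python
import pandas as pd

# Hypothetical dummy-encoded dataframe: exactly one Q column is 1 per row.
df = pd.DataFrame({
    "customer": ["A", "B", "C"],
    "Q1": [1, 0, 0],
    "Q2": [0, 0, 1],
    "Q3": [0, 1, 0],
})

q_cols = ["Q1", "Q2", "Q3"]

# For each row, take the label of the column holding the maximum (the 1).
df["Q"] = df[q_cols].idxmax(axis=1)
print(df)
```

A row with all zeros would be labelled with the first column, so such rows need separate handling if they can occur.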

Merge Dataframe rows based on the date

dataframe pandas python

I have a dataframe that looks like this. It has the name of the company, the date and the title of a headline that was published regarding that company on that day. There are multiple headlines published on a single day, and every one of those headlines takes up a different row even for the same date. What I
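The excerpt ends before the desired output is described. Assuming the goal is one row per company and date with the headlines joined together, a sketch could look like this; the column names (`company`, `date`, `title`), the separator and the sample rows are all hypothetical.

```python
import pandas as pd

# Hypothetical data: several headlines per company and date.
df = pd.DataFrame({
    "company": ["Acme", "Acme", "Acme", "Globex"],
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-01"],
    "title": ["Headline one", "Headline two", "Headline three", "Headline four"],
})

# Collapse rows sharing company and date, joining their titles with " | ".
merged = (
    df.groupby(["company", "date"])["title"]
    .agg(" | ".join)
    .reset_index()
)
print(merged)
```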

Groupby several columns, summing them up based on the presence of a sub-string

dataframe pandas python

Context: I’m trying to sum all values based on a list, only if they start with or contain a string. So with a config file like this: And a dataframe like this: How can I group them if they all start with a given substring present in the granularity_suffix_list? Desired output: Attempts: I was trying this: But it doesn’t work.