Tag: pandas

Pandas: I want to make a new column based on a series

I just want to add a column that copies the value of tmp according to the serial number in c2 and maps it to c1. Expected result: The c1 sequence and the c2 sequence have the same length. Longer sequence for reproduction: Answer: Use Series.map with DataFrame.drop_duplicates, because c2 has duplicates: Details: So…
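A minimal sketch of the Series.map plus DataFrame.drop_duplicates idea named in the answer. The original frame is not preserved in the excerpt, so the data below and the output column name `new` are assumptions; only c1, c2 and tmp come from the question.

```python
import pandas as pd

# Hypothetical data: the frame from the question is not shown in the excerpt.
df = pd.DataFrame({
    "c1": [2, 1, 3, 1],
    "c2": [1, 1, 2, 3],
    "tmp": ["a", "b", "c", "d"],
})

# c2 has duplicates, so keep the first row per c2 value;
# Series.map needs a lookup with a unique index.
lookup = df.drop_duplicates("c2").set_index("c2")["tmp"]

# For every c1 value, fetch the tmp stored under the same c2 value.
df["new"] = df["c1"].map(lookup)
print(df)
```

Keeping the first duplicate is the default of drop_duplicates; passing keep="last" would pick the other row, and which one is wanted depends on the data.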

SUM specific column values that have integers where row meets a condition and place in new row

numpy pandas python

I wish to SUM the first two column values that have numbers where row == ‘BB’ and place the result in a new row below. Data: Desired: Doing: I am taking the sum of row 2 and columns 1 and 2. Any suggestion is appreciated. Answer: You can use a rolling window to calculate the sum of a pair of columns, then concat the re…

transform a list of datetimes in a pandas column to list of strings

pandas python

I have the following pandas dataframe: I would like to transform each element of the lists in the time column to a string with the format ‘%Y/%m/%d %H:%M:%S’. I know I can do this: to yield the value ‘2021/10/20 14:29:51’, but I do not know how to do this operation for every string eleme…
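One way this could look, assuming each cell of the time column holds a Python list of datetime objects. The sample values and the output column name `time_str` are made up for illustration; the format string is the one from the question.

```python
from datetime import datetime

import pandas as pd

# Hypothetical frame: each cell of "time" is a list of datetime objects.
df = pd.DataFrame({
    "time": [
        [datetime(2021, 10, 20, 14, 29, 51), datetime(2021, 10, 20, 14, 46, 8)],
        [datetime(2021, 10, 21, 9, 0, 0)],
    ]
})

FMT = "%Y/%m/%d %H:%M:%S"

# Format every element of every list; the list structure is preserved.
df["time_str"] = df["time"].apply(lambda stamps: [t.strftime(FMT) for t in stamps])
print(df["time_str"].tolist())
```

If the lists hold strings rather than datetime objects, each element would first need parsing (for example with pd.to_datetime) before strftime can be called.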

How to find the index of certain lists in a list of lists?

csv numpy pandas python

I have 50 folders, each containing a file with the same name, Data_220_beta_0.1_47.0_53.0ND.csv, but with different contents. I am skipping certain folders, which are listed in list I. Now, when the code scans all the remaining folders, it looks for values which are different and X = [x for x in X if min(x) != max(x)] contains …
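The excerpt is cut off, but going by the title, a sketch of how the positions of the surviving lists could be recorded alongside the filter from the question. The contents of X are invented here; only the min(x) != max(x) condition comes from the post.

```python
# Hypothetical stand-in for the values read from the folders.
X = [[1.0, 1.0, 1.0], [0.2, 0.5, 0.9], [3.0, 3.0], [4.0, 4.1]]

# Apply the same condition as in the question, but keep each list's
# position in the original X next to it.
kept = [(i, x) for i, x in enumerate(X) if min(x) != max(x)]

indices = [i for i, _ in kept]   # positions in the original X
values = [x for _, x in kept]    # same result as the original filter
print(indices, values)
```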

python pandas dataframe : fill nans with a conditional mean of previous and next value

dataframe mean nan pandas python

I have the following dataframe: I want each NaN value to be filled with the conditional mean of the previous and next values in the same column. Just like this, the value 6 is the mean of 5 and 7. This is only a small part of my dataframe, so I need to replace all the NaNs. Answer: EDIT: For replace
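A rough sketch of the "mean of previous and next value" fill, not necessarily the approach of the truncated answer. The frame is invented, and whatever condition the question attaches to the mean is not visible in the excerpt, so none is applied here.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the original data is not shown in the excerpt.
df = pd.DataFrame({
    "a": [5.0, np.nan, 7.0, 8.0],
    "b": [1.0, 2.0, np.nan, 4.0],
})

# ffill carries the previous valid value forward, bfill the next one backward;
# their average is the mean of the two neighbours of each gap.
neighbour_mean = (df.ffill() + df.bfill()) / 2

# Only the NaN cells are replaced; existing values are left untouched.
filled = df.fillna(neighbour_mean)
print(filled)
```

For a run of consecutive NaNs every cell in the run gets the mean of the nearest valid values on either side, and a NaN at the very start or end of a column stays NaN because one neighbour is missing.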

How do I plot a histogram with one of the columns along the x axis?

jupyter pandas python

I have a DataFrame quantities as follows: How do I make a histogram out of it with date along the x axis and total quantity for the date on the y axis? I tried: But the resulting histogram didn’t make any sense to me. Edit. Long day. I got it all wrong. Per comments, it looks like what I need is a
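A sketch of the aggregation step behind "total quantity for the date", assuming the frame has columns named `date` and `quantity` (the real column names and data are not shown in the excerpt).

```python
import pandas as pd

# Hypothetical data; column names are assumptions.
quantities = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03"]
    ),
    "quantity": [3, 4, 5, 1],
})

# One total per date: the index becomes the x axis, the sums the y axis.
totals = quantities.groupby("date")["quantity"].sum()
print(totals)

# One possible way to draw it (needs matplotlib installed):
# totals.plot.bar()
```

A histogram proper would instead count how often each quantity value occurs, which is why it does not put dates on the x axis.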

Create Pandas DataFrame column which joins column names for any non na values

pandas python

How do I create a new column which joins the column names for any non-NA values on a per-row basis? Please note the duplicate index. Code: Example DF: Desired output is a new column which joins the column names for non-NA values, as per the col_names example below. Answer: Try with dot
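The answer's code is cut off after "Try with dot", so the following is only a guess at the idiom it refers to: multiplying the boolean notna mask with the column names via DataFrame.dot. The data and the "," separator are assumptions; col_names is the column name mentioned in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a duplicate index, as the question mentions one.
df = pd.DataFrame(
    {"a": [1, np.nan, 3], "b": [np.nan, np.nan, 6], "c": [7, 8, np.nan]},
    index=[0, 0, 1],
)

# True * "a," gives "a," and False * "a," gives "", so the dot product
# concatenates the names of the non-NA columns in each row.
joined = df.notna().dot(df.columns + ",").str.rstrip(",")

df["col_names"] = joined
print(df)
```

Because the right-hand side of dot is the column index rather than a Series, nothing is aligned on the row labels, so the duplicate index is not a problem here.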

Add one day for certain time

datetime pandas python time

I want to update my date column for certain times because some of the dates are not correct. For some reason, entries with a time between 00:00 and 7:30 were keyed in with the previous day's date. For example: Which is supposed to be like this: I know I can update all of the dates with this code. But I have no idea
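A sketch of shifting only the affected rows, assuming a single datetime column named `date` that carries both date and time, and treating both ends of the 00:00 to 7:30 window as inclusive. The sample rows are invented.

```python
import pandas as pd

# Hypothetical data: one datetime column holding date and time together.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-10-01 23:50:00",
        "2021-10-01 00:15:00",
        "2021-10-01 07:30:00",
        "2021-10-01 07:31:00",
    ])
})

# Time elapsed since midnight for each row.
time_of_day = df["date"] - df["date"].dt.normalize()

# Rows keyed in between 00:00 and 07:30 (inclusive, by assumption).
early = time_of_day <= pd.Timedelta(hours=7, minutes=30)

# Add one day to those rows only; the rest are left as they are.
df.loc[early, "date"] += pd.Timedelta(days=1)
print(df)
```

If date and time live in separate columns, the mask would be built from the time column instead and applied to the date column in the same way.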

Pandas multiple comparison on a single row

pandas python

My source data looks like this: I need to compare the second column with the third, the fourth with the fifth, and the sixth with the seventh. Column names can change, so I have to consider the column positions; my first column always has the column name id. So if at least one of the comparisons (‘1_src1’ vs …
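A sketch of comparing column pairs by position rather than by name. What should happen when a comparison fails is cut off in the excerpt, so this only produces a boolean flag; the data, every column name except ‘1_src1’ and id, and the flag name `any_mismatch` are assumptions.

```python
import pandas as pd

# Hypothetical data: id first, then three (left, right) column pairs.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "1_src1": [10, 20, 30], "1_src2": [10, 21, 30],
    "2_src1": ["x", "y", "z"], "2_src2": ["x", "y", "z"],
    "3_src1": [1.5, 2.5, 3.5], "3_src2": [1.5, 2.5, 3.0],
})

# Positions 1, 3, 5 are the left side of each pair, positions 2, 4, 6 the right.
left = df.iloc[:, 1::2].to_numpy()
right = df.iloc[:, 2::2].to_numpy()

# True where at least one of the pairs differs in that row.
df["any_mismatch"] = (left != right).any(axis=1)
print(df[["id", "any_mismatch"]])
```

Two NaN values compare as different with this approach, so missing data would need separate handling.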

pandas to_excel converts _x10e6 _ to ღ. How do I avoid this?

databricks delta-lake export-to-excel pandas python

I have been trying to create an Excel file with several sheets from Delta tables; however, some of my column names include _x10e6 _ which is apparently translated to ღ. I have tried to use encoding='unicode_escape' and encoding='utf-8' without luck. I cannot use xlsxwriter because I am …