Assume we have quite a few .xls or .xlsx files stored in a directory, and there are two ways of feeding them into pd.concat to get one big table: yield vs. append. Judging by the %%timeit magic, both perform about the same (tested on 100 xls/xlsx files). If there’s a difference between these two, which on…
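The two approaches the question contrasts can be sketched roughly as follows. The function names and the `reader` parameter are hypothetical, introduced only for illustration; `pd.read_excel` is assumed as the loader.

```python
import pandas as pd


def iter_frames(paths, reader=pd.read_excel):
    """Generator approach: yield one DataFrame per file."""
    for path in paths:
        yield reader(path)


def list_frames(paths, reader=pd.read_excel):
    """Append approach: collect every DataFrame in a list first."""
    frames = []
    for path in paths:
        frames.append(reader(path))
    return frames


def concat_both_ways(paths, reader=pd.read_excel):
    """Build the combined table with each approach and return both results."""
    paths = list(paths)
    via_yield = pd.concat(iter_frames(paths, reader), ignore_index=True)
    via_append = pd.concat(list_frames(paths, reader), ignore_index=True)
    return via_yield, via_append
```

In the pandas versions I am aware of, `pd.concat` collects its input into a list before concatenating, which would be consistent with the near-identical timings reported, since reading the files likely dominates either way.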
Tag: pandas
Turn multiple columns into two new columns in a dataframe using Pandas
I am working in a Python pandas environment :D Currently, I have a dataframe that looks like this: My goal is to make the dataframe look like this: Basically, I want the last 9 column titles and values to become their own rows in 2 new columns while keeping the first 8 columns and rows the same. I
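One common way to do this kind of reshaping is `DataFrame.melt`. The sketch below uses a small hypothetical frame (2 identifier columns instead of 8, 3 wide columns instead of 9), since the original data is not shown; the column names are made up.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["a", "b"],
    "q1": [10, 20],
    "q2": [30, 40],
    "q3": [50, 60],
})

id_cols = list(df.columns[:2])     # columns to keep unchanged
value_cols = list(df.columns[2:])  # columns to turn into rows

long_df = df.melt(
    id_vars=id_cols,
    value_vars=value_cols,
    var_name="variable",  # new column holding the old column titles
    value_name="value",   # new column holding the old cell values
)
```

For the frame described in the question, the slices would presumably be `df.columns[:8]` and `df.columns[8:]`.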
How to print the current row number when using .apply on DataFrame
I’ve seen this question for R, but not for Python. Basically, I have a large DataFrame where I apply a function row-wise. It takes a very long time to run and I hoped to put a print statement to show where I am. I put together an example of what I would like to do. I know an alternative, but
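One possible approach, assuming the DataFrame has a default integer index: with `axis=1`, each row is passed to the function as a Series whose `.name` is that row's index label. The frame and the `slow_step` function below are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def slow_step(row):
    # row.name is the index label of the row currently being processed
    print(f"processing row {row.name}")
    return row["a"] + row["b"]


result = df.apply(slow_step, axis=1)
```

If the index is not a simple range, the label printed will be whatever the index holds, so resetting the index first may be needed to get a position.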
How to extract elements from a filename and move them to different columns?
I have filenames that I converted into a list. The list has the following elements: My goal is to extract elements from this list and fill out a dataframe, which should look like this: Link to the Google Sheets containing the image above: https://docs.google.com/spreadsheets/d/1kuX3M4RFCNWtNoE7Hm1ejxWMwF-C…
How to use Excel’s SUMIF function in Pandas
I am having difficulty calculating “total_sum.” If someone didn’t apply to a subject, I marked it N/A. When total_sum is calculated, it refers to the Standard field, and N/A is excluded. I’m not good at Python, so I don’t know how to calculate “total_sum”. Answer Suppos…
Finding a specific string in a Dataframe column
I’m trying to retrieve one row of data from my DataFrame created from a csv file accessed via URL. I’m using… df1['Statistic Element'].str.contains('Mean rainfall') …to determine the row containing the data I require; however, Python does not recognize the .str el…
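For reference, a minimal sketch of selecting a row with `str.contains` used as a boolean mask. Only the column name and the search text come from the question; the data is hypothetical, and because the excerpt is cut off, this does not attempt to diagnose the error itself.

```python
import pandas as pd

# Hypothetical data standing in for the CSV loaded from a URL
df1 = pd.DataFrame({
    "Statistic Element": ["Mean rainfall (mm)", "Highest temperature", None],
    "Annual": [800.5, 41.2, 0.0],
})

# na=False treats missing values as non-matches;
# regex=False searches for the literal text
mask = df1["Statistic Element"].str.contains("Mean rainfall", na=False, regex=False)
row = df1[mask]
```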
How to use pandas to create a column that stores count of first occurrences on a group-by?
Q1. Given data frame 1, I am trying to get a group-by count of unique new occurrences and another column that gives the existing ID count per month. Expected output for unique newly added group-by ID values and for the existing sum of ID values. Note: Mar-2020 ID_Count is zero because IDs 1, 2, and 3 were present in previou…
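A rough sketch of one way to count new versus already-seen IDs per month. The data and the column names (`Month`, `ID`, `new_ids`, `existing_ids`) are hypothetical, and the sketch assumes rows are already in chronological order with at most one row per ID per month.

```python
import pandas as pd

# Hypothetical data, already in chronological order
df = pd.DataFrame({
    "Month": ["Jan-2020", "Jan-2020", "Feb-2020", "Feb-2020", "Mar-2020", "Mar-2020"],
    "ID": [1, 2, 2, 3, 1, 3],
})

# 1 on the first row where each ID appears, 0 on later appearances
df["is_new"] = (~df["ID"].duplicated()).astype(int)

# sort=False keeps the months in their order of appearance
summary = df.groupby("Month", sort=False)["is_new"].agg(new_ids="sum", rows="count")
summary["existing_ids"] = summary["rows"] - summary["new_ids"]
```

Here the last month gets a `new_ids` count of zero because every ID in it appeared in an earlier month, which matches the note in the question.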
How do you identify which IDs have an increasing value over time in another column in a Python dataframe?
Let’s say I have a data frame with 3 columns: ID column contains the ID of a particular person. Value column contains the value of their transaction. Date column contains the date of their transaction. Is there a way in Python to identify ID 1 as the ID with the increasing value of transactions over time? I…
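One way this could be approached, sketched on hypothetical data with the three columns the question describes: sort by date, group by ID, and check whether each group's values are monotonic.

```python
import pandas as pd

# Hypothetical transactions: ID 1 rises over time, ID 2 does not
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "Value": [10, 20, 30, 30, 10, 20],
    "Date": pd.to_datetime([
        "2021-01-01", "2021-02-01", "2021-03-01",
        "2021-01-01", "2021-02-01", "2021-03-01",
    ]),
})

# is_monotonic_increasing also accepts equal consecutive values;
# a strict check would need something like s.diff().dropna().gt(0).all()
is_increasing = (
    df.sort_values("Date")
      .groupby("ID")["Value"]
      .apply(lambda s: s.is_monotonic_increasing)
)
increasing_ids = is_increasing[is_increasing].index.tolist()
```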
Groupby names replace values with their max value in all columns pandas
I have a DataFrame that looks like this: I want all values replaced with the maximum value, where the maximum is chosen from both val1 and val2. If I do this, I get the maximum from val1 only. Answer Try using pd.wide_to_long to melt that dataframe into a long form, then use groupby with transform t…
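The answer is cut off, so the following is only a guess at the general shape of that approach, not the answerer's actual code. It assumes a hypothetical frame with a `name` column plus `val1` and `val2`.

```python
import pandas as pd

# Hypothetical data; the original frame is not shown
df = pd.DataFrame({
    "name": ["a", "a", "b", "b"],
    "val1": [1, 5, 2, 3],
    "val2": [7, 2, 1, 9],
})

# wide_to_long needs an id that is unique per original row,
# so the row index is turned into a column first
long_df = pd.wide_to_long(
    df.reset_index(), stubnames="val", i=["index", "name"], j="col"
)

# Maximum over both val columns within each name, repeated on every row
long_df["val"] = long_df.groupby(level="name")["val"].transform("max")

# Back to the wide layout with val1 / val2 columns
result = long_df["val"].unstack("col").add_prefix("val").reset_index("name")
```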
Count Number of Rows within Time Interval in Pandas Dataframe
Say we have this data: I want to count, for each year, how many rows (“index”) fall within each year, but excluding the Y0. So say we start at the first available year, 1990: How many rows do we count? 0. 1991: Three (row 1, 2, 3) 1992: Four (row 1, 2, 3, 4) … 2009: Four (row 1, 2,