I want to create a column in an existing dataframe with the values ‘Top’ and ‘Bottom’. The catch is that the size of the dataframe changes according to calculations. For example: I will always have an even number of rows. Please suggest a solution, thanks! Answer I don’t know exactly how your data is,…
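A minimal sketch of one possible reading of this question, assuming the first half of the rows should be labelled ‘Top’ and the second half ‘Bottom’. The frame and the column names `value` and `position` are hypothetical; only the even row count comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the real data is not shown in the excerpt.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60]})

# The row count is assumed to be even, so it splits cleanly in two.
half = len(df) // 2

# Assumption: first half of the rows is 'Top', second half is 'Bottom'.
df["position"] = np.repeat(["Top", "Bottom"], half)

print(df)
```

Because `half` is computed from `len(df)`, the labels follow the frame when its size changes between runs.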
Tag: dataframe
Pandas compare and sum values between two DataFrames with different sizes
Suppose I have two DataFrames with different sizes: to which I have: and: Now I want to add a third column to df1, say total_volume, which is the sum of the volumes that lie between the xlow and xup of each row of df1. I can do this using: we can check the value of, say, the second row as:
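The asker's own code is missing from the excerpt, so the following is only a sketch of one vectorized way to compute such a column. It assumes df2 has a position column `x` and a `volume` column (both names are hypothetical) and that the interval is half-open, `xlow <= x < xup`.

```python
import pandas as pd

# Hypothetical data; only xlow, xup and total_volume are named in the question.
df1 = pd.DataFrame({"xlow": [0.0, 1.0, 2.0], "xup": [1.0, 2.0, 3.0]})
df2 = pd.DataFrame({"x": [0.5, 1.2, 1.8, 2.5, 5.0],
                    "volume": [10, 20, 30, 40, 50]})

x = df2["x"].to_numpy()                      # shape (len(df2),)
volume = df2["volume"].to_numpy()
low = df1["xlow"].to_numpy()[:, None]        # shape (len(df1), 1)
up = df1["xup"].to_numpy()[:, None]

# Boolean mask of shape (len(df1), len(df2)): True where x falls in the row's interval.
# Assumption: lower bound inclusive, upper bound exclusive.
inside = (x >= low) & (x < up)

# Sum the volumes selected by each row of the mask.
df1["total_volume"] = (inside * volume).sum(axis=1)

print(df1)
```

The mask holds `len(df1) * len(df2)` booleans, so this trades memory for speed and may need chunking for very large frames.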
python pandas get distinct matches in columns
I have a dataframe which looks a bit like what this code gives: What I want to end up with is a list of lists or dataframe or something similar which tells me the distinct matches across both columns in both directions. It’d be something like this: I have tried to do it but I can’t get it to go
Listing path and data from an XML file to store in a dataframe
Here is an XML file: I want to save in a dataframe: 1) the path and 2) the text of the elements corresponding to the path. To build this dataframe, I am thinking of using a dictionary to store both. So first I would like to get a dictionary like this (where I have the values associated to
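The XML file itself is not included in the excerpt, so this sketch uses a small made-up document. It walks the tree with the standard library's `xml.etree.ElementTree` and collects (path, text) pairs; the helper name `walk` and the column names are assumptions.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Made-up document standing in for the file that the excerpt omits.
xml_text = "<root><a><b>hello</b><c>world</c></a><d>42</d></root>"


def walk(element, prefix=""):
    """Yield (path, text) for every element that has non-blank text."""
    path = f"{prefix}/{element.tag}"
    text = (element.text or "").strip()
    if text:
        yield path, text
    for child in element:
        yield from walk(child, path)


rows = list(walk(ET.fromstring(xml_text)))
df = pd.DataFrame(rows, columns=["path", "text"])

print(df)
```

A list of pairs is used instead of a dictionary because repeated tags would produce the same path more than once, and a dictionary would keep only the last value.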
Apply transformation only on string columns with Pandas, ignoring numeric data
So, I have a pretty large dataframe with 85 columns and almost 90,000 rows and I wanted to use str.lower() on all of them. However, there are several columns containing numerical data. Is there an easy solution for this? Then, after using something like df.applymap(str.lower), I would get: Currently it’s…
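One possible approach, sketched on a tiny hypothetical frame: skip every column that pandas reports as numeric and lowercase the rest with the `.str` accessor. It assumes the non-numeric columns hold strings; other values in those columns would become NaN.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical stand-in for the 85-column frame described in the question.
df = pd.DataFrame({
    "Name": ["Alice", "BOB"],
    "City": ["Paris", "ROME"],
    "Age": [30, 41],
})

for col in df.columns:
    # Leave numeric columns untouched; lowercase everything else.
    if not is_numeric_dtype(df[col]):
        df[col] = df[col].str.lower()

print(df)
```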
How to replicate same values based on the index value of other column in python
I have a dataframe like below and I want to add another column that is replicated until a certain condition is met. Now I want to add another column which contains additional information about the dataframe. For instance, I want to replicate Yes until id is B and No when it is below B and Yes from C to D and
Dataframe increase speed of for loop for set value of column
I have a pandas dataframe (import pandas as pd). I want to add 1 to ‘C3’ after each rising edge (a rising edge starts when C1 = 1 and C2 = 0). I tried iterrows() on a dataframe with 300000 rows, but it is a bit slow. Is there a simple way to make it faster? Thanks a lot for your help! A…
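A vectorized sketch that avoids the row loop, under one interpretation of the question: a rising edge is a row where the condition `C1 == 1 and C2 == 0` holds but did not hold on the previous row, and ‘C3’ is the running count of such edges. The sample values are made up.

```python
import pandas as pd

# Made-up signal columns; the real data is not shown in the excerpt.
df = pd.DataFrame({
    "C1": [0, 1, 1, 0, 1, 1, 0, 1],
    "C2": [0, 0, 0, 0, 1, 0, 0, 0],
})

# Rows where the edge condition currently holds.
active = (df["C1"] == 1) & (df["C2"] == 0)

# Assumption: an edge is a row where the condition is true now
# but was false on the previous row (the first row has no predecessor).
rising = active & ~active.shift(1, fill_value=False)

# Running count of rising edges seen so far.
df["C3"] = rising.cumsum()

print(df)
```

Comparing a column with its shifted copy replaces the per-row Python loop with a few whole-column operations, which is usually where the speed-up over `iterrows()` comes from.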
How to take specific columns in pandas dataframe only if they exist (different CSVs)
I downloaded a bunch of football data from the internet in order to analyze it (around 30 CSV files). Each season’s game data is saved as a CSV file with different data columns. Some data columns are common to all files e.g. Home team, Away team, Full time result, ref name, etc… Earlier years CSV …
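A sketch of one way to keep only the columns that exist in each file: `pandas.read_csv` accepts a callable for `usecols`, which is evaluated against each column name. The column names in `wanted` and the in-memory CSV contents are hypothetical placeholders for the real files.

```python
import io

import pandas as pd

# Hypothetical column names; substitute the ones used in the real files.
wanted = ["HomeTeam", "AwayTeam", "FTR", "Referee"]

# In-memory stand-ins for two season files with different columns.
old_season = io.StringIO("HomeTeam,AwayTeam,FTR\nTeam A,Team B,H\n")
new_season = io.StringIO(
    "HomeTeam,AwayTeam,FTR,Referee,Shots\nTeam C,Team D,D,Ref X,12\n"
)

# The callable keeps a column only if it is wanted, so a missing column
# does not raise an error the way a fixed list of names would.
frames = [
    pd.read_csv(source, usecols=lambda name: name in wanted)
    for source in (old_season, new_season)
]

# Columns absent from a season are filled with NaN after concatenation.
combined = pd.concat(frames, ignore_index=True)

print(combined)
```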
Most efficient way to find shared members of a list inside a dataframe?
Hello experts: I’m looking at so-called ‘COVID-19 bubbles’ inside pro cycling – I’ve compiled a list of riders for each team and a list of each race they’ve done. There are about 30 riders per team, and there have been a few dozen races after the sport started up again in J…
Trouble when adding values for NaN in DataFrame
I have this DataFrame: And I want to fill the NaN values with keywords taken from the description. To that end I created a list with the keywords I want: Finally, I want to loop over each row in the DataFrame, split the contents of the “description” column in each row and, if that word is also in…
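The DataFrame and keyword list are missing from the excerpt, so the following is only a sketch with made-up data. It assumes the column to fill is called `keyword` (hypothetical), that the truncated condition means "the word is in the keyword list", and that the first matching word should be used.

```python
import numpy as np
import pandas as pd

# Made-up keyword list and frame; only the "description" column is named in the question.
keywords = ["bike", "helmet", "tyre"]
df = pd.DataFrame({
    "description": ["red road bike", "carbon helmet large", "spare inner tube"],
    "keyword": [np.nan, "helmet", np.nan],
})


def first_keyword(description):
    """Return the first word of the description that is a known keyword, else NaN."""
    for word in str(description).lower().split():
        if word in keywords:
            return word
    return np.nan


# Compute a candidate for every row, then use it only where the value is missing.
found = df["description"].apply(first_keyword)
df["keyword"] = df["keyword"].fillna(found)

print(df)
```

Passing a Series to `fillna` aligns on the index, so existing values are left alone and rows with no matching word stay NaN.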