Tag: dataframe

Pandas dataframe custom formatting string to time

I have a dataframe that looks like this: I need every value in the DEP_TIME column to have the format hh:mm. All cells are of type string and can remain that type. Some cells are only missing the colon (rows 0 to 3), others are also missing the leading 0 (rows 4+). Some cells are empty and should
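A minimal sketch of one way to normalise such strings, assuming the values are digit-only strings such as "0940" or "715". The sample data is hypothetical, and because the excerpt is cut off before it says what should happen to empty cells, the sketch simply leaves them unchanged (an assumption).

```python
import pandas as pd

# Hypothetical sample; the real data is not shown in the excerpt.
df = pd.DataFrame({"DEP_TIME": ["0940", "1230", "715", "5", ""]})


def to_hhmm(value: str) -> str:
    """Pad a digit string to four characters and insert a colon."""
    # Assumption: empty cells are left untouched.
    if value == "":
        return value
    padded = value.zfill(4)  # restores missing leading zeros, e.g. "715" -> "0715"
    return padded[:2] + ":" + padded[2:]


df["DEP_TIME"] = df["DEP_TIME"].map(to_hhmm)
print(df)
```

The column stays of type string, as the question allows.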

How to select rows where date is in index in Python Pandas DataFrame?

dataframe pandas python

I have a DataFrame in Python like the one below, where the date is in the index (we can name this column “date”): and I would like to select all rows of this DataFrame where the date in the index is later than 01.01.2020. How can I do it? (Be aware that the date is in the index.) Answer: Use boolean indexing: Or:
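The answer's code is not preserved in this excerpt. As an illustration only, boolean indexing on a date index could look like the sketch below, assuming the index is (or can be converted to) a `DatetimeIndex`. The sample frame and the column name `value` are hypothetical.

```python
import pandas as pd

# Hypothetical sample with the dates stored in the index.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.to_datetime(["2019-12-31", "2020-01-01", "2020-06-15"]),
)
df.index.name = "date"

# Boolean mask built from the index itself, not from a column.
mask = df.index > pd.Timestamp("2020-01-01")
after = df[mask]
print(after)
```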

Split a dataframe based on a specific cumsum value

cumsum dataframe pandas python

I have a solution working, but it seems cumbersome and I am wondering if there is a better way to achieve what I want. I need to achieve two things: Split a dataframe into two dataframes based on a specific cumsum value. If a row needs to be split to fulfill the cumsum condition, then this must happen. An example
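One possible vectorised approach, sketched under assumptions: the column names `item` and `qty`, the sample values, and the threshold `limit` are all hypothetical, since the question's example is not included in the excerpt. Each row's quantity is divided into the part that still fits under the threshold and the remainder.

```python
import numpy as np
import pandas as pd

# Hypothetical data and threshold.
df = pd.DataFrame({"item": ["a", "b", "c"], "qty": [4, 5, 6]})
limit = 7

cum = df["qty"].cumsum()      # running total including the current row
before = cum - df["qty"]      # running total before the current row

# Part of each row that still fits below the threshold (never negative,
# never more than the row's own quantity).
first_qty = np.minimum(df["qty"], (limit - before).clip(lower=0))
second_qty = df["qty"] - first_qty

# Rows with nothing left on one side are dropped from that side.
first = df.assign(qty=first_qty)[first_qty > 0]
second = df.assign(qty=second_qty)[second_qty > 0]
print(first)
print(second)
```

Here row "b" straddles the threshold, so it appears in both results with its quantity divided between them.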

Pandas average of previous rows fulfilling condition

dataframe numpy pandas performance python

I have a huge data-frame (>20m rows) with each row containing a timestamp and a numeric variable X. I want to assign a new column where for each row the value in this new column is the average of X in the previous rows within a specified time window, e.g. the average of all rows with time stamps no more
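A sketch of how a time-based rolling window might express this, assuming the timestamps are sorted and set as the index. The 5-second window and the sample data are hypothetical, since the excerpt is cut off before the actual window is stated. `closed="left"` excludes the current row, so only earlier rows are averaged.

```python
import pandas as pd

# Hypothetical sample; timestamps must be monotonically increasing.
df = pd.DataFrame(
    {"X": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(
        [
            "2021-01-01 00:00:00",
            "2021-01-01 00:00:01",
            "2021-01-01 00:00:02",
            "2021-01-01 00:00:10",
        ]
    ),
)

# Window covers [t - 5s, t): previous rows only, current row excluded.
window = pd.Timedelta(seconds=5)
df["avg_prev"] = df["X"].rolling(window, closed="left").mean()
print(df)
```

Rows with no earlier observation inside the window get NaN.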

Using the items of a df as a header of a different dataframe

dataframe header python

I have 2 dataframes and df2= I want to use df1 as a header of df2 so that df1 is either the header of the columns or the first row. I have multiple columns, so it will not work to do df2.columns=["_A1-Site_0_norm", "_A1-Site_1_norm"] I thought of making a list of all the items present in the df1 to then use
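The list-based idea mentioned in the question could be sketched as follows. The shape of df1 is not shown in the excerpt, so this assumes it holds the labels in a single column (here named `name`, a hypothetical name) with one entry per column of df2.

```python
import pandas as pd

# Assumption: df1 stores one label per row in a single column.
df1 = pd.DataFrame({"name": ["_A1-Site_0_norm", "_A1-Site_1_norm"]})
# Hypothetical df2 with default integer column labels.
df2 = pd.DataFrame([[0.1, 0.2], [0.3, 0.4]])

# Build the header from df1 instead of typing the labels by hand.
labels = df1["name"].tolist()
if len(labels) != df2.shape[1]:
    raise ValueError("df1 must supply exactly one label per column of df2")
df2.columns = labels
print(df2)
```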

Sorting a table in python with alphabet and numbers

dataframe pandas python sorting

I have the following table:

Column1  Column2
99       QA
65       CD
134      LL
N12      OO
127      KK
Q23      MM
1        AA
A10      KL
K9       MA

I would like to sort the table such that the numbers are sorted in descending order first, then the alphabetic entries in descending order. How do I do that? The output should look something like the
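A sketch of one way to do this, assuming Column1 is stored as strings and that "descending" for the non-numeric entries means plain reverse string order. The expected output is cut off in the excerpt, so that ordering is an assumption.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Column1": ["99", "65", "134", "N12", "127", "Q23", "1", "A10", "K9"],
        "Column2": ["QA", "CD", "LL", "OO", "KK", "MM", "AA", "KL", "MA"],
    }
)

# Separate purely numeric entries from the rest.
is_num = df["Column1"].str.isdigit()

# Numeric entries: sort by integer value, largest first.
nums = df[is_num].sort_values(
    "Column1", key=lambda s: s.astype(int), ascending=False
)
# Remaining entries: reverse string order (assumption).
alpha = df[~is_num].sort_values("Column1", ascending=False)

result = pd.concat([nums, alpha], ignore_index=True)
print(result)
```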

Recode multiple values in several columns in Python [similar to R]

dataframe numpy pandas python r

I am trying to translate my R script to Python. I have survey data with several date of birth and education level columns for each family member (from family member 1 to member 10). Here is a sample: I had a function in R to check the logic and recode wrong education levels in all columns. Like this and

Merging pandas columns into a new column

dataframe pandas python python-3.x

Suppose I have a dataframe as follows. How can I merge the two columns into one using pandas? The desired output is output. Thank you! Answer: Use Series.fillna with DataFrame.pop to replace missing values with those from the other column and drop the second column: Or you can back-fill missing values and select the first column with DataFrame.iloc, using [[0]] to get a one-column DataFrame
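The answer's own code is not included in the excerpt. A sketch of both variants it describes might look like this, with hypothetical column names `a` and `b` and made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: each row has a value in exactly one of the two columns.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 2.0, np.nan]})

# Variant 2 (run first because it does not modify df): back-fill across
# columns, then keep the first column as a one-column DataFrame.
alt = df.bfill(axis=1).iloc[:, [0]]

# Variant 1: fill the gaps in "a" from "b"; pop removes "b" from df.
df["a"] = df["a"].fillna(df.pop("b"))

print(df)
print(alt)
```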

How can I insert rows to Pandas dataframe depending on previous and next values?

dataframe pandas python

I want to insert a row if the gap between the time values of the previous and next rows is large. Essentially I want to have a row for every 2 seconds. So in the below example I want to add 3 rows between 19 and 26. The time values will be 21, 23, 25 and I will later use interpolate method to
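A sketch of one way to add the missing rows, assuming integer time values in a column named `time`. The column names and sample values are hypothetical, chosen so that the gap between 19 and 26 matches the question. New rows carry NaN in the other columns, ready for a later fill step.

```python
import pandas as pd

# Hypothetical sample with one large gap between 19 and 26.
df = pd.DataFrame(
    {"time": [15, 17, 19, 26, 28], "value": [1.0, 2.0, 3.0, 10.0, 12.0]}
)
step = 2

# For each pair of consecutive rows, list the times missing in between.
times = df["time"].tolist()
extra = [
    t
    for prev, nxt in zip(times[:-1], times[1:])
    for t in range(prev + step, nxt, step)
]

# Append the new times (other columns become NaN) and restore time order.
filled = pd.concat([df, pd.DataFrame({"time": extra})], ignore_index=True)
filled = filled.sort_values("time", ignore_index=True)
print(filled)
```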