I have a dataframe that looks like this: I need every value in the column DEP_TIME to have the format hh:mm. All cells are of type string and can remain that type. Some cells are only missing the colon (rows 0 to 3); others are also missing the leading 0 (rows 4+). Some cells are empty and should
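A minimal sketch of one way to normalise such a column. The sample values are hypothetical, since the original frame is not shown. The excerpt is cut off before it says what should happen to empty cells, so this sketch simply leaves them untouched (an assumption).

```python
import pandas as pd

# Hypothetical sample: rows 0-3 lack only the colon, rows 4-5 also lack the leading 0.
df = pd.DataFrame({"DEP_TIME": ["1545", "0930", "2359", "0005", "745", "130", ""]})

def to_hhmm(value):
    # Assumption: empty or missing cells are returned unchanged.
    if not isinstance(value, str) or value.strip() == "":
        return value
    # Drop any existing colon, left-pad to four digits, then re-insert the colon.
    digits = value.replace(":", "").strip().zfill(4)
    return digits[:2] + ":" + digits[2:]

df["DEP_TIME"] = df["DEP_TIME"].map(to_hhmm)
print(df)
```

The helper name `to_hhmm` is made up for illustration. Removing the colon first means values that are already well-formed pass through unchanged.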
Tag: dataframe
How to select rows where date is in index in Python Pandas DataFrame?
I have a DataFrame in Python like the one below, where the date is in the index (we can name this column “date”): and I would like to select all columns of this DataFrame where the date in the index is later than 01.01.2020. How can I do it? (Be aware that the date is in the index.) Answer Use boolean indexing: Or:
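The answer's code did not survive in this excerpt, so the following is only a sketch of boolean indexing on a date index, not the original answer. The sample frame and the `value` column are hypothetical. The index is assumed to already be a `DatetimeIndex`; if it holds strings like `01.01.2020`, it could first be converted with `pd.to_datetime(df.index, format="%d.%m.%Y")`.

```python
import pandas as pd

# Hypothetical frame with the dates stored in the index.
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.to_datetime(["2019-12-30", "2020-01-01", "2020-01-02", "2020-03-15"]),
)
df.index.name = "date"

# Boolean mask built from the index; the string is parsed as a timestamp.
after = df[df.index > "2020-01-01"]

# Equivalent form with .loc and an explicit Timestamp.
after_loc = df.loc[df.index > pd.Timestamp("2020-01-01")]

print(after)
```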
Split a dataframe based on a specific cumsum value
I have a solution working, but it seems cumbersome and I am wondering if there is a better way to achieve what I want. I need to achieve two things: Split a dataframe into two dataframes based on a specific cumsum value. If a row needs to be split to fulfill the cumsum condition, then this must happen. An example
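One possible sketch of such a split. The question's own example is cut off here, so the column name `qty`, the limit of 10, and the helper name `split_at_cumsum` are all hypothetical. It assumes the values are non-negative, so the cumulative sum never decreases.

```python
import pandas as pd

def split_at_cumsum(df, col, limit):
    """Split df so that the first part sums to `limit` in `col`, splitting one row if needed."""
    csum = df[col].cumsum()
    first = df[csum <= limit].copy()
    rest = df[csum > limit].copy()
    shortfall = limit - first[col].sum()
    if shortfall > 0 and not rest.empty:
        # The boundary row is divided: part of it tops up `first`, the remainder stays in `rest`.
        boundary = rest.iloc[[0]].copy()
        boundary[col] = shortfall
        first = pd.concat([first, boundary])
        rest.iloc[0, rest.columns.get_loc(col)] -= shortfall
    return first, rest

# Hypothetical data: the cumulative sums are 4, 7, 12, 14.
df = pd.DataFrame({"item": ["a", "b", "c", "d"], "qty": [4, 3, 5, 2]})
first, rest = split_at_cumsum(df, "qty", 10)
print(first)
print(rest)
```

Here row `c` is split into 3 (which completes the first frame) and 2 (which opens the second). The split row keeps its original index label in both frames.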
Pandas average of previous rows fulfilling condition
I have a huge dataframe (>20m rows) with each row containing a timestamp and a numeric variable X. I want to assign a new column where, for each row, the value is the average of X in the previous rows within a specified time window, e.g. the average of all rows with timestamps no more
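A sketch using a time-based rolling window, which avoids a Python-level loop over rows. The column name `ts`, the sample data, and the 5-second window are assumptions, because the excerpt is truncated before the actual window is stated. `closed="left"` excludes the current row, so only earlier rows inside the window are averaged.

```python
import pandas as pd

# Hypothetical data; "ts" and the 5-second window are illustrative only.
df = pd.DataFrame({
    "ts": pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 00:00:02",
        "2021-01-01 00:00:03",
        "2021-01-01 00:00:10",
    ]),
    "X": [1.0, 3.0, 5.0, 7.0],
})

# Time-based windows need a sorted datetime index.
df = df.sort_values("ts").reset_index(drop=True)
rolled = df.set_index("ts")["X"].rolling("5s", closed="left").mean()

# Rows with no earlier observation inside the window come out as NaN.
df["prev_avg"] = rolled.to_numpy()
print(df)
```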
Using the items of a df as a header of a different dataframe
I have 2 dataframes: and df2= I want to use df1 as the header of df2, so that df1 is either the column header or the first row. I have multiple columns, so it will not work to do df2.columns=[“_A1-Site_0_norm”, “_A1-Site_1_norm”] I thought of making a list of all the items present in the df1 to the use
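A rough sketch of both options. Neither frame survived in this excerpt, so it assumes that df1 holds the labels in a single column (called `name` here, a made-up name) and that it has exactly one label per column of df2.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames in the question.
df1 = pd.DataFrame({"name": ["_A1-Site_0_norm", "_A1-Site_1_norm"]})
df2 = pd.DataFrame([[0.1, 0.2], [0.3, 0.4]])

# Build the label list from df1 instead of typing it out by hand.
labels = df1["name"].tolist()
assert len(labels) == df2.shape[1], "need one label per column of df2"

# Option 1: use the labels as column headers.
as_header = df2.set_axis(labels, axis=1)

# Option 2: keep df2's columns and put the labels in as the first row.
label_row = pd.DataFrame([labels], columns=df2.columns)
as_first_row = pd.concat([label_row, df2], ignore_index=True)

print(as_header)
print(as_first_row)
```

Option 2 mixes strings and numbers in each column, so those columns become object dtype.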
Sorting a table in Python with alphabet and numbers
I have the following table:

Column1  Column2
99       QA
65       CD
134      LL
N12      OO
127      KK
Q23      MM
1        AA
A10      KL
K9       MA

I would like to sort the table such that the numbers are sorted in descending order first, then the alphabetic values in descending order. How do I do that? The output should look something like the
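One way this could be sketched: split the rows into purely numeric and non-numeric values of Column1, sort each part in descending order, and stack them. The expected output is cut off in this excerpt, so ordering the alphanumeric values by plain descending string comparison is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "Column1": ["99", "65", "134", "N12", "127", "Q23", "1", "A10", "K9"],
    "Column2": ["QA", "CD", "LL", "OO", "KK", "MM", "AA", "KL", "MA"],
})

# Values that cannot be parsed as numbers become NaN.
numeric = pd.to_numeric(df["Column1"], errors="coerce")
is_num = numeric.notna()

# Numeric rows: sort by numeric value, largest first.
nums = (
    df[is_num]
    .assign(_key=numeric[is_num])
    .sort_values("_key", ascending=False)
    .drop(columns="_key")
)

# Remaining rows: plain string sort, descending (assumed ordering).
alpha = df[~is_num].sort_values("Column1", ascending=False)

result = pd.concat([nums, alpha], ignore_index=True)
print(result)
```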
Recode multiple values in several columns in Python [similar to R]
I am trying to translate my R script to Python. I have survey data with several date of birth and education level columns for each family member (from family member 1 to member 10). Here is a sample: I had a function in R to check the logic and recode wrong education levels in all columns. Like this and
Merging pandas columns into a new column
Suppose I have a dataframe as follows: how can I merge the two columns into one using pandas? The desired output is: Thank you! Answer Use Series.fillna with DataFrame.pop to replace missing values with those from the other column and drop the second column: Or you can back-fill the missing values and select the first column with DataFrame.iloc, using [[0]] to get a one-column DataFrame
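The answer's code is missing from this excerpt; the two approaches it names could look roughly like this. The column names `a` and `b` and the sample values are hypothetical, since the original frame is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: each row has a value in exactly one of the two columns.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, 2.0, np.nan, 4.0],
})

# Approach 1: fill the gaps in "a" from "b"; pop() removes "b" from df as it is read.
df1 = df.copy()
df1["a"] = df1["a"].fillna(df1.pop("b"))

# Approach 2: back-fill across columns, then keep only the first column.
# iloc[:, [0]] (list selector) returns a one-column DataFrame rather than a Series.
df2 = df.bfill(axis=1).iloc[:, [0]]

print(df1)
print(df2)
```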
Dataframes from dictionaries in nested lists – Python
I have, for example, a nested list of 2 lists containing dictionaries, like this: [[{},{},{},{}],[{},{},{},{}]] I would like 2 dataframes, something like: And I obviously can’t use that: with data as a list of dictionaries. Answer The solution is simply to flatten your list of lists; then you can pass it to pandas normally. Will get you
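A sketch of the flattening step the answer describes, using `itertools.chain.from_iterable`. The dictionaries and their keys are hypothetical, as the original ones are elided. Because the question asks for 2 dataframes, a per-inner-list variant is shown as well.

```python
from itertools import chain

import pandas as pd

# Hypothetical nested data: two inner lists of dictionaries.
nested = [
    [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],
    [{"id": 3, "v": "c"}, {"id": 4, "v": "d"}],
]

# Flatten one level, then build a single frame from the list of dicts.
flat = list(chain.from_iterable(nested))
df = pd.DataFrame(flat)

# Alternative: one frame per inner list, if two separate frames are wanted.
frames = [pd.DataFrame(inner) for inner in nested]

print(df)
```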
How can I insert rows to Pandas dataframe depending on previous and next values?
I want to insert a row if the gap between the time values of the previous and next rows is large. Essentially I want to have a row for every 2 seconds. So in the below example I want to add 3 rows between 19 and 26. The time values will be 21, 23, 25 and I will later use interpolate method to
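A sketch of one way to add the missing rows. The frame, the `value` column, and the surrounding time values are hypothetical; only the 19 to 26 gap with 21, 23, 25 comes from the question. New times are stepped from the previous row, and the new rows are left as NaN so they can be filled in afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one large gap between 19 and 26.
df = pd.DataFrame({
    "time": [15, 17, 19, 26, 28],
    "value": [1.0, 2.0, 3.0, 10.0, 12.0],
})

step = 2
times = df["time"].to_numpy()

# For each consecutive pair (a, b), generate a+step, a+2*step, ... strictly below b.
extra = [np.arange(a + step, b, step) for a, b in zip(times[:-1], times[1:])]
all_times = np.sort(np.concatenate([times, *extra]))

# Reindex on the enlarged set of times; the inserted rows get NaN in "value".
out = (
    df.set_index("time")
    .reindex(all_times)
    .rename_axis("time")
    .reset_index()
)
print(out)
```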