Pandas – merge multiple time columns and fill values

I have a dataframe (df):

index    time    val1    time2    val2
0        0.2     1       0.1      2
1        0.3     3       0.4      2
2        0.4     1       0.8      9
3                        1.0      2

I need to merge the time columns, sort them, then fill the missing values in a “downwards” manner. The expected output is like so:

index    time    val1    val2
0        0.1     -       2
1        0.2     1       2
2        0.3     3       2
3        0.4     1       2
4        0.8     1       9
5        1.0     1       2

Answer

Use wide_to_long with some preprocessing: rename the time column to time1 and convert the index to a column. Then sort with DataFrame.sort_values, remove duplicate times with DataFrame.drop_duplicates, and drop rows with a missing time with DataFrame.dropna:

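import pandas as pd

# Rename 'time' to 'time1' so both time columns share the 'time' stub,
# and move the index into a column to act as the id for wide_to_long.
df1 = (pd.wide_to_long(df.rename(columns={'time':'time1'}).reset_index(),
                      'time', i='index', j='t')
        .sort_values(['time','val2'])   # order rows by the merged time
        .drop_duplicates('time')        # keep the first row per time value
        .dropna(subset=['time'])        # drop rows that had no time
        .reset_index(drop=True))
print(df1)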
   val1  val2  time
0   1.0     2   0.1
1   1.0     2   0.2
2   3.0     2   0.3
3   3.0     2   0.4
4   1.0     9   0.8
5   NaN     2   1.0
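
Note that the output above keeps each time paired with the val1/val2 of its original row, so val1 differs from the expected table in the question at times 0.1, 0.4 and 1.0. If the goal is the question's "downwards" fill, one possible alternative (a sketch, not part of the original answer) is to treat each value column as a series indexed by its own time column, align the two on the union of times, and forward-fill. The names s1, s2 and out are made up for illustration, and the sample frame is rebuilt from the question's table assuming the blank cells are NaN:

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "time":  [0.2, 0.3, 0.4, np.nan],
    "val1":  [1, 3, 1, np.nan],
    "time2": [0.1, 0.4, 0.8, 1.0],
    "val2":  [2, 2, 9, 2],
})

# One series per value column, each indexed by its own time column.
s1 = df.dropna(subset=["time"]).set_index("time")["val1"]
s2 = df.dropna(subset=["time2"]).set_index("time2")["val2"].rename_axis("time")

# Align on the union of times, sort, then fill missing values downwards.
out = (pd.concat([s1, s2], axis=1)
         .sort_index()
         .ffill()
         .rename_axis("time")
         .reset_index())
print(out)
```

With the sample data this gives one row per distinct time (0.1, 0.2, 0.3, 0.4, 0.8, 1.0). val1 stays NaN at 0.1 because no earlier value exists to carry forward, and the last known val1 and val2 are propagated to later times. This assumes each time column has no repeated values; duplicates would need to be resolved before aligning.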
User contributions licensed under: CC BY-SA