Skip to content
Advertisement

ffill col[c] based on col[a]==Value

I have a dataframe [pixel, total_time], i want to:

  • Make a new column “total_time_one”, which takes total_time of pixel 1 and projects it
 pixel  total_time  total_time_one
     1     218.835         218.835 #projected times of pixel 1 onto all valyues
     1     218.835         218.835
     1     218.835         218.835
     2     219.878         218.835
     2     219.878         218.835
     2     219.878         218.835
     3     220.911         218.835
     3     220.911         218.835
     3     220.911         218.835
     1     230.189         230.189 #value changes cause pixel 1 shows up again
     1     230.189         230.189
     1     230.189         230.189
     2     231.441         230.189
     2     231.441         230.189
     2     231.441         230.189

I have acheved the above dataframe with :

uniqueone = df.query("pixel==1").total_time.unique()
mask = df["total_time"].isin(uniqueone)
df["total_time_one"] = (df[mask]["total_time"])#putting it here isn't working: .fillna(method='ffill')
df["total_time_one"] = df["total_time_one"].fillna(method='ffill')

Howver the code is quite long and repeats itself, is there a function better suited? or a better solution?

Also i do not undestand why if i put:

df["total_time_one"] = (df[mask]["total_time"].fillna(method='ffill')

It doens’t work, and i have to put an extra line:

df["total_time_one"] = df["total_time_one"].fillna(method='ffill')

to make it work

Advertisement

Answer

Use where to NaN the values that aren’t 'pixel'==1 and then ffill. This technically forward fills the last value of each group, but your values are static within each group of pixels.

df['total_time_one'] = df['total_time'].where(df.pixel.eq(1)).ffill()

    pixel  total_time  total_time_one
0       1     218.835         218.835
1       1     218.835         218.835
2       1     218.835         218.835
3       2     219.878         218.835
4       2     219.878         218.835
5       2     219.878         218.835
6       3     220.911         218.835
7       3     220.911         218.835
8       3     220.911         218.835
9       1     230.189         230.189
10      1     230.189         230.189
11      1     230.189         230.189
12      2     231.441         230.189
13      2     231.441         230.189
14      2     231.441         230.189

If you wanted to use the first value within each group (as opposed to the last), or say take the average and then ffill you can use groupby + transform. You label successive groups of 1s using the cumsum of a != comparison.

df['total_time_one'] =  (df['total_time'].where(df.pixel.eq(1))
                            .groupby(df.pixel.ne(1).cumsum())
                            .transform('first')  
                            .ffill())
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement