I have a dataframe with the columns [pixel, total_time], and I want to:
- Make a new column “total_time_one”, which takes the total_time of pixel 1 and projects it onto every row until pixel 1 shows up again:
    pixel  total_time  total_time_one
    1      218.835     218.835    # projected times of pixel 1 onto all values
    1      218.835     218.835
    1      218.835     218.835
    2      219.878     218.835
    2      219.878     218.835
    2      219.878     218.835
    3      220.911     218.835
    3      220.911     218.835
    3      220.911     218.835
    1      230.189     230.189    # value changes because pixel 1 shows up again
    1      230.189     230.189
    1      230.189     230.189
    2      231.441     230.189
    2      231.441     230.189
    2      231.441     230.189
I have achieved the above dataframe with:
    uniqueone = df.query("pixel==1").total_time.unique()
    mask = df["total_time"].isin(uniqueone)
    df["total_time_one"] = (df[mask]["total_time"])  # putting it here isn't working: .fillna(method='ffill')
    df["total_time_one"] = df["total_time_one"].fillna(method='ffill')
However, the code is quite long and repeats itself. Is there a function better suited to this, or a better solution?
Also, I do not understand why, if I put:
    df["total_time_one"] = (df[mask]["total_time"]).fillna(method='ffill')
it doesn’t work, and I have to add an extra line:
    df["total_time_one"] = df["total_time_one"].fillna(method='ffill')
to make it work
Answer
Use where to NaN the values where pixel is not 1, and then ffill. This technically forward fills the last value of each group, but your values are static within each group of pixels.
    df['total_time_one'] = df['total_time'].where(df.pixel.eq(1)).ffill()
        pixel  total_time  total_time_one
    0       1     218.835         218.835
    1       1     218.835         218.835
    2       1     218.835         218.835
    3       2     219.878         218.835
    4       2     219.878         218.835
    5       2     219.878         218.835
    6       3     220.911         218.835
    7       3     220.911         218.835
    8       3     220.911         218.835
    9       1     230.189         230.189
    10      1     230.189         230.189
    11      1     230.189         230.189
    12      2     231.441         230.189
    13      2     231.441         230.189
    14      2     231.441         230.189
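As for why chaining the fill directly onto df[mask]["total_time"] has no effect: the boolean selection returns only the pixel 1 rows, so at that point there is nothing missing to fill. The NaN values appear only afterwards, when the result is assigned to the new column and aligned on the index. A minimal sketch of this, assuming a small made-up frame shaped like the one in the question and using pixel.eq(1) as the mask for brevity:

```python
import pandas as pd

# Hypothetical sample data modeled on the question's table.
df = pd.DataFrame({
    "pixel": [1, 1, 2, 2, 3, 1, 2],
    "total_time": [218.835, 218.835, 219.878, 219.878, 220.911, 230.189, 231.441],
})

mask = df["pixel"].eq(1)

# Boolean selection drops the other rows, so the result contains no NaN yet.
selected = df.loc[mask, "total_time"]
filled_too_early = selected.ffill()  # nothing to fill: same values as `selected`

# Column assignment aligns on the index; rows missing from the right-hand
# side become NaN only at this step.
df["too_early"] = filled_too_early

# where() keeps every row and puts NaN in place of the non-matching ones,
# so the forward fill has gaps to work on.
df["total_time_one"] = df["total_time"].where(mask).ffill()

print(df)
```

This is also why where works in a single expression: it preserves the full index, whereas the boolean selection shortens the series before the fill is applied.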
If you wanted to use the first value within each group (as opposed to the last), or say take the average and then ffill, you can use groupby + transform. You label successive groups of 1s using the cumsum of a != comparison.
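    m = df.pixel.eq(1)
    df['total_time_one'] = (df['total_time'].where(m)
                              .groupby(df.pixel.ne(1).cumsum())
                              .transform('first')
                              # re-mask: the non-1 row just before each run of 1s shares
                              # that run's label and would otherwise get its value early
                              .where(m)
                              .ffill())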
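To see what the cumsum labels look like, here is a small sketch. The data is made up, with times that vary inside each run of pixel 1 so that the first and the last value give visibly different results, and the names is_one, labels, first_per_run and last_value are only illustrative. Note that the non-1 row immediately before a run of 1s receives the same label as that run, so the sketch masks the non-1 rows again after transform and before the forward fill.

```python
import pandas as pd

# Hypothetical data: total_time changes within each run of pixel 1.
df = pd.DataFrame({
    "pixel":      [1, 1, 2, 3, 1, 1, 2],
    "total_time": [218.8, 218.9, 219.9, 220.9, 230.2, 230.3, 231.4],
})

is_one = df["pixel"].eq(1)

# The counter advances on every non-1 row and stays constant through a run
# of 1s, so each run of 1s ends up under a single label.
labels = (~is_one).cumsum()

first_per_run = (
    df["total_time"].where(is_one)   # NaN outside pixel 1
    .groupby(labels)
    .transform("first")              # first non-null value per label
    .where(is_one)                   # drop the value broadcast onto the non-1 row
    .ffill()
)

# For comparison: plain where + ffill carries the last value of each run.
last_value = df["total_time"].where(is_one).ffill()

print(pd.DataFrame({"pixel": df["pixel"], "label": labels,
                    "first": first_per_run, "last": last_value}))
```

Swapping "first" for "mean" in the transform call would give the per-run average instead.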