I have the following pandas dataframe. It contains a lot of NaN values (I skipped most of the NaN rows to make it look shorter).
    0        NaN

    26       NaN
    27     357.0
    28     357.0
    29     357.0
    30       NaN

    246      NaN
    247    357.0
    248    357.0
    249    357.0
    250      NaN

    303      NaN
    304     58.0
    305     58.0
    306     58.0
    307     58.0
    308     58.0
    309     58.0
    310     58.0
    311     58.0
    312     58.0
    313     58.0
    314     58.0
    315     58.0
    316      NaN

    333      NaN
    334    237.0
I would like to filter out all the NaN values and, from each run of consecutive non-NaN values, keep only the first one (e.g. indices 27-29 hold three values; I would like to keep the value at index 27 and skip the values at 28 and 29). The target result should be as follows:
    27     357.0
    247    357.0
    304     58.0
    334    237.0
I am not sure how to keep only the first value. Thanks in advance.
Answer
Keep only the rows whose value is not NaN but whose preceding value is NaN:
    df = df[df.col1.notna() & df.col1.shift().isna()]
Output:
         col1
    27  357.0
    247 357.0
    304  58.0
    334 237.0
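To try this end to end, here is a minimal sketch that rebuilds data shaped like the question's and applies the same mask. The index range of 340 rows and the intermediate names `is_value`, `prev_is_nan`, and `first_of_run` are assumptions made for illustration; only the column name `col1` and the four runs of values come from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's data: NaN everywhere
# except the four runs of values shown in the question.
col = pd.Series(np.nan, index=range(340), name="col1")
col.loc[27:29] = 357.0    # .loc slices are label-inclusive
col.loc[247:249] = 357.0
col.loc[304:315] = 58.0
col.loc[334] = 237.0
df = col.to_frame()

# A row starts a run when it holds a value and the row before it is NaN.
# shift() moves every value down one row, so the first row is compared
# against NaN and counts as a run start if it holds a value.
is_value = df["col1"].notna()
prev_is_nan = df["col1"].shift().isna()
first_of_run = df[is_value & prev_is_nan]

print(first_of_run)
```

Because the mask looks only at NaN boundaries, it does not depend on the values themselves, so it should behave the same for negative numbers or for values that change within a run.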
Assuming all values are greater than 0 and do not increase within a run, we could also do:
    df = df.fillna(0).diff()
    df = df[df.col1.gt(0)]
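The sketch below shows the `diff` idea on a small made-up frame, as a variant that keeps the original values by using the difference only as a mask. The toy data and the names `jump`, `result`, `leading`, `missed`, and `kept` are hypothetical. It also illustrates one edge case worth knowing: `diff()` yields NaN for the first row, so a run that starts at the very first row is not detected by this approach, while the `shift`-based mask does detect it.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two runs of positive, constant values.
df = pd.DataFrame({"col1": [np.nan, 5.0, 5.0, np.nan, 3.0, 3.0]})

# With NaN replaced by 0, the start of each run is a positive jump.
jump = df["col1"].fillna(0).diff()
result = df[jump.gt(0)]
print(result)

# Edge case: a run that begins at the first row.
leading = pd.DataFrame({"col1": [7.0, 7.0, np.nan]})

# diff() is NaN on the first row, so the comparison is False there.
missed = leading[leading["col1"].fillna(0).diff().gt(0)]

# The shift-based mask treats the missing predecessor as NaN instead.
kept = leading[leading["col1"].notna() & leading["col1"].shift().isna()]
print(kept)
```

If the data may start with a value, or if values can rise inside a run, the `notna()`/`shift().isna()` mask is likely the safer choice.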