Pandas array filter NaN and keep the first value in group

I have the following pandas dataframe. It contains many NaN values (I omitted most of them here to keep the listing short).

0        NaN
...        
26       NaN
27     357.0
28     357.0
29     357.0
30       NaN
...
246      NaN
247    357.0
248    357.0
249    357.0
250      NaN
...
303      NaN
304     58.0
305     58.0
306     58.0
307     58.0
308     58.0
309     58.0
310     58.0
311     58.0
312     58.0
313     58.0
314     58.0
315     58.0
316      NaN
...
333      NaN
334    237.0

I would like to filter out all the NaN values and, for each run of consecutive non-NaN values, keep only the first one (e.g. indices 27-29 hold three values; I would like to keep the value at index 27 and skip those at 28 and 29). The target array should be as follows:

27     357.0
247    357.0
304     58.0
334    237.0

I am not sure how I could keep only the first value. Thanks in advance.

Answer

Take only the values that aren't NaN but whose preceding value is NaN:

df = df[df.col1.notna() & df.col1.shift().isna()]

Output:

      col1
27   357.0
247  357.0
304   58.0
334  237.0
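
To try the mask on something reproducible, here is a minimal sketch. The small frame below is made up for illustration (it is not the asker's data); only the column name `col1` is taken from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: runs of values separated by NaN, as in the question
values = [np.nan, 357.0, 357.0, 357.0, np.nan, np.nan, 58.0, 58.0, np.nan, 237.0]
df = pd.DataFrame({"col1": values})

# True where the row holds a value and the previous row is NaN.
# shift() fills the first row with NaN, so a value in row 0 would also be kept.
starts = df["col1"].notna() & df["col1"].shift().isna()

result = df[starts]
print(result)
```

Because the test only looks at whether the neighbouring row is NaN, it should not matter whether the values are positive or whether two adjacent runs hold the same number.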

Assuming all values are greater than 0 and each run of values is constant, we could also do:

df = df.fillna(0).diff()
df = df[df.col1.gt(0)]
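
Note that the snippet above overwrites `df` with the differences, which only equal the original values because each run is preceded by a NaN that was filled with 0. A hedged variant, sketched on a made-up frame, uses the difference purely as a mask so the original values are kept:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration; only the column name comes from the answer
df = pd.DataFrame({"col1": [np.nan, 357.0, 357.0, np.nan, 58.0, 58.0, np.nan, 237.0]})

# Treat NaN as 0 so the start of each run shows up as a positive jump
jumps = df["col1"].fillna(0).diff()

# Index the original frame with the mask instead of replacing it
kept = df[jumps.gt(0)]
print(kept)
```

This still relies on the values being positive and not increasing within a run. It also drops a value sitting in the very first row, since `diff()` yields NaN there.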
