
How to split a pandas time series by NaN values

I have a pandas TimeSeries which looks like this:

2007-02-06 15:00:00    0.780
2007-02-06 16:00:00    0.125
2007-02-06 17:00:00    0.875
2007-02-06 18:00:00      NaN
2007-02-06 19:00:00    0.565
2007-02-06 20:00:00    0.875
2007-02-06 21:00:00    0.910
2007-02-06 22:00:00    0.780
2007-02-06 23:00:00      NaN
2007-02-07 00:00:00      NaN
2007-02-07 01:00:00    0.780
2007-02-07 02:00:00    0.580
2007-02-07 03:00:00    0.880
2007-02-07 04:00:00    0.791
2007-02-07 05:00:00      NaN   

I would like to split the pandas TimeSeries every time one or more consecutive NaN values occur. The goal is to end up with separate events:

2007-02-06 15:00:00    0.780
2007-02-06 16:00:00    0.125
2007-02-06 17:00:00    0.875

2007-02-06 19:00:00    0.565
2007-02-06 20:00:00    0.875
2007-02-06 21:00:00    0.910
2007-02-06 22:00:00    0.780

I could loop through every row, but is there a smarter way of doing that?



You can use numpy.split and then filter the resulting list. Here is one example, assuming the data is in a DataFrame named df and the column with the values is labeled "value":

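import numpy as np

# split at the row positions where "value" is NaN; each chunk after the first starts with a NaN row
events = np.split(df, np.where(np.isnan(df.value))[0])
# removing NaN entries
events = [ev[~np.isnan(ev.value)] for ev in events if not isinstance(ev, np.ndarray)]
# removing empty DataFrames
events = [ev for ev in events if not ev.empty]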

You will end up with a list of DataFrames, one per event, with the NaN rows removed.
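
As an alternative that stays within pandas, the same split can be sketched with a cumulative count of the NaN rows used as a group label. This is only a sketch under assumptions: the sample data is rebuilt from the question, and the names is_nan, group_id and series_events are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the hourly sample from the question (15 rows starting 2007-02-06 15:00).
index = pd.date_range("2007-02-06 15:00:00", periods=15, freq=pd.Timedelta(hours=1))
values = [0.780, 0.125, 0.875, np.nan, 0.565, 0.875, 0.910, 0.780,
          np.nan, np.nan, 0.780, 0.580, 0.880, 0.791, np.nan]
df = pd.DataFrame({"value": values}, index=index)

# True wherever the value is missing.
is_nan = df["value"].isna()

# The running count of NaN rows goes up at every NaN, so all rows
# between two NaN runs share the same label.
group_id = is_nan.cumsum()

# Drop the NaN rows, then group the remaining rows by that label.
events = [ev for _, ev in df[~is_nan].groupby(group_id[~is_nan])]

# The same idea applied to a plain Series instead of a DataFrame.
s = df["value"]
series_events = [ev for _, ev in s[~is_nan].groupby(group_id[~is_nan])]

for ev in events:
    print(ev, end="\n\n")
```

Because consecutive NaN rows each increase the counter without contributing any rows of their own, runs of several NaN values need no special handling and no empty groups have to be filtered out afterwards.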
