I have a pandas TimeSeries which looks like this:
    2007-02-06 15:00:00    0.780
    2007-02-06 16:00:00    0.125
    2007-02-06 17:00:00    0.875
    2007-02-06 18:00:00      NaN
    2007-02-06 19:00:00    0.565
    2007-02-06 20:00:00    0.875
    2007-02-06 21:00:00    0.910
    2007-02-06 22:00:00    0.780
    2007-02-06 23:00:00      NaN
    2007-02-07 00:00:00      NaN
    2007-02-07 01:00:00    0.780
    2007-02-07 02:00:00    0.580
    2007-02-07 03:00:00    0.880
    2007-02-07 04:00:00    0.791
    2007-02-07 05:00:00      NaN
I would like to split the pandas TimeSeries every time one or more consecutive NaN values occur. The goal is to end up with separate events:
Event1:
    2007-02-06 15:00:00    0.780
    2007-02-06 16:00:00    0.125
    2007-02-06 17:00:00    0.875
Event2:
    2007-02-06 19:00:00    0.565
    2007-02-06 20:00:00    0.875
    2007-02-06 21:00:00    0.910
    2007-02-06 22:00:00    0.780
I could loop through every row, but is there a smarter way of doing that?
Answer
You can use numpy.split and then filter the resulting list. Here is one example, assuming that the column with the values is labeled "value":
    import numpy as np

    # split the DataFrame at every row where "value" is NaN
    events = np.split(df, np.where(np.isnan(df.value))[0])
    # removing NaN entries
    events = [ev[~np.isnan(ev.value)] for ev in events if not isinstance(ev, np.ndarray)]
    # removing empty DataFrames
    events = [ev for ev in events if not ev.empty]
You will have a list with all the events separated by the NaN values.
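If you would rather stay within pandas (as far as I know, calling np.split on a DataFrame can trigger a deprecation warning in recent pandas releases), the same idea can be sketched with a cumulative sum over the NaN mask and a groupby. This is only a minimal sketch, not a definitive implementation: the helper name split_on_nan is hypothetical, the column name "value" is the same assumption as above, and the sample data is copied from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the column name "value" is an assumption.
index = pd.date_range("2007-02-06 15:00:00", periods=15, freq=pd.Timedelta(hours=1))
values = [0.780, 0.125, 0.875, np.nan, 0.565, 0.875, 0.910, 0.780,
          np.nan, np.nan, 0.780, 0.580, 0.880, 0.791, np.nan]
df = pd.DataFrame({"value": values}, index=index)


def split_on_nan(frame, column="value"):
    """Hypothetical helper: return a list of NaN-free chunks of `frame`."""
    # True wherever the value is missing
    is_gap = frame[column].isna()
    # The running count of NaNs increases at every gap, so rows between
    # two gaps share the same counter value
    group_id = is_gap.cumsum()
    # Drop the NaN rows, then group the remaining rows by their counter value
    valid = frame[~is_gap]
    return [event for _, event in valid.groupby(group_id[~is_gap])]


events = split_on_nan(df)
for number, event in enumerate(events, start=1):
    print(f"Event{number}:")
    print(event)
```

Consecutive NaN rows just bump the counter several times without producing an empty group, so no separate step for removing empty DataFrames is needed here.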