I have a pandas TimeSeries which looks like this:
2007-02-06 15:00:00 0.780
2007-02-06 16:00:00 0.125
2007-02-06 17:00:00 0.875
2007-02-06 18:00:00 NaN
2007-02-06 19:00:00 0.565
2007-02-06 20:00:00 0.875
2007-02-06 21:00:00 0.910
2007-02-06 22:00:00 0.780
2007-02-06 23:00:00 NaN
2007-02-07 00:00:00 NaN
2007-02-07 01:00:00 0.780
2007-02-07 02:00:00 0.580
2007-02-07 03:00:00 0.880
2007-02-07 04:00:00 0.791
2007-02-07 05:00:00 NaN
I would like to split the pandas TimeSeries every time one or more NaN values occur in a row. The goal is to end up with separate events:
Event1:
2007-02-06 15:00:00 0.780
2007-02-06 16:00:00 0.125
2007-02-06 17:00:00 0.875

Event2:
2007-02-06 19:00:00 0.565
2007-02-06 20:00:00 0.875
2007-02-06 21:00:00 0.910
2007-02-06 22:00:00 0.780
I could loop through every row, but is there a smarter way to do this?
Answer
You can use numpy.split and then filter the resulting list. Here is one example, assuming that the column with the values is labeled "value":
import numpy as np

# df is assumed to be a DataFrame whose data column is named "value"
# split at the positions of the NaN entries
events = np.split(df, np.where(np.isnan(df.value))[0])
# removing NaN entries
events = [ev[~np.isnan(ev.value)] for ev in events if not isinstance(ev, np.ndarray)]
# removing empty DataFrames
events = [ev for ev in events if not ev.empty]
You will have a list with all the events separated by the NaN values.
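As an alternative sketch (not the numpy.split approach from the answer), the same separation can probably be done in pandas alone by labelling each run of valid values with a running count of the NaN rows before it. The sample Series and the names `s`, `is_gap`, `group_id` and `events` are assumptions made for illustration. The data copies the first eleven rows from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the first eleven hourly rows from the question.
index = pd.date_range("2007-02-06 15:00", periods=11, freq=pd.Timedelta(hours=1))
s = pd.Series(
    [0.780, 0.125, 0.875, np.nan, 0.565, 0.875, 0.910, 0.780, np.nan, np.nan, 0.780],
    index=index,
)

# True where a value is missing.
is_gap = s.isna()

# Every NaN increments the counter, so all valid values between two gaps
# end up with the same label.
group_id = is_gap.cumsum()

# Drop the NaN rows, then collect one Series per label.
events = [chunk for _, chunk in s[~is_gap].groupby(group_id[~is_gap])]

for number, event in enumerate(events, start=1):
    print(f"Event{number}:")
    print(event)
```

Because consecutive NaN rows only advance the counter without adding any rows, a run of several NaN values does not create empty events, so no separate filtering step is needed.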