Current df:
JavaScript
x
14
14
1
Date Power
2
2011-04-18 17:00:00 243.56
3
2011-04-18 17:00:01 245.83
4
2011-04-18 17:00:02 246.02
5
2011-04-18 17:00:03 245.72
6
2011-04-18 17:00:04 244.71
7
2011-04-18 17:00:05 245.93
8
2011-04-18 17:00:06 243.12
9
2011-04-18 17:00:07 244.72
10
2011-04-18 17:00:08 242.44
11
2011-04-18 17:00:09 246.42
12
2011-04-18 17:00:10 245.02
13
14
I have the df with Date and a float number. Date is the index and is unique. I would like to create a new df based on the dates found in the next df.
JavaScript
1
6
1
date start date end
2
0 2011-04-18 17:00:01 2011-04-18 17:00:02
3
1 2011-04-18 17:00:05 2011-04-18 17:00:06
4
2 2011-04-18 17:00:08 2011-04-18 17:00:10
5
6
I expect to get:
JavaScript
1
10
10
1
Date Power
2
2011-04-18 17:00:01 245.83
3
2011-04-18 17:00:02 246.02
4
2011-04-18 17:00:05 245.93
5
2011-04-18 17:00:06 243.12
6
2011-04-18 17:00:08 242.44
7
2011-04-18 17:00:09 246.42
8
2011-04-18 17:00:10 245.02
9
10
In other word I want to filter the initial df and find all rows between all the dates found in the second df.
I thought of using pandas.DataFrame.between_time. But the issue is this works only for 1 given date start and date end. How can I do this with many different date periods?
Advertisement
Answer
Use np.logical_or.reduce
with list comprehension:
JavaScript
1
13
13
1
L = [df1['Date'].between(s, e) for s, e in df2[['date start','date end']].to_numpy()]
2
3
df = df1[np.logical_or.reduce(L)]
4
print (df)
5
Date Power
6
1 2011-04-18 17:00:01 245.83
7
2 2011-04-18 17:00:02 246.02
8
5 2011-04-18 17:00:05 245.93
9
6 2011-04-18 17:00:06 243.12
10
8 2011-04-18 17:00:08 242.44
11
9 2011-04-18 17:00:09 246.42
12
10 2011-04-18 17:00:10 245.02
13
If DatetimeIndex
is possible use:
JavaScript
1
21
21
1
L = [df1[s:e] for s, e in df2[['date start','date end']].to_numpy()]
2
3
df = pd.concat(L)
4
print (df)
5
Power
6
Date
7
2011-04-18 17:00:01 245.83
8
2011-04-18 17:00:02 246.02
9
2011-04-18 17:00:05 245.93
10
2011-04-18 17:00:06 243.12
11
2011-04-18 17:00:08 242.44
12
2011-04-18 17:00:09 246.42
13
2011-04-18 17:00:10 245.02
14
15
16
17
L = [(df1.index >= s) & (df1.index <= e)
18
for s, e in df2[['date start','date end']].to_numpy()]
19
20
df = df1[np.logical_or.reduce(L)]
21