Skip to content
Advertisement

Pandas Dataframe keep rows where date is between two dates (seperate columns)

I have a dataframe that looks similar to this :

Price        From           To
 300€        2020-01-01     2020-01-07
 250€        2020-01-04     2020-01-08
 150€        2020-02-01     2020-02-04
 350€        2020-02-04     2020-02-08

And then I have a list of dates. For example: list = [2020-01-03, 2020-02-04]

I would like to keep only the rows of the dataframe where the dates are in between the From column and the To column.

So, after transformation I would have the following dataframe.

Price        From           To
 300€        2020-01-01     2020-01-07
 150€        2020-02-01     2020-02-04
 350€        2020-02-04     2020-02-08

First I thought of using a lambda with an apply but I thought it was not very efficient because my dataset is very large. Is there a simpler way to do this with pandas ?

The result would be contained in one single dataframe

Advertisement

Answer

Let’s try with numpy broadcasting:

x, y = df[['From', 'To']].values.T
a = np.array(['2020-01-03', '2020-02-04'], dtype=np.datetime64)
mask = ((x[:, None] <= a) & (y[:, None] >= a)).any(1)

df[mask]

  Price       From         To
0  300€ 2020-01-01 2020-01-07
2  150€ 2020-02-01 2020-02-04
3  350€ 2020-02-04 2020-02-08
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement