I have a dataset covering the last 3 years, and I would like to add a new column based on holidays. When I try this:
    import holidays
    de_holidays = holidays.DE()
    for date, name in sorted(holidays.DE(years=2021).items()):
        print(date, name)
I get the result
    2021-01-01 Neujahr
    2021-04-02 Karfreitag
    2021-04-05 Ostermontag
    2021-05-01 Erster Mai
    2021-05-13 Christi Himmelfahrt
    2021-05-24 Pfingstmontag
    2021-10-03 Tag der Deutschen Einheit
    2021-12-25 Erster Weihnachtstag
    2021-12-26 Zweiter Weihnachtstag
Now I want to create a new column in my existing dataset that is True when the date is a holiday and False otherwise. I tried the code snippet below.
My Date column looks something like this (dtype is datetime64[ns]):
    2021-07-22
    2021-07-21
    2021-07-20
    2021-07-19
I used this code:
    import holidays
    de_holidays = holidays.DE()
    df['Holiday'] = df['Date'].isin(de_holidays)
    rslt_df
    rslt_df.loc[rslt_df['Date'] == '2021-05-13']
I expected True, since 13 May was a holiday, but this code gives False for every row. Can anyone help?
Edit:
    12390   2021-07-22
    12380   2021-07-21
    12370   2021-07-20
    12360   2021-07-19
    12350   2021-07-18
               ...
    40      2018-03-05
    30      2018-03-04
    20      2018-03-03
    10      2018-03-02
    0       2018-03-01
    Name: Date, Length: 1240, dtype: datetime64[ns]
Now when I use
    df['Holiday'] = df['Date'].isin(holidays.DE(years=2021))
I get the correct True/False values for 2021, but as soon as I remove the years argument I get False for every row:
    df['Holiday'] = df['Date'].isin(holidays.DE())
Answer
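This works well to get Boolean values. holidays.DE() without years starts out empty and only fills in a year when a single date is looked up with in, so isin has nothing to match against. Passing the years explicitly fills the calendar up front: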
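    from datetime import date
    import holidays
    de_holidays = holidays.DE()
    # Single-date check: date(2021, 7, 22) in de_holidays
    # List every year present in the data so the holiday calendar is filled before isin runs.
    rslt_df['Holiday'] = rslt_df['Date'].isin(holidays.DE(years=[2018, 2019, 2020, 2021]))
    rslt_df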