I would like to add a column to my dataframe that names the period of the day, based on the hour of each row.
For this, I am attempting the following:
day_period = []
for index,row in df.iterrows():
    hour_series = row["hour"]
    # Morning = 04:00-10:00
    #if hour_series >= 4 and hour_series < 10:
    if 4 >= hour_series < 10:
        day_period_str = "Morning"
        day_period.append(day_period_str)
    # Day = 10:00-16:00
    #if hour_series >= 10 and hour_series < 16:
    if 10 >= hour_series < 16:
        day_period_str = "Day"
        day_period.append(day_period_str)
    # Evening = 16:00-22:00
    #if hour_series >= 16 and hour_series < 22:
    if 16 >= hour_series < 22:
        day_period_str = "Evening"
        day_period.append(day_period_str)
    # Night = 22:00-04:00
    #if hour_series >= 22 and hour_series < 4:
    if 22 >= hour_series < 4:
        day_period_str = "Night"
        day_period.append(day_period_str)
However, when I check whether the length of my day_period list matches that of my dataframe (df), they differ, and they shouldn't. I can't spot the mistake. How can I fix the code?
len(day_period)
>21882
len(df)
>25696
Here’s a preview of the data:
   timestamp                latitude   longitude  hour  weekday
0  2021-06-09 08:12:18.000  57.728867  11.949463  8     Wednesday
1  2021-06-09 08:12:18.000  57.728954  11.949368  8     Wednesday
2  2021-06-09 08:12:18.587  57.728867  11.949463  8     Wednesday
3  2021-06-09 08:12:18.716  57.728954  11.949368  8     Wednesday
4  2021-06-09 08:12:33.000  57.728905  11.949309  8     Wednesday
My end goal is to then add this list to the dataframe as a new column.
Answer
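After testing a bit, there are two problems. The 22-4 block cannot work as a single range, because no hour is both at least 22 and below 4, so Night has to be split into 00:00-04:00 and 22:00-24:00. Also, the comparisons were reversed: 4 >= hour_series < 10 means hour_series <= 4 and hour_series < 10, so I changed each >= to <=. Finally, I turned the separate if statements into an if/elif chain, so each row is appended at most once.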
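To see why the lengths differ, the conditions from the question can be evaluated for every hour. Python reads a chained comparison such as 4 >= hour < 10 as 4 >= hour and hour < 10. The following is only a small sketch; count_matches is a hypothetical helper and not part of the original code.

```python
def count_matches(hour):
    """Return how many of the question's four conditions are true for this hour."""
    conditions = [
        4 >= hour < 10,    # intended as Morning
        10 >= hour < 16,   # intended as Day
        16 >= hour < 22,   # intended as Evening
        22 >= hour < 4,    # intended as Night
    ]
    return sum(conditions)

# Number of labels that would be appended for each hour of the day.
matches = {hour: count_matches(hour) for hour in range(24)}
print(matches)
```

Because the question uses independent if statements, an hour that satisfies several conditions is appended several times, and an hour that satisfies none is skipped, so the list length drifts away from len(df).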
Using this code, it works as expected:
day_period = []
for index, row in df.iterrows():
    hour_series = row["hour"]
    # Night 1 = 00:00-04:00
    if 0 <= hour_series < 4:
        day_period_str = "Night"
        day_period.append(day_period_str)
    # Morning = 04:00-10:00
    elif 4 <= hour_series < 10:
        day_period_str = "Morning"
        day_period.append(day_period_str)
    # Day = 10:00-16:00
    elif 10 <= hour_series < 16:
        day_period_str = "Day"
        day_period.append(day_period_str)
    # Evening = 16:00-22:00
    elif 16 <= hour_series < 22:
        day_period_str = "Evening"
        day_period.append(day_period_str)
    # Night 2 = 22:00-24:00
    elif 22 <= hour_series < 24:
        day_period_str = "Night"
        day_period.append(day_period_str)

print(len(df))
print(len(day_period))  # they should match now
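As an alternative to building the list in a loop, the same ranges can be put in a small function that always returns exactly one label per hour. This is a sketch under the assumption that the hour column holds integers from 0 to 23; hour_to_period is a hypothetical name, not something from the original code.

```python
def hour_to_period(hour):
    """Map an hour (0-23) to exactly one period label."""
    if 4 <= hour < 10:
        return "Morning"
    if 10 <= hour < 16:
        return "Day"
    if 16 <= hour < 22:
        return "Evening"
    # Everything else (22:00-24:00 and 00:00-04:00) counts as Night.
    return "Night"

# Quick check on a few sample hours.
sample_hours = [0, 3, 4, 8, 10, 15, 16, 21, 22, 23]
sample_periods = [hour_to_period(h) for h in sample_hours]
print(sample_periods)
```

With pandas, this should allow the column to be created directly with df["day_period"] = df["hour"].map(hour_to_period), which keeps the result the same length as df because every row receives one value.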