I would like to label each row of my dataframe with the period of the day, based on its hour value.
For this, I am attempting the following:
day_period = []
for index,row in df.iterrows():
    hour_series = row["hour"]
    # Morning = 04:00-10:00
    #if hour_series >= 4 and hour_series < 10:
    if 4 >= hour_series < 10:
        day_period_str = "Morning"
        day_period.append(day_period_str)
    # Day = 10:00-16:00
    #if hour_series >= 10 and hour_series < 16:
    if 10 >= hour_series < 16:
        day_period_str = "Day"
        day_period.append(day_period_str)
    # Evening = 16:00-22:00
    #if hour_series >= 16 and hour_series < 22:
    if 16 >= hour_series < 22:
        day_period_str = "Evening"
        day_period.append(day_period_str)
    # Night = 22:00-04:00
    #if hour_series >= 22 and hour_series < 4:
    if 22 >= hour_series < 4:
        day_period_str = "Night"
        day_period.append(day_period_str)
However, when double-checking if the length of my day_period list is the same as that of my dataframe (df)… they differ and they shouldn’t. I can’t spot the mistake. How can I fix the code?
len(day_period)
> 21882
len(df)
> 25696
Here’s a preview of the data:
                 timestamp   latitude  longitude  hour    weekday
0  2021-06-09 08:12:18.000  57.728867  11.949463     8  Wednesday
1  2021-06-09 08:12:18.000  57.728954  11.949368     8  Wednesday
2  2021-06-09 08:12:18.587  57.728867  11.949463     8  Wednesday
3  2021-06-09 08:12:18.716  57.728954  11.949368     8  Wednesday
4  2021-06-09 08:12:33.000  57.728905  11.949309     8  Wednesday
My end goal is to then append this list to the dataframe.
Answer
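After testing a bit, it seems there are two issues. First, the 22-4 block can never match, because no hour is both at least 22 and below 4. Splitting it into two ranges (00:00-04:00 and 22:00-24:00) fixes this.
Second, the comparisons were reversed: Python evaluates a chained comparison such as 4 >= hour_series < 10 as 4 >= hour_series and hour_series < 10, which tests for hours up to 4 rather than hours from 4. So I changed each >= to <=, and I used elif so that each row appends exactly one label.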
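To see why the lengths differ, here is a small sketch that runs the question's four conditions for every hour and collects the labels each one would append. The helper name `original_matches` is made up for this illustration and is not part of the original code.

```python
def original_matches(hour):
    """Return the labels the question's four `if` tests would append for one hour."""
    labels = []
    if 4 >= hour < 10:      # evaluated as: hour <= 4 and hour < 10
        labels.append("Morning")
    if 10 >= hour < 16:     # evaluated as: hour <= 10 and hour < 16
        labels.append("Day")
    if 16 >= hour < 22:     # evaluated as: hour <= 16 and hour < 22
        labels.append("Evening")
    if 22 >= hour < 4:      # evaluated as: hour <= 22 and hour < 4
        labels.append("Night")
    return labels

# Number of labels appended per hour; anything other than 1 skews the list length.
counts = {hour: len(original_matches(hour)) for hour in range(24)}
print(counts)
```

Some hours append several labels and others append none, so the list length drifts away from the number of rows.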
Using this code, it works as expected:
day_period = []
for index,row in df.iterrows():
    hour_series = row["hour"]
    # Night 1 = 00:00-04:00
    #if hour_series >= 0 and hour_series < 4:
    if 0 <= hour_series < 4:
        day_period_str = "Night"
        day_period.append(day_period_str)
    # Morning = 04:00-10:00
    #if hour_series >= 4 and hour_series < 10:
    elif 4 <= hour_series < 10:
        day_period_str = "Morning"
        day_period.append(day_period_str)
    # Day = 10:00-16:00
    #if hour_series >= 10 and hour_series < 16:
    elif 10 <= hour_series < 16:
        day_period_str = "Day"
        day_period.append(day_period_str)
    # Evening = 16:00-22:00
    #if hour_series >= 16 and hour_series < 22:
    elif 16 <= hour_series < 22:
        day_period_str = "Evening"
        day_period.append(day_period_str)
    # Night 2 = 22:00-24:00
    #if hour_series >= 22 and hour_series < 24:
    elif 22 <= hour_series < 24:
        day_period_str = "Night"
        day_period.append(day_period_str)
print(len(df))
print(len(day_period)) # they should match now
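As an alternative to the if/elif chain, the same ranges could be stored as a lookup table, which guarantees exactly one label per hour. This is only a sketch: the names `PERIOD_STARTS`, `PERIOD_LABELS` and `period_of_day` are made up here, and the short `hours` list stands in for the df["hour"] column.

```python
from bisect import bisect_right

# Hour at which each period starts, and the label that applies from that hour on.
PERIOD_STARTS = [0, 4, 10, 16, 22]
PERIOD_LABELS = ["Night", "Morning", "Day", "Evening", "Night"]

def period_of_day(hour):
    """Map an integer hour in 0..23 to exactly one period label."""
    if not 0 <= hour < 24:
        raise ValueError(f"hour must be in 0..23, got {hour!r}")
    # bisect_right counts the start hours <= hour; the last such start wins.
    return PERIOD_LABELS[bisect_right(PERIOD_STARTS, hour) - 1]

hours = [8, 8, 3, 10, 16, 22, 23]  # stand-in for the values of df["hour"]
day_period = [period_of_day(h) for h in hours]
print(day_period)
print(len(day_period) == len(hours))  # one label per input hour
```

With the real dataframe, something like df["day_period"] = df["hour"].map(period_of_day) should attach the labels as a new column without an explicit loop, which also covers the end goal from the question.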