I have a pandas data-frame of the form
JavaScript
x
13
13
1
date N
2
0 2022-06-14 15:00:00 54
3
1 2022-06-14 15:03:00 55
4
2 2022-06-14 15:09:00 56
5
3 2022-06-14 15:13:00 54
6
4 2022-06-14 15:19:00 56
7
8
2793 2022-07-03 16:09:00 1266
9
2794 2022-07-03 16:14:00 1249
10
2795 2022-07-03 16:19:00 1244
11
2796 2022-07-03 16:25:00 1241
12
2797 2022-07-03 16:29:00 1250
13
and I want to plot the last 7 days together with the average over the weekdays.
I can create / plot the average by using
JavaScript
1
9
1
# Round to 10 min and create weekday avg's
2
avg_week = df.groupby(
3
[
4
df["date"].dt.weekday,
5
10 * round( (df["date"].dt.minute + 60*df["date"].dt.hour) / 10, 0 )
6
] ).mean()
7
8
avg_week.plot(legend=True, ax=ax, color='blue')
9
JavaScript
1
14
14
1
N
2
date date
3
0 390.0 8.666667
4
400.0 12.250000
5
410.0 23.000000
6
420.0 20.666667
7
430.0 29.333333
8
9
6 1210.0 2.000000
10
1220.0 3.000000
11
1230.0 2.000000
12
1250.0 1.000000
13
1260.0 0.000000
14
and I can create / plot the last 7 days by using
JavaScript
1
3
1
# Plot last 7 days (actually 6 days + minutes since midnight to start at a fresh day)
2
df[df['date'] > df["date"].max() - pd.Timedelta(days=6, minutes=60*datetime.now().hour + datetime.now().minute) ].plot(x='date', y='N', ax=ax)
3
but I fail to combine them to a single plot since the average uses weekday and minutes after midnight (actually idx of avg_week
) instead of a datetime obj.
What else I worked out so far is setting the weekdays manually
JavaScript
1
8
1
ax.set_xticklabels( ax.get_xticklabels(),
2
rotation=45,
3
ha='right',
4
rotation_mode='anchor')
5
6
ax.set_xticks(ticks=[0] + [ sum( [ avg_week.N[i].size for i in range(n+1) ] ) for n in range(7) ][:-1],
7
labels = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
8
but the problem of course persists.
Any ideas? Thanks a lot!
Advertisement
Answer
Consider assigning the weekday average as a separate new column with groupby.transform
. Then plot both columns on same x-axis in a time series plot.
JavaScript
1
21
21
1
# ASSIGN COLUMNS
2
df = (
3
df.assign(
4
weekday = lambda x: x["date"].dt.weekday,
5
ten_minute_group = lambda x: x["date"].dt.floor('10min'),
6
weekday_average = lambda x: x.groupby(["weekday", "ten_minute_group"]).transform("mean")
7
)
8
)
9
10
# FILTER DATA
11
sub = df[df['date'] > df["date"].max() - pd.Timedelta(
12
days=6, minutes=60*datetime.now().hour + datetime.now().minute
13
)
14
15
# PLOT SUBSET OF COLUMNS
16
(
17
sub.set_index("date")
18
.reindex(["N", "weekday_average"], axis="columns")
19
.plot(kind="line", ax=ax)
20
)
21