
Pandas: Group by calendar-week, then plot grouped barplots for the real datetime

EDIT

I found a quite nice solution and posted it below as an answer. The result will look like this:

enter image description here


Some example data you can generate for this problem:

JavaScript

resulting in:

JavaScript
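Data of this shape can be sketched as follows. This is a minimal, hypothetical stand-in rather than the original example data: the column name `date`, the date range, and the values in `col1` are assumptions chosen so that the rows span the 2013/2014 year boundary.

```python
import pandas as pd

# Hypothetical example data (not the original post's data):
# one row per day from calendar week 47 of 2013 to week 3 of 2014.
dates = pd.date_range("2013-11-18", "2014-01-19", freq="D")

df = pd.DataFrame({
    "date": dates,
    # Assumed categorical column; values simply cycle through a, b, c.
    "col1": ["abc"[i % 3] for i in range(len(dates))],
})

print(df.head())
```

A deterministic `col1` is used here only so that the sketch gives the same counts on every run.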

I’d like to group by calendar-week and by value of col1. Like this:

JavaScript

resulting in:

JavaScript
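A grouping of this kind might look like the sketch below. It assumes a reasonably recent pandas (`Series.dt.isocalendar()` is available from pandas 1.1) and the hypothetical `date`/`col1` columns used for illustration; the helper column name `kw` is also an assumption.

```python
import pandas as pd

# Hypothetical data spanning the 2013/2014 year boundary.
dates = pd.date_range("2013-11-18", "2014-01-19", freq="D")
df = pd.DataFrame({
    "date": dates,
    "col1": ["abc"[i % 3] for i in range(len(dates))],
})

# ISO calendar week as a plain integer; note that the year is dropped here.
df["kw"] = df["date"].dt.isocalendar().week.astype(int)

# Number of rows per (calendar week, col1 value).
counts = df.groupby(["kw", "col1"]).size()

print(counts.head())
```

Because `kw` holds only the week number, weeks 1 to 3 of 2014 sort before weeks 47 to 52 of 2013 in the result.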

Then I want a plot like this: enter image description here. That is, calendar week and year (datetime) on the x-axis and, within each week, one bar per value of col1.

The problem I’m facing is that I only have integers describing the calendar week (KW in the plot), so I somehow have to merge the date back onto them to get ticks that are labeled with the year as well. Furthermore, I can’t plot just the grouped calendar weeks, because I need the items in the correct order: KW 47 and KW 48 (year 2013) have to be to the left of KW 1 (which belongs to 2014).


EDIT

I figured out from the docs (http://pandas.pydata.org/pandas-docs/stable/visualization.html#visualization-barplot) that grouped bars need to be columns instead of rows. So I looked for a way to transform the data and found the method pivot, which does exactly that. reset_index is needed to turn the MultiIndex into columns. At the end I fill NaNs with zero:

JavaScript

transforms the data into:

JavaScript

which looks like the example data in the docs to be plotted in grouped bars:

JavaScript

gets this:

enter image description here

However, the x-axis is now sorted from 1 to 52, which is wrong here because calendar week 52 belongs to the year 2013 in this case. Any ideas on how to merge the real datetime back onto the calendar weeks and use it for the x-axis ticks?
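The reshaping step and the resulting ordering problem can be reproduced with a minimal sketch. The data, the column names `date`, `col1`, `kw`, and the count column `n` are assumptions for illustration, not the code from the question.

```python
import pandas as pd

# Hypothetical data spanning the 2013/2014 year boundary.
dates = pd.date_range("2013-11-18", "2014-01-19", freq="D")
df = pd.DataFrame({
    "date": dates,
    "col1": ["abc"[i % 3] for i in range(len(dates))],
})
df["kw"] = df["date"].dt.isocalendar().week.astype(int)

# Count rows per (week, col1), then turn the MultiIndex into columns.
counts = df.groupby(["kw", "col1"]).size().reset_index(name="n")

# One row per week, one column per col1 value; missing combinations become 0.
wide = counts.pivot(index="kw", columns="col1", values="n").fillna(0)

# The index is sorted numerically, so 2014's weeks come before 2013's.
print(wide.index.tolist())
```

Calling `wide.plot(kind="bar")` on this frame would draw the grouped bars, but in the numeric week order printed above rather than in chronological order.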


Answer

Okay, I’ll answer the question myself, as I finally figured it out. The key is not to group by calendar week alone (you would lose the information about the year) but to group by a string containing both calendar week and year.

Then reshape the data with pivot, as already mentioned in the question. The date will be the index. Use reset_index() to turn the current date index into a column and get an integer range as the index instead. That range is in the correct order for plotting: the lowest year/calendar week is index 0 and the highest year/calendar week is the highest integer.

Store the date column as a list in a new variable ticks and delete that column from the DataFrame. Now plot the bars and simply set the xtick labels to ticks. The complete solution is quite short:

JavaScript

RESULT:

enter image description here
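The steps described in this answer can be sketched roughly as follows. This is a reconstruction under assumptions, not the original code: the example data, the column names `date`, `col1`, `year_kw`, and `n`, and the label format `YYYY-KWww` are all hypothetical, and the ISO year from `dt.isocalendar()` is used so that days such as 2013-12-30 are assigned to week 1 of 2014.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to show the figure interactively
import pandas as pd

# Hypothetical data spanning the 2013/2014 year boundary.
dates = pd.date_range("2013-11-18", "2014-01-19", freq="D")
df = pd.DataFrame({
    "date": dates,
    "col1": ["abc"[i % 3] for i in range(len(dates))],
})

# Group key: a string containing ISO year and zero-padded ISO week,
# so that plain string sorting is also chronological.
iso = df["date"].dt.isocalendar()
df["year_kw"] = iso["year"].astype(str) + "-KW" + iso["week"].astype(str).str.zfill(2)

# Count, reshape to one column per col1 value, fill gaps with 0.
counts = df.groupby(["year_kw", "col1"]).size().reset_index(name="n")
wide = counts.pivot(index="year_kw", columns="col1", values="n").fillna(0)

# Move the key out of the index: the new integer index is in plotting order.
wide = wide.reset_index()
ticks = wide["year_kw"].tolist()
wide = wide.drop(columns="year_kw")

# Grouped bars, one group per week, labeled with year and week.
ax = wide.plot(kind="bar")
ax.set_xticklabels(ticks, rotation=45)
ax.set_xlabel("year / calendar week")
# ax.figure.savefig("grouped_bars.png") would write the plot to a file.
```

Zero-padding the week number matters in this sketch: without it, a label such as `2014-KW10` would sort before `2014-KW2`.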

User contributions licensed under: CC BY-SA