
How to order categorical month variable when plotting using matplotlib?

I am doing some topic modelling, and I am interested in showing how the average topic weight changes over time. The problem arises when I plot it using matplotlib (version 3.3.4). On the x-axis I would like to have the categorical month_year variable, but it is not ordered chronologically. As suggested in other Stack Overflow posts, I have tried to make sure that the dtype of the pandas column is an ordered categorical, using the following code:
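A minimal sketch of building such an ordered categorical is shown below. The label format (`Jan-2020` style, parsed with `%b-%Y`), the `weight` column, and the sample values are assumptions made for illustration, since the actual format of month_year is not shown here.

```python
import pandas as pd

# Hypothetical data: month_year labels in a "%b-%Y" style (an assumption).
df = pd.DataFrame({
    "month_year": ["Mar-2020", "Jan-2020", "Feb-2020"],
    "weight": [0.3, 0.1, 0.2],
})

# Derive the chronological order of the labels by parsing them as dates.
ordered_labels = sorted(
    df["month_year"].unique(),
    key=lambda s: pd.to_datetime(s, format="%b-%Y"),
)

# Attach that order to the column as an ordered categorical dtype.
month_type = pd.CategoricalDtype(categories=ordered_labels, ordered=True)
df["month_year"] = df["month_year"].astype(month_type)
```

Note that `astype` changes only the dtype, not the order of the rows, so the frame would still need something like `df.sort_values("month_year")` before the rows follow the category order.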


However, when I plot the average weight for the first 9 topics using the following code, the months are still all scrambled up.



Any ideas on how to solve this?

EDIT: The following can be used to create a test dataframe



Answer

A robust solution is to convert the month_year column from str to datetime and let pandas sort the values chronologically; there is no need for a custom CategoricalDtype:
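As a rough sketch of that conversion, assuming month_year holds strings such as `Jan-2020` (the `%b-%Y` format string, the `topic` and `weight` columns, and the sample values are all hypothetical and should be adapted to the real data):

```python
import pandas as pd

# Hypothetical frame with month_year stored as strings, deliberately out of order.
df = pd.DataFrame({
    "month_year": ["Mar-2020", "Jan-2020", "Feb-2020"],
    "topic": [0, 0, 0],
    "weight": [0.3, 0.1, 0.2],
})

# Parse the strings into real timestamps; the format string is an assumption.
df["month_year"] = pd.to_datetime(df["month_year"], format="%b-%Y")

# Timestamps sort chronologically without any custom ordering.
df = df.sort_values("month_year").reset_index(drop=True)
```

Once the column is a datetime dtype, both pandas and matplotlib treat it as a continuous time axis rather than as a set of unordered labels.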


So you have a dataframe like this:


Then you can plot with:


Here matplotlib.dates.MonthLocator and matplotlib.dates.DateFormatter let you customize the x-axis tick labels as you wish.
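One way the locator and formatter might be wired up is sketched below. The data, the single `weight` series, and the `%b %Y` label format are assumptions chosen for illustration; the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one average weight per month, month_year already datetime.
df = pd.DataFrame({
    "month_year": pd.date_range("2020-01-01", periods=6, freq="MS"),
    "weight": [0.10, 0.12, 0.08, 0.15, 0.11, 0.09],
})

fig, ax = plt.subplots()
ax.plot(df["month_year"], df["weight"], marker="o")

# Place one major tick per month and render each tick as e.g. "Jan 2020".
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

# Rotate the date labels so they do not overlap.
fig.autofmt_xdate()
fig.canvas.draw()
```

Raising `interval` in `MonthLocator` thins out the ticks for long time spans, and the format string passed to `DateFormatter` follows the usual `strftime` codes.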

Complete code



User contributions licensed under: CC BY-SA