How can I format thousands of date-dependent data points in a readable graph with Matplotlib?

I have two corresponding lists, one of dates and the other of their respective prices. Each list is 30,000 data points long.

x = [datetime.date(1997, 8, 8), datetime.date(2021, 8, 17), datetime.date(2019, 8, 7), ... ]

y = [0.49, 1.99, 0.0, ...]

I’m using this code to generate the graph below:

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=5))
plt.scatter(x,y)
plt.gcf().autofmt_xdate()
plt.show()

[Image: the resulting scatter plot]

The code above is supposed to space the dates out in a readable format, but it doesn’t seem to work. I’m also not sure whether the graph can be made more readable, or whether this is just the nature of a scatter plot with so many data points (I’m not sure what else to use).

The dates range from 1997 to 2022 and the prices from 0.0 (free) to 300.

Also, this error keeps popping up even while the graph is showing, and I have no idea what it means:

Locator attempting to generate 1995 ticks ([9591.0, …, 19561.0]), which exceeds Locator.MAXTICKS (1000).

Answer
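
The tick count in the warning follows from simple arithmetic. Here is a minimal sketch of it, under two assumptions: that the two numbers in the message are Matplotlib date values (days since the 1970-01-01 epoch, the default since Matplotlib 3.3), and that a day-based locator places one tick every `interval` days across that span. `estimated_ticks` is a hypothetical helper for illustration, not part of Matplotlib:

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # assumed Matplotlib default date epoch


def estimated_ticks(first, last, interval):
    """Rough count of ticks placed every `interval` days from first to last."""
    return int(last - first) // interval + 1


first, last = 9591.0, 19561.0  # endpoints quoted in the warning

# Calendar dates the endpoints correspond to under the assumed epoch
print(EPOCH + timedelta(days=first), EPOCH + timedelta(days=last))

print(estimated_ticks(first, last, 5))    # 1995, the count in the warning
print(estimated_ticks(first, last, 365))  # 28, roughly one tick per year
```

With an interval of 5 days the locator has to draw close to two thousand ticks, which is why the labels overlap and the MAXTICKS limit is exceeded.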

If you want your graph to be more readable, increase the interval of the DayLocator object so that far fewer ticks are drawn; for the 30,000-day sample data below, an interval of at least 6000 works. You can also change the size of the scatter points with the s parameter of the scatter function:

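import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime
import matplotlib.dates as mdates

# Sample data: 30,000 consecutive days starting today
x = pd.date_range(datetime.today(), periods=30000).tolist()
y = list(range(30000))

# One major tick every 6000 days keeps the tick count far below Locator.MAXTICKS
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=6000))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.gcf().autofmt_xdate()
# s sets the marker size; a small value keeps dense points distinguishable
plt.scatter(x, y, s=0.1)
plt.show()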

Output:

[Image: the resulting scatter plot]
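
For the date range in the question (1997 to 2022), a year-based locator may be a more natural fit than a very large day interval. The following is only a sketch under stated assumptions: the data is randomly generated as a stand-in for the real lists, and the five-year tick spacing, marker size, and alpha value are arbitrary choices to tune.

```python
import random
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Made-up stand-in data: 30,000 random dates in 1997-2022, prices in 0-300
random.seed(0)
start = date(1997, 1, 1)
span_days = (date(2022, 12, 31) - start).days
x = [start + timedelta(days=random.randint(0, span_days)) for _ in range(30000)]
y = [random.uniform(0.0, 300.0) for _ in range(30000)]

fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(x, y, s=1, alpha=0.3)  # small, translucent markers reduce overplotting

# One major tick every 5 years, labelled with the year only
ax.xaxis.set_major_locator(mdates.YearLocator(base=5))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
fig.autofmt_xdate()

# Draw once so the locator runs, then count the ticks it produced
fig.canvas.draw()
tick_count = len(ax.get_xticks())
print(tick_count)
```

Because the locator works in years rather than days, the tick count stays in the single digits for this range, well under Locator.MAXTICKS. Call plt.show() or fig.savefig(...) afterwards to view the result.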

User contributions licensed under: CC BY-SA