Wrong labels when plotting a time series pandas dataframe with matplotlib

I am working with a dataframe containing data of 1 week.

                          y
ds                             
2017-08-31 10:15:00    1.000000
2017-08-31 10:20:00    1.049107
2017-08-31 10:25:00    1.098214
...
2017-09-07 10:05:00   99.901786
2017-09-07 10:10:00   99.950893
2017-09-07 10:15:00  100.000000

I create a new index by combining the weekday and the time, i.e.

                y
dayIndex             
4 - 10:15    1.000000
4 - 10:20    1.049107
4 - 10:25    1.098214
...
4 - 10:05   99.901786
4 - 10:10   99.950893
4 - 10:15  100.000000

The plot of this data is the following:

[Figure: Weekly data]

The plot is correct, as the labels reflect the data in the dataframe. However, when zooming in, the labels no longer correspond to their original values:

[Figure: Wrong labels when zooming]

What is causing this behavior?

Here is the code to reproduce this:

import datetime
import numpy as np
import pandas as pd

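dtnow = datetime.datetime.now()
# One week of timestamps in 5-minute steps ('5min' is the non-deprecated spelling of '5T')
dindex = pd.date_range(dtnow, dtnow + datetime.timedelta(7), freq='5min')
data = np.linspace(1, 100, num=len(dindex))
df = pd.DataFrame({'ds': dindex, 'y': data})
df = df.set_index('ds')
df = df.resample('5min').mean()
# Replace the datetime index with strings of the form "weekday - HH:MM"
df['dayIndex'] = df.index.strftime('%w - %H:%M')
df = df.set_index('dayIndex')
df.plot()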

Answer

“What is causing this behavior?”

The formatter on the x-axis of this pandas plot is a matplotlib.ticker.FixedFormatter (see e.g. print(plt.gca().xaxis.get_major_formatter())). “Fixed” means that it labels the i-th tick (if shown) with a constant string, regardless of where on the axis that tick is located.

When zooming or panning, you shift the tick locations, but not the format strings.
In short: A pandas date plot may not be the best choice for interactive plots.
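To make the mechanism concrete, here is a minimal sketch that uses a FixedFormatter on its own, without any plot. The three label strings are taken from the question's dataframe; the x values 0 and 500 are made-up tick locations standing in for "before zooming" and "after zooming".

```python
from matplotlib.ticker import FixedFormatter

# Labels as they would be fixed when the plot is first drawn
labels = ["4 - 10:15", "4 - 10:20", "4 - 10:25"]
fmt = FixedFormatter(labels)

# A formatter is called as fmt(x, pos): x is the tick location in data
# coordinates, pos is the index of the tick among the visible ticks.
first_label = fmt(0, 0)         # first tick sits at x = 0

# After zooming, the first visible tick may sit at x = 500 (hypothetical),
# but it is still tick number 0, so it receives the same string.
label_after_zoom = fmt(500, 0)

print(first_label, "|", label_after_zoom)
```

Both calls return the same string because FixedFormatter only looks at pos and ignores x. That is why the labels stop matching the data once the ticks move.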

Solution

A solution is usually to use matplotlib formatters directly. This requires the dates to be datetime objects (which can be ensured using df.index.to_pydatetime()).

import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates

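dtnow = datetime.datetime.now()
dindex = pd.date_range(dtnow, dtnow + datetime.timedelta(7), freq='110min')
data = np.linspace(1, 100, num=len(dindex))
df = pd.DataFrame({'ds': dindex, 'y': data})
df = df.set_index('ds')

# to_pydatetime() returns a new array and does not modify df, so pass its
# result to matplotlib directly; the x-axis then uses matplotlib date units
fig, ax = plt.subplots()
ax.plot(df.index.to_pydatetime(), df['y'], marker="o", label='y')
ax.legend()

# DateFormatter computes each label from the tick's date, so it stays correct when zooming
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%w - %H:%M'))
plt.show()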
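If you would rather keep the string labels from the question, another option is to plot against integer positions and look the label up from the tick location with a matplotlib.ticker.FuncFormatter. This is only a sketch under some assumptions: the start date is hard-coded to match the question's sample data, the helper name label_for_position is made up, and the Agg backend is selected just so the snippet runs without a display (drop that line for interactive use).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive zooming
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter

# One week of 5-minute data, starting at the timestamp shown in the question
dindex = pd.date_range("2017-08-31 10:15", periods=2017, freq="5min")
df = pd.DataFrame({"y": np.linspace(1, 100, num=len(dindex))}, index=dindex)

# The same "weekday - HH:MM" strings the question uses as its index
labels = df.index.strftime("%w - %H:%M")

def label_for_position(x, pos=None):
    """Return the label of the row nearest to tick location x (hypothetical helper)."""
    i = int(round(x))
    return labels[i] if 0 <= i < len(labels) else ""

fig, ax = plt.subplots()
# Plot against row numbers 0..N-1 so that a tick location is a row position
ax.plot(np.arange(len(df)), df["y"].to_numpy())
ax.xaxis.set_major_formatter(FuncFormatter(label_for_position))
fig.savefig("weekly.png")
```

Because the label is computed from x on every draw, it follows the ticks when you zoom or pan. Ticks that fall outside the data range get an empty label.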
