I have a pandas dataframe df:
    times = pd.date_range(start="2018-09-09", end="2020-02-02")
    values = np.random.rand(512)

    # Make df
    df = pd.DataFrame({'Time': times, 'Value': values})
And I can plot this easily using plt.plot:
But now I want to add a trendline. I tried using some answers:
How can I draw scatter trend line on matplot? Python-Pandas
Which doesn’t work:
TypeError: unsupported operand type(s) for +: 'datetime.datetime' and 'float'
Then I found the following question and answer:
TypeError: ufunc subtract cannot use operands with types dtype(‘<M8[ns]’) and dtype(‘float64’)
But these don’t work either. That is where my understanding of the issue ends, and I can’t find anything else.
My code so far:
    # Get values for the trend line analysis
    x = df['Time'].dt.to_pydatetime()

    # Calculate a fit line
    trend = np.polyfit(x, df['Value'], 1)
    fit = np.poly1d(trend)

    # General plot again
    figure(figsize=(12, 8))
    plt.plot(x, df['Value'])
    plt.xlabel('Date')
    plt.ylabel('Value')

    # Now trendline
    plt.plot(x, fit(x), "r--")

    # And show
    plt.show()
Answer
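np.polyfit needs numeric x values, so it cannot fit against datetime objects directly. One approach is to convert the dates to numbers with matplotlib’s date2num() function, fit on those numbers, and convert back with its counterpart num2date() where needed: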
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np
    import matplotlib.dates as dates

    np.random.seed(123)
    times = pd.date_range(start="2018-09-09", end="2020-02-02")
    values = np.random.rand(512)
    df = pd.DataFrame({'Time': times, 'Value': values})

    # Get values for the trend line analysis
    x_dates = df['Time']
    x_num = dates.date2num(x_dates)

    # Calculate a fit line
    trend = np.polyfit(x_num, df['Value'], 1)
    fit = np.poly1d(trend)

    # General plot again
    #figure(figsize=(12, 8))
    plt.plot(x_dates, df['Value'])
    plt.xlabel('Date')
    plt.ylabel('Value')

    # Not really necessary to convert the values back into dates
    # but added as a demonstration in case one wants to plot non-linear curves
    x_fit = np.linspace(x_num.min(), x_num.max())
    plt.plot(dates.num2date(x_fit), fit(x_fit), "r--")

    # And show
    plt.show()
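As a variation on the same idea, the dates can also be turned into numbers without matplotlib, by counting days from the first timestamp with numpy's datetime64 arithmetic. The sketch below is only an illustration of that alternative, not the answer's method: the helper name linear_trend is hypothetical, and the data is generated with numpy's default_rng rather than the np.random.seed call used above, so the random values differ.

```python
import numpy as np


def linear_trend(times, values):
    """Fit values ~ slope * days + intercept, with days counted from times[0].

    `linear_trend` is a hypothetical helper, not part of numpy or matplotlib.
    """
    times = np.asarray(times, dtype="datetime64[ns]")
    # Subtracting two datetime64 values gives a timedelta64; dividing by
    # one day turns that into plain float64 day counts that polyfit accepts.
    days = (times - times[0]) / np.timedelta64(1, "D")
    slope, intercept = np.polyfit(days, values, 1)
    return slope, intercept, days


# Same date range as in the question: 512 daily timestamps
# (the stop value of np.arange is exclusive, hence 2020-02-03).
times = np.arange("2018-09-09", "2020-02-03", dtype="datetime64[D]")
rng = np.random.default_rng(123)
values = rng.random(times.size)

slope, intercept, days = linear_trend(times, values)
trend_values = slope * days + intercept  # one trend value per timestamp
```

Because trend_values has one entry per original timestamp, it should be possible to draw it on the date axis with something like plt.plot(df['Time'], trend_values, "r--"), and the slope then reads directly as change in Value per day.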