I am trying to add a trendline to a bar plot created with Plotly.
Code:
import plotly.express as px

fig = px.bar(count, x="date", y="count", trendline="ols")
fig.update_layout(
    xaxis_title="Date",
    yaxis_title="Count"
)
fig.show()
Error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-129-8b01de219d3c> in <module>
----> 1 fig = px.bar(count, x="date", y="count",trendline="ols")
      2 
      3 fig.update_layout(
      4     xaxis_title="Date",
      5     yaxis_title = "Count"

TypeError: bar() got an unexpected keyword argument 'trendline'
Here is the data
How can I add a trendline successfully to this plot?
Answer
px.bar has no trendline argument (px.scatter accepts one, bar charts do not), which is why the call fails with a TypeError. Since you're trying trendline="ols", I'm guessing you'd like to create a linear trendline. But looking at your data, a linear trendline might not be the best description of your dataset.
So you'll have to add a trendline yourself. You can still have your bar chart using go.Bar, but maybe consider displaying the trendline as a line and not as more bars.
A closer look into scikit-learn or statsmodels should be well worth your while regarding non-linear trends. One simple approach is to estimate a log-linear trend after recoding your dataset, that is, to fit a straight line to the logarithm of the counts and transform the fitted values back. You'll see that this 'captures' the exponential increase of your variable better than a simple linear trend does.
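To make the log-linear idea concrete, here is a small sketch that needs only NumPy. The helper names (linear_trend, log_linear_trend, rmse) are made up for illustration, and np.polyfit stands in for the scikit-learn regression used in the listings below. It assumes all counts are strictly positive, since the logarithm is undefined at zero. Keep in mind that the log-linear fit minimises the error on the log scale, so comparing errors on the original scale is only a rough check.

```python
import numpy as np


def linear_trend(y):
    """Least-squares straight line through y, evaluated at each position."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    return intercept + slope * x


def log_linear_trend(y):
    """Fit log(y) = a + b*x, then transform back to the original scale.

    Assumption: every value in y is strictly positive.
    """
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, np.log(y), 1)
    return np.exp(intercept + slope * x)


def rmse(y, y_hat):
    """Root-mean-square error on the original scale."""
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((y - np.asarray(y_hat)) ** 2)))


# Last twelve counts of the data set used in this answer.
counts = [140, 173, 163, 171, 183, 165, 189, 201, 230, 290, 311, 321]

print("linear RMSE:    ", rmse(counts, linear_trend(counts)))
print("log-linear RMSE:", rmse(counts, log_linear_trend(counts)))
```

Either returned array can be passed as the y values of a go.Scatter trace with mode='lines' to draw the trend over the bars.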
But is that good enough? I’ll let that decision be up to you. And as I’ve said, you should take a closer look at the linked resources.
Code for plot 1:
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go
import pandas as pd
import numpy as np

# data: 112 daily counts, 12.10.2019 through 31.01.2020
counts = [
    19, 12, 13, 18, 13, 19, 15, 14, 12, 6,
    15, 15, 12, 17, 13, 14, 11, 11, 11, 9,
    14, 15, 11, 13, 14, 14, 16, 16, 17, 13,
    14, 14, 12, 6, 14, 12, 16, 15, 19, 18,
    17, 17, 17, 17, 19, 15, 20, 21, 19, 18,
    22, 21, 21, 18, 21, 23, 22, 17, 25, 28,
    24, 26, 23, 23, 22, 26, 25, 24, 24, 24,
    24, 27, 26, 28, 28, 29, 34, 31, 38, 37,
    34, 45, 43, 44, 49, 47, 54, 49, 57, 62,
    65, 55, 67, 69, 72, 45, 89, 87, 90, 121,
    140, 173, 163, 171, 183, 165, 189, 201, 230, 290,
    311, 321,
]
# the dates are consecutive days, so they are generated instead of typed out;
# strftime keeps them as 'dd.mm.yyyy' strings, as in the original data
df = pd.DataFrame({
    'date': pd.date_range('2019-10-12', periods=len(counts), freq='D').strftime('%d.%m.%Y'),
    'count': counts,
})

Y = df['count']
X = df.index

# regression: scikit-learn expects a 2-D feature array, hence np.vstack
reg = LinearRegression().fit(np.vstack(X), Y)
df['bestfit'] = reg.predict(np.vstack(X))

# plotly figure setup: bars for the data, a line for the fitted trend
fig = go.Figure()
fig.add_trace(go.Bar(name='X vs Y', x=X, y=Y.values))
fig.add_trace(go.Scatter(name='line of best fit', x=X, y=df['bestfit'], mode='lines'))

# plotly figure layout
fig.update_layout(xaxis_title='X', yaxis_title='Y')

fig.show()
Code for plot 2:
from sklearn.linear_model import LinearRegression
import plotly.graph_objects as go
import pandas as pd
import numpy as np

# data: 112 daily counts, 12.10.2019 through 31.01.2020
counts = [
    19, 12, 13, 18, 13, 19, 15, 14, 12, 6,
    15, 15, 12, 17, 13, 14, 11, 11, 11, 9,
    14, 15, 11, 13, 14, 14, 16, 16, 17, 13,
    14, 14, 12, 6, 14, 12, 16, 15, 19, 18,
    17, 17, 17, 17, 19, 15, 20, 21, 19, 18,
    22, 21, 21, 18, 21, 23, 22, 17, 25, 28,
    24, 26, 23, 23, 22, 26, 25, 24, 24, 24,
    24, 27, 26, 28, 28, 29, 34, 31, 38, 37,
    34, 45, 43, 44, 49, 47, 54, 49, 57, 62,
    65, 55, 67, 69, 72, 45, 89, 87, 90, 121,
    140, 173, 163, 171, 183, 165, 189, 201, 230, 290,
    311, 321,
]
# the dates are consecutive days, so they are generated instead of typed out;
# strftime keeps them as 'dd.mm.yyyy' strings, as in the original data
df = pd.DataFrame({
    'date': pd.date_range('2019-10-12', periods=len(counts), freq='D').strftime('%d.%m.%Y'),
    'count': counts,
})

# log regression: fit a straight line to log(count)
df_log = pd.DataFrame({'X': df.index, 'Y': np.log(df['count'])})
df_log.set_index('X', inplace=True)

reg = LinearRegression().fit(np.vstack(df_log.index), df_log['Y'])
df_log['bestfit'] = reg.predict(np.vstack(df_log.index))

# back-transform the fitted values to the original scale;
# 'Y' holds the raw counts (the original listing applied np.exp to them by mistake)
df_new = pd.DataFrame({'X': df.index, 'Y': df['count'], 'trend': np.exp(df_log['bestfit'])})
df_new.set_index('X', inplace=True)

# plotly figure setup: bars for the data, a line for the fitted trend
fig = go.Figure()
fig.add_trace(go.Bar(name='X vs Y', x=df_new.index, y=df_new['Y']))
fig.add_trace(go.Scatter(name='line of best fit', x=df_new.index, y=df_new['trend'], mode='lines'))

# plotly figure layout
fig.update_layout(xaxis_title='X', yaxis_title='Y')

fig.show()