
Plotly: How to change the format of the values for the x axis?

I need to create a graph from data with Python.

I took inspiration from various websites and wrote this script:

import plotly.express as px
import plotly.graph_objs as go
import statsmodels.api as sm

value = [1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 8]
date = [ 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]

fig = px.scatter(x=date, y=value )
fig.add_trace(go.Scatter(x=date, y=value, mode='lines',name='MB Used' ))

trend = sm.OLS(value,sm.add_constant(date)).fit().fittedvalues

fig.add_traces(go.Scatter(x=date, y=trend,mode = 'lines', name='trendline'))
fig

This script generates this graph: [image]

For the x axis, I would like to display values like 2020-01-01-06:00, but when I change my list like this:

date = [ 2020-01-01-06:00, 2020-01-01-12:00, 2020-01-01-18:00, 2020-01-02-06:00, 2020-01-02-12:00, 2020-01-02-18:00, 2020-01-03-06:00, 2020-01-03-12:00, 2020-01-03-18:00, 2020-01-04-06:00, 2020-01-04-12:00 ]

The error is:

File "<ipython-input-13-4958920545c3>", line 6
    date = [ 2020-01-01-06:00, 2020-01-01-12:00, 2020-01-01-18:00, 2020-01-02-06:00, 2020-01-02-12:00, 2020-01-02-18:00, 2020-01-03-06:00, 2020-01-03-12:00, 2020-01-03-18:00, 2020-01-04-06:00, 2020-01-04-12:00 ]
                   ^
SyntaxError: invalid token

If I try this instead:

date = [ '2020-01-01-06:00', '2020-01-01-12:00', '2020-01-01-18:00', '2020-01-02-06:00', '2020-01-02-12:00', '2020-01-02-18:00', '2020-01-03-06:00', '2020-01-03-12:00', '2020-01-03-18:00', '2020-01-04-06:00', '2020-01-04-12:00' ]

The error is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-e06e438ca2eb> in <module>
     10 fig.add_trace(go.Scatter(x=date, y=value, mode='lines',name='MB Used' ))
     11 
---> 12 trend = sm.OLS(value,sm.add_constant(date)).fit().fittedvalues
     13 
     14 fig.add_traces(go.Scatter(x=date, y=trend,mode = 'lines', name='trendline'))

~/.local/lib/python3.6/site-packages/statsmodels/tools/tools.py in add_constant(data, prepend, has_constant)
    303         raise ValueError('Only implementd 2-dimensional arrays')
    304 
--> 305     is_nonzero_const = np.ptp(x, axis=0) == 0
    306     is_nonzero_const &= np.all(x != 0.0, axis=0)
    307     if is_nonzero_const.any():

<__array_function__ internals> in ptp(*args, **kwargs)

~/.local/lib/python3.6/site-packages/numpy/core/fromnumeric.py in ptp(a, axis, out, keepdims)
   2541         else:
   2542             return ptp(axis=axis, out=out, **kwargs)
-> 2543     return _methods._ptp(a, axis=axis, out=out, **kwargs)
   2544 
   2545 

~/.local/lib/python3.6/site-packages/numpy/core/_methods.py in _ptp(a, axis, out, keepdims)
    228 def _ptp(a, axis=None, out=None, keepdims=False):
    229     return um.subtract(
--> 230         umr_maximum(a, axis, None, out, keepdims),
    231         umr_minimum(a, axis, None, None, keepdims),
    232         out

TypeError: cannot perform reduce with flexible type

Could you please show me how to fix this?


Answer

The answer:

In the following code snippet I’ve replaced your dates with floats, following this approach to serialize timestamps. This way you can use your dates as input to sm.OLS and, with a few more steps, display them in the figure in your desired format.

The plot:

[image]

The details:

There are several reasons why the code you provided does not give the result you want. First of all, none of your attempts at constructing lists of date and time values is in a form that the functions you are applying can easily recognize. In date = [ '2020-01-01-06:00', '2020-01-01-12:00',...] you should replace the hyphen between the date and the time with a space to get ['2020-01-01 06:00', '2020-01-01 12:00'...] instead. But even with a more widely recognizable list of timestamps, statsmodels will, to my knowledge, not accept those in sm.OLS(); the TypeError in your traceback comes from add_constant trying to do arithmetic on an array of strings. And in the end, applying sensible labels to non-standard x-axis tickmarks can be one of the very few real challenges in plotly.
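As an alternative to editing the strings by hand, the original four-hyphen format can also be parsed with an explicit format string. This is a minimal sketch using only the standard library; the names raw, parsed and standard are illustrative and not part of the original script.

```python
import datetime as dt

# timestamps in the format from the question (date and time joined by a hyphen)
raw = ['2020-01-01-06:00', '2020-01-01-12:00', '2020-01-01-18:00']

# an explicit format tells strptime where the date ends and the time begins
parsed = [dt.datetime.strptime(s, '%Y-%m-%d-%H:%M') for s in raw]

# format back to the more widely recognized form with a space
standard = [d.strftime('%Y-%m-%d %H:%M') for d in parsed]
print(standard)
```

pandas' to_datetime accepts a format argument of the same kind, so the same idea should carry over to the DataFrame step in the code further down.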

Please note that the irregular spacing of the gridlines reflects the structure of your data. You’re missing observations for the timestamps that end with 00:00:00, which would be needed to represent a complete 24-hour cycle.
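To make the uneven spacing concrete, here is a small sketch that applies the same serial_date logic used in the code further down to the first four timestamps and prints the gaps between them. It uses plain datetime objects rather than a DataFrame; the names stamps, serials and gaps are only for illustration.

```python
import datetime as dt

def serial_date(date1):
    # days (with a fractional part) elapsed since 1899-12-30
    delta = date1 - dt.datetime(1899, 12, 30)
    return float(delta.days) + float(delta.seconds) / 86400

# first four observations: 06:00, 12:00, 18:00, then 06:00 the next day
stamps = [dt.datetime(2020, 1, 1, 6), dt.datetime(2020, 1, 1, 12),
          dt.datetime(2020, 1, 1, 18), dt.datetime(2020, 1, 2, 6)]

serials = [serial_date(d) for d in stamps]
gaps = [b - a for a, b in zip(serials, serials[1:])]

print(serials)  # [43831.25, 43831.5, 43831.75, 43832.25]
print(gaps)     # [0.25, 0.25, 0.5]
```

The overnight gap is twice as wide as the daytime gaps, which is why tick marks placed at the observations are not evenly spaced.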

The code:

# imports
import pandas as pd
import plotly.express as px
import plotly.graph_objs as go
import statsmodels.api as sm
import datetime as dt

# data
value = [1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 8]
date = [ 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
date_h = ['2020-01-01 06:00', '2020-01-01 12:00', '2020-01-01 18:00', '2020-01-02 06:00', '2020-01-02 12:00', '2020-01-02 18:00', '2020-01-03 06:00', '2020-01-03 12:00', '2020-01-03 18:00', '2020-01-04 06:00', '2020-01-04 12:00' ]

# organize data in a pandas dataframe
df = pd.DataFrame({'value':value,
                   'date':date,
                    'date_h':pd.to_datetime(date_h)})

# function to serialize irregular timestamps
def serial_date(date1):
    temp = dt.datetime(1899, 12, 30)    # Note, not 31st Dec but 30th!
    delta = date1 - temp
    return float(delta.days) + (float(delta.seconds) / 86400)

df['date_s'] = [serial_date(d) for d in df['date_h']]

# set up base figure
fig = px.scatter(x=df['date_s'], y=df['value'] )
fig.add_trace(go.Scatter(x=df['date_s'], y=df['value'], mode='lines',name='MB Used' ))

# setup for linear regression using sm.OLS
Y=df['value']
independent=['date_s']
X=df[independent]
X=sm.add_constant(X)

# estimate trend
trend = sm.OLS(Y,X).fit().fittedvalues

# add trendline to figure
fig.add_traces(go.Scatter(x=df['date_s'], y=trend,mode = 'lines', name='trendline'))

# specify tick0, tickvals and ticktext to achieve desired x-axis format
fig.update_layout(yaxis=dict(title=''),
                  xaxis=dict(title='',
                  tick0= df['date_s'].iloc[0],
                  tickvals= df['date_s'],
                  ticktext = df['date_h'])
                 )

fig.show()
User contributions licensed under: CC BY-SA