I am trying to model an equation that depends on T and the parameters xi, mu and sig.
I have inferred the parameters and their spread (standard deviation) for different durations (1h, 3h, etc.). In the example code, the parameters are for the 1h duration.
I need to write a for loop that creates a cloud of zp values from the arrays of xi, mu and sig. The values T can take are [2, 5, 25, 50, 75, 100].
I also want to show error bars or uncertainty using the standard deviation (the spread mentioned above). I used the Metropolis-Hastings algorithm to explore the parameter space, with 15000 iterations in 3 chains.
    xi = accepted[0:]
    xi = array([[-2.00000000e-01,  1.00000000e-01,  1.00000000e-01],
                [-2.06044711e-01,  1.51739593e-01,  1.36675069e-01],
                [-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                ...,
                [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                [-3.14226453e-02,  1.44844410e+01,  5.00637147e+00]])

    mu = accepted[1:]
    mu = array([[-2.06044711e-01,  1.51739593e-01,  1.36675069e-01],
                [-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                [-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                ...,
                [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                [-3.14226453e-02,  1.44844410e+01,  5.00637147e+00]])

    sig = accepted[2:]
    sig = array([[-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                 [-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                 [-2.91747418e-01,  1.10818827e-01,  1.80040639e-01],
                 ...,
                 [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                 [ 1.45611857e-02,  1.46824099e+01,  5.16110127e+00],
                 [-3.14226453e-02,  1.44844410e+01,  5.00637147e+00]])

    # spread of (xi, mu, sig)
    spread = accepted[:,0].std(), accepted[:,1].std(), accepted[:,2].std()

    def zp(T, xi = accepted[0:], mu = accepted[1:], sig= accepted[2:]):
        p = 1/T
        yp = - np.log10(1 - p)
        zp = np.ndarray(shape=(xi.size, T.size))
        for i in range(xi.size):
            if xi[i] == 0:
                zp[i,:] = mu[i] - (sig[i]*(np.log10(yp)))
            else:
                zp[i,:] = mu[i] - ((sig[i]/xi[i])*(1-(yp**(-xi[i]))))
        return zp

    # get results
    res = zp(T, xi, mu, sig)
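For reference, the zp function in the code evaluates the expression below. It is transcribed directly from the code, including the base-10 logarithm; the symbols \xi, \mu and \sigma stand for xi, mu and sig.

```latex
p = \frac{1}{T}, \qquad y_p = -\log_{10}(1 - p)

z_p =
\begin{cases}
\mu - \dfrac{\sigma}{\xi}\left(1 - y_p^{-\xi}\right), & \xi \neq 0 \\[6pt]
\mu - \sigma \,\log_{10}(y_p), & \xi = 0
\end{cases}
```

If the intent is the GEV return level, note that this formula is usually written with the natural logarithm rather than log10. The code is reproduced here as written.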
Answer
So, you have the (15000, 3) matrix accepted, where xi = accepted[:, 0], mu = accepted[:, 1] and sig = accepted[:, 2].
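The distinction matters because accepted[0:] is a row slice that returns the whole matrix, while accepted[:, 0] selects a single parameter column. A minimal sketch with a small made-up array (the values are placeholders, not real sampler output):

```python
import numpy as np

# Hypothetical stand-in for the sampler output: 5 iterations x 3 parameters
accepted = np.arange(15, dtype=float).reshape(5, 3)

rows_from_0 = accepted[0:]   # row slice: every iteration, all three columns
rows_from_1 = accepted[1:]   # row slice: only drops the first iteration
xi_column = accepted[:, 0]   # column slice: one parameter over all iterations

print(rows_from_0.shape)  # (5, 3)
print(rows_from_1.shape)  # (4, 3)
print(xi_column.shape)    # (5,)
```

With row slices, xi, mu and sig all end up as nearly identical copies of the full matrix, which is what the arrays printed in the question show.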
I will generate some sample data for xi, mu and sig, just to show you the results of plotting.
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # I will generate some sample data
    # You'll have to use your own data
    n = 15000
    np.random.seed(1)
    xi, mu, sig = (
        np.random.normal(loc=-0.15153068743678966, scale=0.2254333661580348, size=(n)),
        np.random.normal(loc=14.241861263759796, scale=2.6116567608814196, size=(n)),
        np.random.normal(loc=5.5779131542307345, scale=0.9627764065564182, size=(n)),
    )
You defined T as

    # define T steps
    T = np.array([2, 5, 25, 50, 75, 100])
Now we take the mean and standard deviation of each parameter

    xi_mean = xi.mean()
    mu_mean = mu.mean()
    sig_mean = sig.mean()

    xi_std = xi.std()
    mu_std = mu.std()
    sig_std = sig.std()
and define a function zp

    # function zp
    def zp(T, xi, mu, sig):
        p = 1 / T
        yp = - np.log10(1 - p)
        # ravel the inputs so scalars and arrays are handled alike
        _xi = xi.ravel()
        _mu = mu.ravel()
        _sig = sig.ravel()
        # one row per parameter sample, one column per T
        res = np.ndarray(shape=(_xi.size, T.size))
        for i in range(_xi.size):
            if _xi[i] == 0:
                res[i,:] = _mu[i] - (_sig[i]*(np.log10(yp)))
            else:
                res[i,:] = _mu[i] - ((_sig[i]/_xi[i])*(1-(yp**(-_xi[i]))))
        return res

    # get results
    res = zp(T, xi, mu, sig)
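The Python loop over 15000 samples can also be replaced by NumPy broadcasting. The following is only a sketch under the same log10 convention as the function above. The name zp_vectorized and the small input arrays are made up for illustration.

```python
import numpy as np

def zp_vectorized(T, xi, mu, sig):
    """Broadcasted variant of zp; returns an array of shape (n_samples, n_T)."""
    T = np.asarray(T, dtype=float)
    # Column vectors, so every parameter sample broadcasts against every T
    xi = np.atleast_1d(np.asarray(xi, dtype=float)).ravel()[:, None]
    mu = np.atleast_1d(np.asarray(mu, dtype=float)).ravel()[:, None]
    sig = np.atleast_1d(np.asarray(sig, dtype=float)).ravel()[:, None]

    yp = -np.log10(1 - 1 / T)  # same log10 convention as the loop version
    is_zero = xi == 0
    safe_xi = np.where(is_zero, 1.0, xi)  # placeholder avoids division by zero
    general = mu - (sig / safe_xi) * (1 - yp ** (-safe_xi))
    limit = mu - sig * np.log10(yp)
    # Pick the xi == 0 branch only for the rows where it applies
    return np.where(is_zero, limit, general)

# Usage with a few made-up parameter values
T = np.array([2, 5, 25, 50, 75, 100])
xi = np.array([-0.2, 0.0, 0.1])
mu = np.array([14.0, 15.0, 13.0])
sig = np.array([5.0, 6.0, 4.0])
out = zp_vectorized(T, xi, mu, sig)
print(out.shape)  # (3, 6)
```

Both branches are computed for every row and np.where selects between them, which trades a little extra arithmetic for the removal of the Python-level loop.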
We can define a DataFrame with all results

    # define results DataFrame
    df = pd.DataFrame(res, columns=T)
    print(df)

                 2          5          25         50         75         100
    0      24.610952  34.489626  54.614356  65.349657  72.376143  77.735341
    1      16.554362  20.033999  23.514591  24.524273  25.023313  25.342476
    2      23.468215  28.276272  33.212243  34.678420  35.410825  35.882346
    3      23.102447  26.089680  28.680803  29.339580  29.646593  29.835899
    4      21.021596  30.494043  45.594905  52.182941  56.105955  58.925041
    ...          ...        ...        ...        ...        ...        ...
    14995  22.964737  27.856439  33.039263  34.623438  35.425031  35.945247
    14996  21.371429  29.078696  47.122467  57.868230  65.281555  71.127181
    14997  18.534785  21.512996  24.424363  25.251344  25.656252  25.913699
    14998  19.915343  28.939309  43.440076  49.808611  53.612702  56.351668
    14999  20.835338  25.069838  29.829853  31.364291  32.159584  32.683499

    [15000 rows x 6 columns]
Now we compute zp with the mean +/- std of the parameters

    zp_mean = zp(T, xi_mean, mu_mean, sig_mean).ravel()
    zp_lo = zp(T, xi_mean-xi_std, mu_mean-mu_std, sig_mean-sig_std).ravel()
    zp_hi = zp(T, xi_mean+xi_std, mu_mean+mu_std, sig_mean+sig_std).ravel()
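Shifting all three parameters by one standard deviation in the same direction is not the same as the spread of zp itself. As an alternative sketch, assuming each row of the results array is one draw, the cloud can be summarised per T with percentiles. The array below is a random placeholder standing in for res, and the 2.5/50/97.5 levels are an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
T = np.array([2, 5, 25, 50, 75, 100])

# Hypothetical stand-in for the (n_samples, n_T) array returned by zp
res = rng.normal(loc=30.0, scale=5.0, size=(1000, T.size))

# 2.5th, 50th and 97.5th percentile of the cloud at each T
band_lo, band_mid, band_hi = np.percentile(res, [2.5, 50, 97.5], axis=0)

# Asymmetric error bars in the (2, n_T) layout that ax.errorbar accepts for yerr
yerr = np.vstack([band_mid - band_lo, band_hi - band_mid])
print(yerr.shape)  # (2, 6)
```

Here band_mid and yerr could be passed to ax.errorbar in place of zp_mean and the mean +/- std offsets.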
and we can finally plot the 15000 lines and the mean +/- std

    fig, ax = plt.subplots(figsize=(12, 5))
    ax.errorbar(
        T, zp_mean,
        yerr=[zp_mean-zp_lo, zp_hi-zp_mean],
        color='k', zorder=999, label='mean and std'
    )
    for i, col in enumerate(df.T.columns):
        _df = df.T[col]
        ax.plot(_df, lw=1, alpha=.01, color='r')
    ax.set(
        xlabel='duration',
        ylabel='value',
        # adjust this
        ylim=(10, 150)
    )
    ax.legend()
    plt.show()
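Drawing the lines one at a time in a Python loop is slow for 15000 samples. As a sketch, Matplotlib can draw the whole cloud in one call, because ax.plot treats each column of a 2D y array as a separate line. The array below is a random placeholder for res, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
T = np.array([2, 5, 25, 50, 75, 100])

# Hypothetical stand-in for the (n_samples, n_T) array returned by zp
res = rng.normal(loc=30.0, scale=5.0, size=(500, T.size))

fig, ax = plt.subplots(figsize=(12, 5))
# res.T has shape (n_T, n_samples): one line per column, all drawn in one call
lines = ax.plot(T, res.T, lw=1, alpha=0.01, color="r")
ax.set(xlabel="duration", ylabel="value")
print(len(lines))  # 500
```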
You could also choose a faster solution with seaborn.
First, melt the DataFrame

    melt_df = df.melt(var_name='duration')
    print(melt_df)

          duration      value
    0            2  24.610952
    1            2  16.554362
    2            2  23.468215
    3            2  23.102447
    4            2  21.021596
    ...        ...        ...
    89995      100  35.945247
    89996      100  71.127181
    89997      100  25.913699
    89998      100  56.351668
    89999      100  32.683499

    [90000 rows x 2 columns]
then plot with relplot and choose the confidence interval you want (here 99%; 'sd' is also possible)
    import seaborn as sns

    g = sns.relplot(
        kind='line',
        data=melt_df,
        x='duration', y='value',
        # seaborn >= 0.12 deprecates `ci`; the equivalent there is errorbar=('ci', 99)
        ci=99
    )
    g.axes.flat[0].set_title('Confidence Interval 99%')
    plt.show()