I have this very simple problem, but somehow I have not found a solution for it yet:
I have two curves, A1 = [1,2,3] and A2 = [4,5,6].
I want to fit those curves to another curve, B1 = [4,5,3], with linear regression, so that B1 = aA1 + bA2.
This can easily be done with sklearn's LinearRegression, but sklearn does not give you the standard deviation of your fitting parameters.
I tried using statsmodels, but somehow I can't get the format right:
import numpy as np

import statsmodels.api as sm

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([4, 5, 3])

ols = sm.OLS(a, b)
Error: ValueError: endog and exog matrices are different sizes
Answer
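If your formula is B1 = aA1 + bA2, then the array b is your endogenous (dependent) variable and the array a is your exogenous (independent) variable. sm.OLS takes the endogenous variable first, and the exogenous array needs one row per observation and one column per regressor, so you need to transpose it: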
ols = sm.OLS(b, a.T)
res = ols.fit()
res.summary()

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.250
Model:                            OLS   Adj. R-squared:                 -0.500
Method:                 Least Squares   F-statistic:                    0.3333
Date:                Sat, 14 Aug 2021   Prob (F-statistic):              0.667
Time:                        05:48:08   Log-Likelihood:                -3.2171
No. Observations:                   3   AIC:                             10.43
Df Residuals:                       1   BIC:                             8.631
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1            -2.1667      1.462     -1.481      0.378     -20.749      16.416
x2             1.6667      0.624      2.673      0.228      -6.257       9.590
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   3.000
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.531
Skew:                           0.707   Prob(JB):                        0.767
Kurtosis:                       1.500   Cond. No.                         12.3
==============================================================================
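sklearn gives the same coefficients, provided you pass fit_intercept=False so that the model matches B1 = aA1 + bA2 without an intercept:

from sklearn.linear_model import LinearRegression
reg = LinearRegression(fit_intercept=False).fit(a.T, b)
reg.coef_

array([-2.16666667,  1.66666667])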
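If you would rather stay with sklearn, the standard errors can also be computed by hand from the usual OLS formula, where the coefficient covariance is the residual variance times the inverse of XᵀX. The sketch below uses plain NumPy (`np.linalg.lstsq`) to get the coefficients so it is self-contained; substituting `reg.coef_` for `beta` should give the same result. It assumes the classical nonrobust covariance, as in the statsmodels summary above.

```python
import numpy as np

# Design matrix: one row per observation, one column per curve (A1, A2).
X = np.array([[1, 2, 3], [4, 5, 6]]).T
y = np.array([4, 5, 3])

# Least-squares coefficients (no intercept).
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

# Residual variance with n - k degrees of freedom.
residuals = y - X @ beta
dof = X.shape[0] - X.shape[1]
sigma2 = residuals @ residuals / dof

# Covariance of the coefficients and their standard errors.
cov = sigma2 * np.linalg.inv(X.T @ X)
std_errs = np.sqrt(np.diag(cov))

print(beta)
print(std_errs)
```

With only three observations and two parameters there is a single residual degree of freedom, which is why the confidence intervals in the summary are so wide.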