I have this very simple problem, but somehow I have not found a solution for it yet:
I have two curves, A1 = [1, 2, 3] and A2 = [4, 5, 6].
I want to fit those curves to another curve, B1 = [4, 5, 3], with linear regression, so that B1 = a*A1 + b*A2.
This can easily be done with sklearn's LinearRegression, but sklearn does not give you the standard deviation of your fitting parameters.
I tried using statsmodels, but somehow I can't get the format right:
import numpy as np
import statsmodels.api as sm

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([4, 5, 3])

ols = sm.OLS(a, b)
Error: ValueError: endog and exog matrices are different sizes
Answer
If your formula is B1 = a*A1 + b*A2, then the array b is your endogenous variable and the array a contains your exogenous variables. You need to transpose your exogenous array:
ols = sm.OLS(b, a.T)
res = ols.fit()
res.summary()

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.250
Model:                            OLS   Adj. R-squared:                 -0.500
Method:                 Least Squares   F-statistic:                    0.3333
Date:                Sat, 14 Aug 2021   Prob (F-statistic):              0.667
Time:                        05:48:08   Log-Likelihood:                -3.2171
No. Observations:                   3   AIC:                             10.43
Df Residuals:                       1   BIC:                             8.631
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1            -2.1667      1.462     -1.481      0.378     -20.749      16.416
x2             1.6667      0.624      2.673      0.228      -6.257       9.590
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   3.000
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.531
Skew:                           0.707   Prob(JB):                        0.767
Kurtosis:                       1.500   Cond. No.                         12.3
==============================================================================
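You also don't have to read the standard deviations off the summary table: the fitted results object exposes the coefficients and their standard errors as attributes. A minimal sketch, continuing from the arrays a and b above:

res = sm.OLS(b, a.T).fit()
print(res.params)   # fitted coefficients, roughly [-2.1667, 1.6667]
print(res.bse)      # standard errors of the coefficients, roughly [1.462, 0.624]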
From sklearn:
from sklearn.linear_model import LinearRegression

reg = LinearRegression(fit_intercept=False).fit(a.T, b)
reg.coef_
array([-2.16666667,  1.66666667])
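If you want to stay with sklearn, the standard errors can also be computed by hand from the design matrix under the same no-intercept model. A minimal sketch (the variable names are just illustrative), which should reproduce the std err column of the statsmodels summary:

import numpy as np

X = a.T                                  # design matrix, shape (3, 2)
y = b
beta = reg.coef_                         # coefficients fitted by sklearn

resid = y - X @ beta                     # residuals of the fit
dof = X.shape[0] - X.shape[1]            # degrees of freedom: n - p = 3 - 2 = 1
sigma2 = resid @ resid / dof             # estimated residual variance
cov = sigma2 * np.linalg.inv(X.T @ X)    # covariance matrix of the coefficients
std_err = np.sqrt(np.diag(cov))          # roughly [1.462, 0.624], matching statsmodels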