
Scikit-learn: Confused between coefficient of X0 and intercept

I have an extra column in my train/test feature matrix X that is just 1. This is supposed to be X0, which is never in the dataset, and its coefficient is written as θ0 in the equation:

$$Y = \theta_0 + \theta_1 X_1$$

Now, coming to the intercept: as a model parameter, I always knew this to be θ0. So I am a little confused about which value is the first coefficient. I am confused between the 0 in [0. 144.345] and [55547.458] from the code:

    # model coefficients
    print(lin_reg.coef_)
    # >>> [[0.    144.345]]
    # model intercept
    print(lin_reg.intercept_)
    # >>> [55547.458]

Thanks in advance


Answer

.coef_ returns the weights of the model, and the number of weights equals the number of features in your dataset. From the output in your question, you have 2 features in your dataset, so you get 2 weights: one with value 0. and the other with value 144.345. The 0. is the weight on your constant column of ones. Because LinearRegression fits the intercept separately by default (fit_intercept=True), a constant column carries no extra information and its weight comes out as 0.

On the other hand, .intercept_ returns the Y intercept (bias) of your model. This is what you are referring to as θ0, and here it has the value 55547.458.

Example Code:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Two features per sample; the target is y = 3 + 1 * x_0 + 2 * x_1
    X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(X, np.array([1, 2])) + 3

    reg = LinearRegression().fit(X, y)

    print(reg.coef_)
    # Output: [1. 2.]
    print(reg.intercept_)
    # Output: approximately 3.0 (up to floating-point rounding)

    # y = θ0 + θ1 * x_0 + θ2 * x_1  =>  y = 3 + 1 * x_0 + 2 * x_1
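
To connect this back to the question's setup, here is a small sketch with a made-up dataset (the values 5 and 3 and the variable names are assumptions chosen for illustration, not the asker's data). It includes an explicit column of ones and fits the model twice: once with the default fit_intercept=True, and once with fit_intercept=False. It suggests that the intercept θ0 lands in .intercept_ in the first case and on the ones column in .coef_ in the second.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y = 5 + 3 * x1, with an explicit column of ones as x0.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x1), x1])
y = 5.0 + 3.0 * x1

# Default behaviour (fit_intercept=True): the intercept is estimated
# separately, so the constant ones column receives a weight of ~0.
with_intercept = LinearRegression().fit(X, y)
print(with_intercept.coef_)       # weight on ones column ~0, weight on x1 ~3
print(with_intercept.intercept_)  # ~5, this plays the role of θ0

# With fit_intercept=False the model has no separate intercept term,
# so the weight on the ones column takes over the role of θ0.
no_intercept = LinearRegression(fit_intercept=False).fit(X, y)
print(no_intercept.coef_)         # ~[5, 3]
print(no_intercept.intercept_)    # 0.0
```

So in the question's output, 55547.458 is θ0 and 144.345 is θ1. The leading 0. is just the unused weight on the ones column, which could be dropped from X when fit_intercept is left at its default.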