Code:
from sklearn.linear_model import LogisticRegression

l = LogisticRegression()
b = l.fit(XT, Y)
print("coeff", b.coef_)
print("intercept", b.intercept_)
Here’s the dataset:
XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
Result:
coeff [[ 0.00850441]]
intercept [-0.15184511]
I then entered the same data into SPSS (Analyse -> Regression -> Binary Logistic Regression), setting Y as the dependent variable and XT as a covariate. The results weren’t even close. Am I missing something in Python or in SPSS?
Answer
SPSS logistic regression does not include parameter regularisation in its cost function; it performs plain maximum-likelihood logistic regression. scikit-learn’s LogisticRegression, by contrast, applies L2 regularisation by default: a penalty term is added to the cost function to prevent overfitting, and you control its strength through the inverse regularisation parameter C. If you set C to a very high value, the result will closely mimic SPSS. There is no magic number; just set C as high as you can, and the regularisation becomes negligible.
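A minimal sketch of the difference, using the data from the question (the value 1e9 for C is an arbitrary illustrative choice, not a special constant):

```python
from sklearn.linear_model import LogisticRegression

# Data from the question.
XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Default: C=1.0, i.e. noticeable L2 regularisation.
regularised = LogisticRegression().fit(XT, Y)

# Very large C -> negligible regularisation, closer to SPSS's
# plain maximum-likelihood logistic regression.
unregularised = LogisticRegression(C=1e9).fit(XT, Y)

print("C=1.0: coeff", regularised.coef_, "intercept", regularised.intercept_)
print("C=1e9: coeff", unregularised.coef_, "intercept", unregularised.intercept_)
```

In recent scikit-learn versions you can also pass `penalty=None` to turn regularisation off entirely, rather than approximating it with a large C.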