Code:
from sklearn.linear_model import LogisticRegression

l = LogisticRegression()
b = l.fit(XT, Y)
print("coeff", b.coef_)
print("intercept", b.intercept_)
Here’s the dataset:
XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
13
Result:
coeff [[ 0.00850441]]
intercept [-0.15184511]
Now I entered the same data in SPSS: Analyze -> Regression -> Binary Logistic. I set Y as the dependent variable and XT as a covariate. The results weren’t even close. Am I missing something in Python or SPSS?
Python-Sklearn
Answer
SPSS’s logistic regression does not include parameter regularisation in its cost function; it performs ‘raw’ (unregularised) logistic regression. With regularisation, the cost function includes a penalty term to prevent overfitting, and in scikit-learn you control its strength through the inverse parameter C. If you set C to a very high value, the result will closely mimic SPSS. There is no magic number: just set it as high as you can, and there will effectively be no regularisation.
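A minimal sketch of that suggestion, reusing the question’s XT and Y. The only change from the original code is passing a very large C to LogisticRegression so the penalty becomes negligible; the value 1e9 here is just an illustrative choice, not a special constant.

from sklearn.linear_model import LogisticRegression

XT = [[23], [24], [26], [21], [29], [31], [27], [24], [22], [23]]
Y = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# C is the inverse of the regularisation strength; a very large C
# effectively disables the penalty, approximating SPSS's plain logistic regression.
model = LogisticRegression(C=1e9)
b = model.fit(XT, Y)
print("coeff", b.coef_)
print("intercept", b.intercept_)

Depending on your scikit-learn version, you can also pass penalty='none' (or penalty=None in recent releases) to remove the penalty outright rather than relying on a large C.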