How do you get the adjusted R-squared for the test data in statsmodels?

I have a dataset like

import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm
data = pd.DataFrame({'a':[4,3,4,6,6,3,2], 'b':[12,14,11,15,14,15,10]})
test = data.iloc[:4]
train = data.iloc[4:]

and I built the linear model for the train data

model = smf.ols("a ~ b", data = data)
print(model.fit().summary())

Now what I want to do is get the adjusted R^2 value based on the test data. Is there a simple command for this? I’ve been trying to build it from scratch and keep getting an error.

What I’ve been trying:

model.predict(test.b)

but it complains about the shape. Based on this: https://www.statsmodels.org/stable/examples/notebooks/generated/predict.html

I tried the following

X = sm.add_constant(test.b)
model.predict(X)

Now the error is

ValueError: shapes (200,2) and (200,2) not aligned: 2 (dim 1) != 200 (dim 0)

The shape matches but then there’s this thing I don’t understand about the “dim”. But I thought I matched as well as I could the example in the link so I’m just not sure what’s up.

Answer

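You should first run the .fit() method, save the returned results object, and then call .predict() on that object. The unfitted model's own .predict() expects the parameter vector as its first argument, so model.predict(X) treats X as the parameters and tries to multiply two arrays of the same shape, which is where the "not aligned" error comes from. Because the model was built with a formula, the results object adds the intercept itself; there is no need for sm.add_constant.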

results = model.fit()

Running results.params will produce this pandas Series:

Intercept   -0.875
b            0.375
dtype: float64

Then, running results.predict(test.b) will produce this Series:

0    3.625
1    4.375
2    3.250
3    4.750
dtype: float64

You can also retrieve individual values from the fit summary by reading attributes of the results object (https://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.OLSResults.html):

>>> results.rsquared_adj
0.08928571428571419

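But those attributes describe the data the model was fitted on (here the full data frame, since that is what was passed to smf.ols), not the test set. So yes, you will probably need to compute the residual and total sums of squares yourself from the test predictions and the true test values, and derive the adjusted R-squared from those.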
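As a rough sketch of that manual computation (the helper name adjusted_r2 and its argument names are my own, not part of statsmodels), the following uses the four test predictions shown above together with the true values of a for those rows. It assumes the usual definitions R² = 1 - SS_res/SS_tot and adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of test rows and p the number of predictors (1 here, just b).

```python
def r2(y_true, y_pred):
    """Plain R-squared: 1 - SS_res / SS_tot, computed on the given rows."""
    y_true = [float(v) for v in y_true]
    y_pred = [float(v) for v in y_pred]
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared prediction errors
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of y_true
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted R-squared; n_predictors excludes the intercept (hypothetical helper)."""
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more rows than predictors + 1")
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - n_predictors - 1)


# True values of `a` for the test rows (data.iloc[:4]) and the
# predictions from results.predict(test.b) listed in the answer.
y_true = [4, 3, 4, 6]
y_pred = [3.625, 4.375, 3.250, 4.750]

print(r2(y_true, y_pred))              # 0.125
print(adjusted_r2(y_true, y_pred, 1))  # -0.3125
```

With the fitted results object you could pass test.a and results.predict(test.b) instead of the hard-coded lists. An R-squared computed on held-out rows can be negative, and with only four rows the adjustment is large, so treat the number here as illustrative.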

User contributions licensed under: CC BY-SA