Skip to content
Advertisement

Python lmfit – how to calculate R squared?

This may be a stupid question, but I didn’t find an answer to it anywhere in lmfit’s documentation. My question is simple: how do I retrieve R squared? (I know I can calculate it manually with 1 - SS_res / SS_tot)

Update: I tried calculating R squared myself and compared it to the R squared from statsmodels. Parameters are the same in both estimations, but R squared is not.

Code:

from lmfit import minimize, Parameters
import numpy as np
import statsmodels.api as sm
import random


x = np.linspace(0, 15, 10)
x_ols = sm.add_constant(x)
y = [random.randint(0,15) for r in xrange(10)]

model = sm.OLS(y,x_ols)
results = model.fit()
print "OLS: ", format(results.params[0], '.5f'), format(results.params[1], '.5f'), "R^2: ", results.rsquared


# define objective function: returns the array to be minimized
def fcn2min(params, x, data):
    a = params['a'].value
    b = params['b'].value

    model = a + b * x
    return model - data

for i in range(0,1):
    # create a set of Parameters
    params = Parameters()
    params.add('a', value= i)
    params.add('b', value= 20)

    # do fit, here with leastsq model
    result = minimize(fcn2min, params, args=(x, y))

    yhat = params['a'].value + params['b'].value * x
    ybar = np.sum(y)/len(y)
    ssreg = np.sum((yhat-ybar)**2)   # or sum([ (yihat - ybar)**2 for yihat in yhat])
    sstot = np.sum((y - ybar)**2)    # or sum([ (yi - ybar)**2 for yi in y])
    r2 = ssreg / sstot

    print "lmfit: ", format(params['a'].value, '.5f'), format(params['b'].value, '.5f'), "R^2: ", r2

Advertisement

Answer

I don’t see an included rsquared in lmfit, but we can reuse either the residuals or the redchi

I am using a similar example where y contains additional noise

lmfit result (assuming the mean residual equals zero, which is always true for linear regression)

>>> 1 - result.residual.var() / np.var(y)
0.98132815639800652
>>> 1 - result.redchi / np.var(y, ddof=2)
0.9813281563980063

compared to OLS results:

>>> results.rsquared
0.98132815639800663

This is the definition of rsquared when we compare to a model with just an intercept and without weights.

The calculations for rsquared in statsmodels are adjusted for the case when the regression does not include an intercept, and they takes weights into account for weighted least squares.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement