
How to do linear regression, taking errorbars into account?

I am running a computer simulation of a physical system of finite size, and afterwards I extrapolate the results to infinite size (the thermodynamic limit). Theory says that the data should scale linearly with system size, so I am using linear regression.

The data I have are noisy, but I can estimate an error bar for each data point. So, for example, the data points look like this:


Let’s say I am trying to do this in Python.

  1. The first way I know is:


    As I understand it, this gives me error bars on the result, but it does not take the error bars of the input data into account.

  2. The second way I know is:


Here the inverse of each point's error bar is used as its weight in the least-squares fit. A point that is not very reliable therefore has little influence on the result, which is reasonable.
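For readers who want something concrete to run, here is a minimal sketch of the two approaches described above. The listings from the question are not shown here, so the choice of `np.polyfit` for both fits is an assumption, and the data are synthetic values invented for illustration (the names `x`, `y`, `e` are hypothetical).

```python
import numpy as np

# Synthetic stand-in data (NOT the values from the question):
# a straight line y = 0.04 + 0.27*x with a known noise level e for each point.
rng = np.random.default_rng(0)
x = np.linspace(0.15, 0.35, 8)
e = rng.uniform(0.003, 0.008, size=x.size)   # error bar of each point
y = 0.04 + 0.27 * x + rng.normal(0.0, e)     # noisy measurements

# Way 1: unweighted fit. cov=True returns the parameter covariance, estimated
# from the scatter of the residuals only; the error bars e are never used.
coef_plain, cov_plain = np.polyfit(x, y, 1, cov=True)
err_plain = np.sqrt(np.diag(cov_plain))      # std. errors of (slope, intercept)

# Way 2: weighted fit. For Gaussian errors np.polyfit expects w = 1/sigma,
# so noisy points pull less on the line. No uncertainties are requested here.
coef_weighted = np.polyfit(x, y, 1, w=1.0 / e)

print("unweighted slope, intercept:", coef_plain, "+/-", err_plain)
print("weighted   slope, intercept:", coef_weighted)
```

The first fit reports uncertainties but ignores `e`; the second uses `e` but reports only the coefficients, which is exactly the gap the question is about.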

But I cannot figure out how to get something that combines both of these methods.

What I really want is what the second method does, that is, a regression in which every point influences the result with a different weight. But at the same time I want to know how accurate my result is, that is, I want to know the error bars of the resulting coefficients.

How can I do this?


Answer

Not entirely sure if this is what you mean, but using pandas, statsmodels, and patsy, we can compare an ordinary least-squares fit with a weighted least-squares fit that uses the inverse of the noise you provided as a weight matrix (statsmodels will complain about sample sizes < 20, by the way).
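A minimal sketch of such a comparison is given below, under stated assumptions: the data are synthetic (invented for illustration, not the asker's values), and the names `df`, `noise`, and `ols_fit` are hypothetical. One caveat worth flagging: the statsmodels documentation describes `weights` as proportional to the inverse variance, so this sketch passes `1/noise**2` rather than `1/noise`.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (invented for illustration, not the asker's values).
rng = np.random.default_rng(1)
x = np.linspace(0.15, 0.35, 8)
noise = rng.uniform(0.003, 0.008, size=x.size)   # per-point error bars
y = 0.04 + 0.27 * x + rng.normal(0.0, noise)
df = pd.DataFrame({"x": x, "y": y})

# Ordinary least squares: every point counts equally.
ols_fit = smf.ols("y ~ x", data=df).fit()

# Weighted least squares. statsmodels documents `weights` as proportional to
# the inverse variance, so 1/noise**2 is used here (an assumption of this
# sketch; the answer's text mentions the inverse of the noise itself).
wls_fit = smf.wls("y ~ x", data=df, weights=1.0 / noise**2).fit()

# Coefficients and their standard errors for both fits.
print(ols_fit.params, ols_fit.bse, sep="\n")
print(wls_fit.params, wls_fit.bse, sep="\n")
```

Both results carry `params` and `bse`, so the weighted fit gives the coefficients together with their standard errors.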


OLS vs WLS

WLS residuals:


The mean squared error of the residuals for the weighted fit (wls_fit.mse_resid or wls_fit.scale) is 0.22964802498892287, and the r-squared value of the fit is 0.754.
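To make these two quantities less of a black box, here is a NumPy-only sketch of how a weighted mean squared residual and a weighted R-squared can be computed by hand. It uses synthetic data and inverse-variance weights (both assumptions of the sketch), and the formulas follow my understanding of how a weighted fit defines them, so the numbers will not match the ones quoted above.

```python
import numpy as np

# Synthetic stand-in data (invented values, for illustration only).
rng = np.random.default_rng(2)
x = np.linspace(0.15, 0.35, 8)
e = rng.uniform(0.003, 0.008, size=x.size)   # per-point error bars
y = 0.04 + 0.27 * x + rng.normal(0.0, e)

w = 1.0 / e**2                               # inverse-variance weights
X = np.column_stack([np.ones_like(x), x])    # design matrix: intercept, slope

# Weighted normal equations: (X^T W X) beta = X^T W y
XtWX = X.T @ (w[:, None] * X)
beta = np.linalg.solve(XtWX, X.T @ (w * y))

resid = y - X @ beta
dof = x.size - X.shape[1]                    # points minus fitted parameters

# Weighted mean squared residual (weighted sum of squares per degree of freedom).
mse_resid = np.sum(w * resid**2) / dof

# Weighted R^2: 1 - (weighted residual SS) / (weighted SS about the weighted mean).
y_bar = np.average(y, weights=w)
r_squared = 1.0 - np.sum(w * resid**2) / np.sum(w * (y - y_bar) ** 2)

# Standard errors of (intercept, slope), rescaled by the residual scatter.
beta_err = np.sqrt(mse_resid * np.diag(np.linalg.inv(XtWX)))

print("intercept, slope:", beta, "+/-", beta_err)
print("mse_resid:", mse_resid, "R^2:", r_squared)
```

Because the standard errors are multiplied by `mse_resid`, only the relative sizes of the weights matter in this formulation; multiplying all error bars by a constant leaves `beta_err` unchanged.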

You can obtain a wealth of data about the fits by calling their summary() method, and/or doing dir(wls_fit), if you need a list of every available property and method.

User contributions licensed under: CC BY-SA