How to implement a constrained linear fit in Python?

I’m trying to fit a linear model to a set of data, with the constraint that all of the residuals (model minus data) are positive; in other words, the model should be the “best overestimate” of the data. Without this constraint, a linear fit is easily obtained with numpy’s polyfit, as shown below.

import numpy as np
import matplotlib.pyplot as plt

x = [-4.12179107e-01, -1.40664082e-01, -5.52301563e-06,  1.82898473e-01]
y = [-4.14846251, -3.31607886, -3.57827245, -5.09914559]

plt.scatter(x, y)
coeff = np.polyfit(x, y, 1)  # unconstrained least-squares fit
plt.plot(x, np.polyval(coeff, x), c='r', label='numpy-polyval')
plt.plot(x, np.polyval([-2, -3.6], x), c='g', label='desired-fit')  # a rough guess of the desired result
plt.legend()
plt.show()

[Figure: the data points with the unconstrained numpy-polyval fit (red) and the rough desired fit (green)]

Is there an efficient way to implement a linear fit with this type of constraint?


Answer

This is a quadratic programming (QP) problem: minimize the usual least-squares objective subject to the linear constraint that the fitted values are greater than or equal to the data. There are several libraries (CVXOPT, quadprog, etc.) that can solve it. Here is an example using quadprog:

import numpy as np
import matplotlib.pyplot as plt
import quadprog

x = [-4.12179107e-01, -1.40664082e-01, -5.52301563e-06, 1.82898473e-01]
y = [-4.14846251, -3.31607886, -3.57827245, -5.09914559]

A = np.c_[x, np.ones(len(x))]  # design matrix for a degree-1 polynomial: columns [x, 1]
y = np.array(y)

# quadprog.solve_qp minimizes (1/2) w^T G w - a^T w  subject to  C^T w >= b
G = A.T @ A   # quadratic term of the least-squares objective
a = A.T @ y   # linear term of the least-squares objective
C = A.T       # C^T w = A w >= y enforces model >= data
b = y
coeffs = quadprog.solve_qp(G, a, C, b)[0]  # [slope, intercept]

plt.scatter(x, y)
plt.plot(x, np.polyval(coeffs, x), c='r')
plt.show()

This gives:

[Figure: the data points with the constrained regression line (red), which lies on or above every point]
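To see where G, a, C and b come from, expand the constrained least-squares objective:

(1/2) ||A w - y||^2 = (1/2) w^T (A^T A) w - (A^T y)^T w + (1/2) y^T y,   subject to   A w >= y.

Matching this against the form (1/2) w^T G w - a^T w that solve_qp minimizes gives G = A^T A and a = A^T y, while the constraint A w >= y is passed as C = A^T, b = y; the constant term (1/2) y^T y has no effect on the minimizer.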

See e.g. this post for more information; it describes, in particular, how to set up a linear regression problem as a quadratic programming problem.
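As a cross-check with one of the other libraries mentioned above, here is a minimal sketch of the same problem with CVXOPT. Its qp solver uses the convention minimize (1/2) w^T P w + q^T w subject to G w <= h, so the linear term and the constraint are negated relative to quadprog; the sketch assumes cvxopt is installed.

import numpy as np
from cvxopt import matrix, solvers

x = np.array([-4.12179107e-01, -1.40664082e-01, -5.52301563e-06, 1.82898473e-01])
y = np.array([-4.14846251, -3.31607886, -3.57827245, -5.09914559])

A = np.c_[x, np.ones(len(x))]

P = matrix(A.T @ A)      # quadratic term
q = matrix(-(A.T @ y))   # linear term (sign flipped relative to quadprog)
G = matrix(-A)           # A w >= y rewritten as -A w <= -y
h = matrix(-y)

solvers.options['show_progress'] = False  # suppress the solver's iteration log
sol = solvers.qp(P, q, G, h)
coeffs = np.array(sol['x']).ravel()       # [slope, intercept]
print(coeffs)

The resulting coefficients can be plotted with np.polyval exactly as in the quadprog example above, and should agree with it up to solver tolerance.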

As a side note, the optimal line will always touch at least one data point, but it need not pass through two of them. For example, take x = [-1., 0., 1.] and y = [1., 2., 1.], as in the quick check below.
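A quick check of that example, reusing the quadprog call from above:

import numpy as np
import quadprog

x = np.array([-1., 0., 1.])
y = np.array([1., 2., 1.])
A = np.c_[x, np.ones(len(x))]

coeffs = quadprog.solve_qp(A.T @ A, A.T @ y, A.T, y)[0]
print(coeffs)                      # approximately [0., 2.]: the horizontal line y = 2
print(np.polyval(coeffs, x) - y)   # residuals approximately [1., 0., 1.]

Only the middle residual is zero, so the optimal line touches a single data point.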
