I’m trying to fit a linear model to a set of data, with the constraint that all the residuals (model – data) are positive; in other words, the model should be the “best overestimate”. Without this constraint, a linear model can easily be found with NumPy’s polyfit, as shown below.
import numpy as np
import matplotlib.pyplot as plt

x = [-4.12179107e-01, -1.40664082e-01, -5.52301563e-06, 1.82898473e-01]
y = [-4.14846251, -3.31607886, -3.57827245, -5.09914559]

plt.scatter(x, y)
coeff = np.polyfit(x, y, 1)
plt.plot(x, np.polyval(coeff, x), c='r', label='numpy-polyfit')
plt.plot(x, np.polyval([-2, -3.6], x), c='g', label='desired-fit')  # a rough guess of the desired result
plt.legend()
Is there an efficient way to implement a linear fit with this type of constraint?
Answer
This is a quadratic programming problem. There are several libraries (CVXOPT, quadprog, etc.) that can be used to solve it. Here is an example using quadprog:
import numpy as np
import matplotlib.pyplot as plt
import quadprog

x = [-4.12179107e-01, -1.40664082e-01, -5.52301563e-06, 1.82898473e-01]
y = [-4.14846251, -3.31607886, -3.57827245, -5.09914559]

# Design matrix for a line: first column is x, second column is the intercept
A = np.c_[x, np.ones(len(x))]
y = np.array(y)

# quadprog solves: min (1/2) z^T G z - a^T z  subject to  C^T z >= b.
# With G = A^T A and a = A^T y this is least squares; C^T z >= b is A z >= y,
# i.e. all residuals (model - data) are nonnegative.
G = A.T @ A
a = A.T @ y
C = A.T
b = y

coeffs = quadprog.solve_qp(G, a, C, b)[0]

plt.scatter(x, y)
plt.plot(x, np.polyval(coeffs, x), c='r')
plt.show()
This gives the desired overestimating fit: the fitted line lies on or above every data point.
See e.g. this post for more information. It describes, in particular, how to set up a linear regression problem as a quadratic programming problem.
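If quadprog isn’t installed, the same constrained fit can be sketched with SciPy’s general-purpose SLSQP minimizer (this is an alternative I’m adding for illustration, not part of quadprog): minimize the sum of squared residuals subject to model(x_i) − y_i ≥ 0 for every point.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([-4.12179107e-01, -1.40664082e-01, -5.52301563e-06, 1.82898473e-01])
y = np.array([-4.14846251, -3.31607886, -3.57827245, -5.09914559])

def objective(p):
    # Sum of squared residuals of the line p[0]*x + p[1]
    return np.sum((np.polyval(p, x) - y) ** 2)

# Inequality constraint: all residuals (model - data) must be >= 0
constraints = [{'type': 'ineq', 'fun': lambda p: np.polyval(p, x) - y}]

# Start from the unconstrained polyfit solution
res = minimize(objective, x0=np.polyfit(x, y, 1),
               constraints=constraints, method='SLSQP')
coeffs = res.x
print(coeffs)
```

Note that quadprog exploits the quadratic structure directly, so it should be preferred when available; SLSQP is a generic solver and only converges to the same answer because the problem is convex.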
As a side note, the optimal line will always pass through at least one data point, but it need not pass through two such points. For example, take x = [-1., 0., 1.] and y = [1., 2., 1.].