How to find the regression line for multiple independent variables?

Question

I'm trying to understand how multiple linear regression works in code for machine learning. The issue I'm having is that I don't get how to set up my regression line properly, or whether my coefficients are correct. So I guess I can divide my thoughts into three questions. Is my method of finding the coefficients for the regression line

Accepted Answer

In order:

1. The method appears to be correct but rather long-winded. See below for a more compact alternative.

2. Not sure what you mean, but I think this:

```python
x1 = coeffs_multi[2]*np.linspace(0,120)
y1 = coeffs_multi[1]*np.linspace(0,120)
z1 = x1 + y1 + coeffs_multi[0]
```

is not quite correct. The coefficients in `coeffs_multi_reversed` are in the order dictated by `X`, namely 'constant', 'Weight', 'Volume'. In `coeffs_multi` they are then 'Volume', 'Weight', 'constant', so the above are in the wrong order.

3. For the plot I would not do `x1`, `y1` etc. but simply plot actual vs predicted by the model, like so:

```python
...
predicted = np.array(A) @ coeffs_multi_reversed
ax.scatter(x, y, z, label = 'actual')
ax.scatter(x, y, predicted, label = 'predicted')
...
```

The graph then shows the actual and predicted points in the same 3D scatter.

A much more standard way to do regression is as follows:

```python
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# x_cars and y_cars are the features and target from the question's code
lin_regr = LinearRegression()
lin_res = lin_regr.fit(x_cars, y_cars)
predicted = lin_regr.predict(x_cars)
print(lin_res.coef_, lin_res.intercept_)

plt.plot(predicted, y_cars, '.', label = 'actual vs predicted')
plt.plot(predicted, predicted, '.', label = 'predicted  vs predicted')
plt.legend(loc = 'best')
plt.show()
```

This prints

```
[0.00755095 0.00780526] 79.69471929115937
```

and plots the actual values against the predicted ones.

Edit: plotting 3D grid

To plot predicted output on a grid, you can do something like

```python
npts = 20
from mpl_toolkits import mplot3d

fig = plt.figure()
ax = plt.axes(projection='3d')
x = x_cars['Weight']
y = x_cars['Volume']
# z holds the actual target values, as in the question's code
ax.scatter(x, y, z, label = 'actual')

# regular grid spanning the observed range of both features
x1 = np.linspace(x.min(), x.max(), npts)
y1 = np.linspace(y.min(), y.max(), npts)
x1m,y1m = np.meshgrid(x1,y1)
z1 = lin_regr.predict(np.hstack([x1m.reshape(-1,1),y1m.reshape(-1,1)]))
ax.scatter(x1m.reshape(-1,1), y1m.reshape(-1,1), z1, marker='.', s=1, label = 'predicted')

ax.set_xlabel('x - Weight')
ax.set_ylabel('y - Volume')
ax.set_zlabel('z - $CO_2$')
ax.set_title('$CO_2$ emission')
plt.legend(loc = 'best')
plt.show()
```

This draws the model's predictions as a grid of small points alongside the actual data.
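To make the coefficient-ordering point concrete, here is a minimal NumPy-only sketch. It does not use the cars data from the question: the `weight`, `volume` and `co2` arrays are synthetic stand-ins, and `true_coeffs` is a made-up set of values chosen only for illustration. The idea is that the least-squares coefficients come back in the same order as the columns of the design matrix, so each coefficient has to be paired with its own column.

```python
import numpy as np

# Synthetic stand-in data (hypothetical; NOT the cars dataset from the question).
rng = np.random.default_rng(0)
weight = rng.uniform(800, 1800, size=50)
volume = rng.uniform(900, 2500, size=50)
true_coeffs = np.array([80.0, 0.0075, 0.0078])  # constant, Weight, Volume
co2 = true_coeffs[0] + true_coeffs[1] * weight + true_coeffs[2] * volume

# Design matrix with columns in the order: constant, Weight, Volume.
A = np.column_stack([np.ones_like(weight), weight, volume])

# Least-squares fit; coefficients are returned in the column order of A.
coeffs, *_ = np.linalg.lstsq(A, co2, rcond=None)
constant, b_weight, b_volume = coeffs

# Correct predictions: each coefficient multiplies its own column.
predicted = A @ coeffs

# Reversing the coefficients without reversing the columns pairs each
# coefficient with the wrong variable and gives different predictions.
wrong = A @ coeffs[::-1]
```

Because the synthetic target has no noise, `coeffs` should recover `true_coeffs` up to floating-point error, while `wrong` should be far from `co2`. With real data the fit will not be exact, but the ordering rule is the same.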
