How to find the regression line for multiple independent variables?

Question

I'm trying to understand how Multiple Linear Regression works in code for machine learning. The issue I'm having is that I don't get how to set up my regression line properly or if my coefficients are correct. So I guess I can divide my thoughts into three questions. Is my method of findin…

Accepted Answer

In order:

1. The method appears to be correct but rather long-winded. See below for a more compact alternative.

2. Not sure what you mean, but I think this:

        x1 = coeffs_multi[2]*np.linspace(0,120)
        y1 = coeffs_multi[1]*np.linspace(0,120)
        z1 = x1 + y1 + coeffs_multi[0]

   is not quite correct. The coefficients in coeffs_multi_reversed are in the order dictated by X, namely 'constant', 'Weight', 'Volume'. In coeffs_multi they are then 'Volume', 'Weight', 'constant', so the above are in the wrong order.

3. For the plot I would not do x1, y1 etc. but simply plot actual vs predicted by the model, like so:

        ...
        predicted = np.array(A) @ coeffs_multi_reversed
        ax.scatter(x, y, z, label = 'actual')
        ax.scatter(x, y, predicted, label = 'predicted')
        ...

   This puts the actual and the predicted points in the same 3D axes.

A much more standard way to do regression is as follows:

    from sklearn.linear_model import LinearRegression

    lin_regr = LinearRegression()
    lin_res = lin_regr.fit(x_cars, y_cars)
    predicted = lin_regr.predict(x_cars)
    print(lin_res.coef_, lin_res.intercept_)
    plt.plot(predicted, y_cars, '.', label = 'actual vs predicted')
    plt.plot(predicted, predicted, '.', label = 'predicted  vs predicted')
    plt.legend(loc = 'best')
    plt.show()

This prints

    [0.00755095 0.00780526] 79.69471929115937

and plots the actual values against the predicted ones.

Edit: plotting 3D grid

To plot predicted output on a grid, you can do something like

    npts = 20
    from mpl_toolkits import mplot3d

    fig = plt.figure()
    ax = plt.axes(projection='3d')
    x = x_cars['Weight']
    y = x_cars['Volume']
    ax.scatter(x, y, z, label = 'actual')
    x1 = np.linspace(x.min(), x.max(), npts)
    y1 = np.linspace(y.min(), y.max(), npts)
    x1m,y1m = np.meshgrid(x1,y1)
    z1 = lin_regr.predict(np.hstack([x1m.reshape(-1,1),y1m.reshape(-1,1)]))
    ax.scatter(x1m.reshape(-1,1), y1m.reshape(-1,1), z1, '.', s=1, label = 'predicted')
    ax.set_xlabel('x - Weight')
    ax.set_ylabel('y - Volume')
    ax.set_zlabel('z - $CO_2$')
    ax.set_title('$CO_2$ emission')
    plt.legend(loc = 'best')
    plt.show()

Here the predicted values are drawn as a grid of small points spanning the Weight and Volume ranges, alongside the actual data.
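To make the coefficient-ordering point concrete, here is a minimal NumPy-only sketch. It is not the asker's code: the data is synthetic, the "true" coefficients are illustrative values, and the names coeffs_const_first and coeffs_const_last are hypothetical stand-ins for coeffs_multi_reversed and coeffs_multi. It only assumes that the design matrix has its columns in the order constant, Weight, Volume.

```python
import numpy as np

# Synthetic stand-in data (an assumption; the original cars dataset is not reproduced here).
rng = np.random.default_rng(0)
weight = rng.uniform(800, 1800, size=30)
volume = rng.uniform(900, 2500, size=30)

# Illustrative coefficients in the order: constant, Weight, Volume.
true_coeffs = np.array([80.0, 0.0075, 0.0078])
co2 = true_coeffs[0] + true_coeffs[1] * weight + true_coeffs[2] * volume

# Design matrix with columns in the order: constant, Weight, Volume.
A = np.column_stack([np.ones_like(weight), weight, volume])

# Least-squares fit; the coefficients come back in the same order as the columns of A.
coeffs_const_first, *_ = np.linalg.lstsq(A, co2, rcond=None)

# Reversing the array gives the order: Volume, Weight, constant.
coeffs_const_last = coeffs_const_first[::-1]

# Predictions straight from the design matrix, so no index bookkeeping is needed.
predicted = A @ coeffs_const_first

# Equivalent manual evaluation: each coefficient must be paired with its own variable.
manual = (coeffs_const_last[2]              # constant
          + coeffs_const_last[1] * weight   # Weight coefficient
          + coeffs_const_last[0] * volume)  # Volume coefficient
```

Computing predictions as A @ coeffs avoids the indexing mistake entirely, because the column order of A and the coefficient order always agree.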
