
How to create Predicted vs. Actual plot using abline_plot and statsmodels

I am trying to recreate this plot from this website in Python instead of R: (image not preserved)

Background

I have a dataframe called boston (the popular educational Boston housing dataset).

I fit a multiple linear regression model on some of the variables using the statsmodels API, as shown below. Everything works.

(code block not preserved)

I then created a dataframe of the actual values from the boston dataset and the predicted values from the linear regression model above.

(code block not preserved)

This is where I get stuck. When I try to plot the regression line on top of the scatterplot, I get the error below.

(error message not preserved)

Here are the independent variables I used in the linear regression, if that helps:

(variable list not preserved)


Answer

That R plot actually shows predicted ~ actual, but your Python code passes the medv ~ ... model into abline_plot.

To recreate the R plot in Python:

  • either use statsmodels to manually fit a new predicted ~ actual model for abline_plot
  • or use seaborn.regplot to do it automatically

Using statsmodels

If you want to plot this manually, fit a new predicted ~ actual model and pass that model into abline_plot. Then, generate the confidence band using the summary_frame of the prediction results.
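A minimal sketch of the manual approach, not the original answer's code. The synthetic data, the column names `actual` and `predicted`, and the variable names `line_fit`, `grid`, and `band` are assumptions made for illustration. In practice the dataframe would hold the observed medv values and the fitted values from the existing model.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from statsmodels.graphics.regressionplots import abline_plot

# Synthetic stand-in for the actual/predicted dataframe (assumed column names)
rng = np.random.default_rng(0)
actual = rng.uniform(5, 50, size=200)
df = pd.DataFrame({
    "actual": actual,
    "predicted": 3 + 0.85 * actual + rng.normal(0, 4, size=200),
})

# Fit a new model of predicted on actual; this is the line to draw
line_fit = smf.ols("predicted ~ actual", data=df).fit()

fig, ax = plt.subplots()
ax.scatter(df["actual"], df["predicted"], s=10, alpha=0.6)

# abline_plot reads the intercept and slope from the two-parameter model
abline_plot(model_results=line_fit, ax=ax, color="red")

# 95% confidence band for the mean, evaluated on an evenly spaced grid
grid = pd.DataFrame(
    {"actual": np.linspace(df["actual"].min(), df["actual"].max(), 100)}
)
band = line_fit.get_prediction(grid).summary_frame(alpha=0.05)
ax.fill_between(grid["actual"], band["mean_ci_lower"], band["mean_ci_upper"],
                color="red", alpha=0.2)

ax.set_xlabel("actual")
ax.set_ylabel("predicted")
```

Call plt.show() afterwards to display the figure. Note that abline_plot only works here because the new model has exactly two parameters (intercept and slope).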


As an alternative to abline_plot, you can use matplotlib's built-in axline, passing it the intercept and slope extracted from the model's params.
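A sketch of the axline variant under the same assumptions as before: synthetic data, with hypothetical column names `actual` and `predicted`. It requires matplotlib 3.3 or later, which is when Axes.axline was added.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

# Synthetic stand-in for the actual/predicted dataframe (assumed column names)
rng = np.random.default_rng(0)
actual = rng.uniform(5, 50, size=200)
df = pd.DataFrame({
    "actual": actual,
    "predicted": 3 + 0.85 * actual + rng.normal(0, 4, size=200),
})

line_fit = smf.ols("predicted ~ actual", data=df).fit()

# The formula API labels the constant term "Intercept"
intercept = line_fit.params["Intercept"]
slope = line_fit.params["actual"]

fig, ax = plt.subplots()
ax.scatter(df["actual"], df["predicted"], s=10, alpha=0.6)

# Infinite line through (0, intercept) with the fitted slope
ax.axline(xy1=(0, intercept), slope=slope, color="red")
```

Unlike a line drawn between two fixed x values, axline extends across the full width of the axes, even after the limits change.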



Using seaborn

Note that it is much simpler to let seaborn.regplot handle this automatically.
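A sketch of the seaborn version, again on synthetic data with assumed column names `actual` and `predicted`. regplot draws the scatter, fits the linear regression, and shades a bootstrapped confidence interval in one call.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in for the actual/predicted dataframe (assumed column names)
rng = np.random.default_rng(0)
actual = rng.uniform(5, 50, size=200)
df = pd.DataFrame({
    "actual": actual,
    "predicted": 3 + 0.85 * actual + rng.normal(0, 4, size=200),
})

fig, ax = plt.subplots()

# ci=95 is the default; it is spelled out here only for clarity
sns.regplot(data=df, x="actual", y="predicted", ci=95, ax=ax,
            scatter_kws={"s": 10, "alpha": 0.6},
            line_kws={"color": "red"})
```

The band is estimated by bootstrapping, so it can differ slightly from the analytic band that statsmodels reports.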


With seaborn, it is also trivial to plot a polynomial fit via the order parameter.
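A sketch of a polynomial fit, using synthetic data with deliberate curvature so that the second-order term is visible. The data and column names are assumptions. With order greater than 1, regplot fits the polynomial with numpy.polyfit.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic, mildly curved data (assumed column names)
rng = np.random.default_rng(0)
actual = rng.uniform(5, 50, size=200)
df = pd.DataFrame({
    "actual": actual,
    "predicted": 10 + 0.2 * actual + 0.012 * actual**2
                 + rng.normal(0, 3, size=200),
})

fig, ax = plt.subplots()

# order=2 requests a quadratic fit instead of a straight line
sns.regplot(data=df, x="actual", y="predicted", order=2, ax=ax,
            scatter_kws={"s": 10, "alpha": 0.6},
            line_kws={"color": "red"})
```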


User contributions licensed under: CC BY-SA