PatsyError when using statsmodels for regression

I’m using ols in statsmodels to run a regression. Once I run the regressions on each row of my dataframe, I want to retrieve the X variables from patsy that are used in those regressions. But I get an error that I just can’t seem to understand.

Edit: I am trying to run a regression as presented in the answer here, but I want to run it across each row of a grouped version of my dataframe df, where it is grouped by Date, bal, dist, pay_hist, inc, bckts. So I first group the data as described above and then try to run the regression on each row, with df grouped by Date: df.groupby(['Date']).apply(ols_coef,'bal ~ C(dist) + C(pay_hist) + C(inc) + C(bckts)')

My code is as follows:


I get the following error and am unsure how to solve it:


Answer

The problem is that you’re passing a grouped dataframe into the patsy.dmatrices function, which expects a plain dataframe. Since the grouped dataframe is iterable, you can loop over the groups, call patsy.dmatrices on each one, and store the resulting X dataframes (one per group) in a dictionary.
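A minimal sketch of that loop, assuming a dataframe with the columns named in the question's formula. The toy data and the dictionary name X_by_date are made up for illustration and are not taken from the original post:

```python
import pandas as pd
from patsy import dmatrices

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "Date": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-02"],
    "bal": [100.0, 150.0, 120.0, 90.0, 200.0, 160.0],
    "dist": ["a", "b", "a", "b", "a", "b"],
    "pay_hist": ["good", "bad", "bad", "good", "good", "bad"],
    "inc": ["low", "high", "low", "high", "low", "low"],
    "bckts": ["x", "x", "y", "y", "x", "y"],
})

formula = "bal ~ C(dist) + C(pay_hist) + C(inc) + C(bckts)"

# Iterating over a GroupBy yields (key, sub-dataframe) pairs.
# Each sub-dataframe is an ordinary DataFrame, which dmatrices accepts.
X_by_date = {}
for date, group in df.groupby("Date"):
    y, X = dmatrices(formula, data=group, return_type="dataframe")
    X_by_date[date] = X

# Inspect the design matrix built for one group.
print(X_by_date["2020-01"])
```

Each value in X_by_date is the design matrix patsy built for that date, so it can be inspected on its own or passed to a model fit for the same group.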

User contributions licensed under: CC BY-SA