This may be a dumb question, but I’ve searched through the PyMC3 docs and forums and can’t seem to find the answer. I’m trying to create a linear regression model from a dataset that I know a priori should not have an intercept. Currently my implementation looks like this:
formula = 'Y ~ ' + ' + '.join(['X1', 'X2'])

# Define data to be used in the model
X = df[['X1', 'X2']]
Y = df['Y']

# Context for the model
with pm.Model() as model:
    # set distribution for priors
    priors = {'X1': pm.Wald.dist(mu=0.01),
              'X2': pm.Wald.dist(mu=0.01)}
    family = pm.glm.families.Normal()
    # Creating the model requires a formula and data
    pm.GLM.from_formula(formula, data=X, family=family, priors=priors)
    # Perform Markov Chain Monte Carlo sampling
    trace = pm.sample(draws=4000, cores=2, tune=1000)
As I said, I know I shouldn’t have an intercept but I can’t seem to find a way to tell GLM.from_formula() to not look for one. Do you all have a solution? Thanks in advance!
Answer
I’m actually puzzled that it does run with an intercept, since the default in the code for GLM.from_formula is to pass intercept=False to the constructor. Maybe it’s because the patsy parser defaults to adding an intercept?
Either way, one can explicitly include or exclude an intercept via the patsy formula, namely with 1 or 0, respectively. That is, you want:
formula = 'Y ~ 0 + ' + ' + '.join(['X1', 'X2'])
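If you want to check what the formula does before handing it to PyMC3, you can ask patsy directly for the design matrix columns. The snippet below is only a sketch: the DataFrame values are made up for illustration, and it calls patsy on its own rather than going through GLM.from_formula, so it shows how patsy parses the formula and nothing about PyMC3's internals.

```python
import pandas as pd
import patsy

# Toy data: the column names mirror the question, the values are invented.
df = pd.DataFrame({
    'X1': [1.0, 2.0, 3.0, 4.0],
    'X2': [0.5, 0.1, 0.9, 0.3],
    'Y': [1.2, 2.1, 3.3, 4.0],
})

predictors = ['X1', 'X2']

# Default formula: patsy adds an implicit intercept column.
_, X_default = patsy.dmatrices('Y ~ ' + ' + '.join(predictors), df)

# Leading 0 term: the intercept column is left out.
_, X_no_intercept = patsy.dmatrices('Y ~ 0 + ' + ' + '.join(predictors), df)

default_columns = X_default.design_info.column_names
no_intercept_columns = X_no_intercept.design_info.column_names

print(default_columns)        # ['Intercept', 'X1', 'X2']
print(no_intercept_columns)   # ['X1', 'X2']
```

Writing the formula as 'Y ~ X1 + X2 - 1' is an equivalent patsy spelling for dropping the intercept, if you prefer that style.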