Code:
import numpy as np
import pandas as pd
import statsmodels.api as sm

sacramento = pd.read_csv("sacramento.csv")

# Predictors and response; baths still holds the raw bath counts here
X = sacramento[["beds", "sqft", "price"]]
Y = sacramento["baths"]

# Add the intercept column
X = sm.add_constant(X)

# The ValueError is raised on this line
model = sm.Logit(Y, X).fit()
predictions = model.predict(X)

print_model = model.summary()
print(print_model)

print(model.params.round(2))
print(model.pvalues.round(2))
print('The smallest p-value is for sqft')
The problem I have is with this instruction: “You will need to create a new variable from baths, and it should make it such that those observations of 1 bath correspond to a value of 0, and those with more than 1 bath correspond to a 1.”
I really do not know how to do that. I know that using baths as it is causes ValueError: endog must be in the unit interval.
Link to the csv file: https://drive.google.com/file/d/1A3LQ2vZ9IUkv_2HkqP8c2sCQGAvdII-r/view?usp=sharing
Answer
Can you try this?
sacramento["baths"] = sacramento["baths"].apply(lambda x: 0 if x == 1 else 1)
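Since the instruction asks for a new variable, one alternative is to leave baths untouched and put the 0/1 values in a separate column. The following is only a sketch of that idea: the helper name add_multi_bath_flag, the column name multi_bath, and the sample rows are made up for illustration and are not from the assignment or from sacramento.csv.

```python
import pandas as pd


def add_multi_bath_flag(df: pd.DataFrame, source: str = "baths",
                        target: str = "multi_bath") -> pd.DataFrame:
    """Return a copy of df with a 0/1 column: 1 if source > 1, else 0.

    Hypothetical helper; the names are assumptions, not part of the assignment.
    """
    out = df.copy()
    # The comparison yields booleans; astype(int) turns them into 0 and 1.
    out[target] = (out[source] > 1).astype(int)
    return out


# Made-up rows for illustration only, not taken from sacramento.csv.
sample = pd.DataFrame({"baths": [1, 2, 3, 1], "beds": [2, 3, 4, 2]})
flagged = add_multi_bath_flag(sample)
print(flagged[["baths", "multi_bath"]])
```

With the real data, something like Y = add_multi_bath_flag(sacramento)["multi_bath"] could then be passed to sm.Logit(Y, X) in place of the raw baths column, since every value is 0 or 1. Note one difference from the lambda above: this version codes rows with fewer than 1 bath or a missing value as 0, while the lambda codes anything that is not exactly 1 as 1. Which behaviour is wanted depends on what the data actually contains.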