I can't figure out why `.read_csv` doesn't produce a DataFrame that `.shape` recognizes

Following a machine learning guide here: https://www.pluralsight.com/guides/scikit-machine-learning/

I'm running Python 3.8. I have a hunch that I need to run it in IPython, but I think that opens up a new can of worms.

I also have all of the imported libraries installed.

I left %matplotlib inline as a comment because I’m not running it in Jupyter.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

df = pd.read_csv("diabetes.csv")
print(pd.shape)
df.describe()

y = df[diabetes.csv].values
x = df.drop('diabetes', axis=1).values

X_train. X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4, random_state=42)
X_train.shape, X_test.shape
((460,8), (308,8))

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

y_pred = logreg.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

plt.show()

The error I get when running this code is:

Traceback (most recent call last):
  File "scikitmlprac.py", line 12, in <module>
    print(pd.shape)
  File "C:\Python38\lib\site-packages\pandas\__init__.py", line 258, in __getattr__
    raise AttributeError(f"module 'pandas' has no attribute '{name}'")
AttributeError: module 'pandas' has no attribute 'shape'

Answer

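Seems like you just have a typo here. You’ve tried to print pd.shape when it should be df.shape. pd is the pandas module itself, which has no shape attribute, while df is the DataFrame that read_csv returned. Hence the error. It has nothing to do with IPython or with read_csv failing.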
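
To make the difference concrete, here is a minimal sketch of the loading step with that fix applied. It assumes a small in-memory stand-in for diabetes.csv so it runs without the file; the feature columns "glucose" and "bmi" are made up for illustration, and only the "diabetes" label column comes from your script.

```python
import io

import pandas as pd

# Hypothetical stand-in for diabetes.csv so the snippet runs without the file;
# the feature columns "glucose" and "bmi" are invented for illustration.
csv_text = """glucose,bmi,diabetes
148,33.6,1
85,26.6,0
183,23.3,1
89,28.1,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# pd is the pandas module, which has no shape attribute;
# df is the DataFrame returned by read_csv, which does.
print(df.shape)

# Select the label column by its name as a string,
# then drop that column to get the feature matrix.
y = df["diabetes"].values
X = df.drop("diabetes", axis=1).values

print(X.shape, y.shape)
```

Once that line is fixed, a few other slips in the posted script would likely surface next: `df[diabetes.csv]` needs to be `df['diabetes']`, the features are assigned to lowercase `x` but used as `X`, and `X_train. X_test` needs a comma instead of a period. The listing as posted also has no `import pandas as pd`, although your traceback suggests the real file does.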
