How do I access clf.best_params_ after fitting a pipeline? With the code below, I get:
AttributeError: 'GridSearchCV' object has no attribute 'best_params_'
Here is my code:
from sklearn.datasets import make_classification
import numpy as np
import matplotlib.pyplot as plt  # needed for plt.subplots below
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(20, 8))

# Generate noisy Data
num_trainsamples = 500
num_testsamples = 50
X_train, y_train = make_classification(n_samples=num_trainsamples,
                                       n_features=240,
                                       n_informative=9,
                                       n_redundant=0,
                                       n_repeated=0,
                                       n_classes=10,
                                       n_clusters_per_class=1,
                                       class_sep=9,
                                       flip_y=0.2,
                                       #weights=[0.5,0.5],
                                       random_state=17)

X_test, y_test = make_classification(n_samples=50,
                                     n_features=num_testsamples,
                                     n_informative=9,
                                     n_redundant=0,
                                     n_repeated=0,
                                     n_classes=10,
                                     n_clusters_per_class=1,
                                     class_sep=10,
                                     flip_y=0.2,
                                     #weights=[0.5,0.5],
                                     random_state=17)

from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

pipe = Pipeline([('scaler', StandardScaler()),
                 ('pca', PCA(n_components=0.95)),
                 ('clf', RandomForestClassifier())])

# Declare a hyperparameter grid
parameter_space = {
    'clf__n_estimators': [10, 50, 100],
    'clf__criterion': ['gini', 'entropy'],
    'clf__max_depth': np.linspace(10, 50, 11),
}

clf = GridSearchCV(pipe, parameter_space, cv=5, scoring="accuracy", verbose=True)

# model
pipe.fit(X_train, y_train)
print(f'Best Parameters: {clf.best_params_}')
Answer
Your clf is never fitted: the code calls pipe.fit(X_train,y_train), which fits only the pipeline. You probably meant clf.fit(X_train,y_train).
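A minimal sketch of that fix, assuming a smaller synthetic data set and a reduced grid so it runs quickly. The names search and param_grid and all sizes here are illustrative, not taken from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic data set (sizes are assumptions, chosen to keep this fast)
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=17)

pipe = Pipeline([('scaler', StandardScaler()),
                 ('clf', RandomForestClassifier(random_state=17))])

# Reduced, illustrative grid; step-name prefixes match the pipeline above
param_grid = {'clf__n_estimators': [10, 50],
              'clf__max_depth': [5, 10]}

search = GridSearchCV(pipe, param_grid, cv=3, scoring='accuracy')

# Fit the search object itself, not the bare pipeline
search.fit(X, y)

# best_params_ exists only after search.fit() has run
print(f'Best Parameters: {search.best_params_}')
```

Fitting the GridSearchCV object also refits the best pipeline on the full training data by default (refit=True), so search.predict(...) can be used directly afterwards.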
Also, np.linspace(10,50,11) yields floats, while max_depth expects an int, so the search may fail once it is actually fitted. You should probably add a type cast there (like np.linspace(10,50,11).astype('int')) or use something like np.arange() instead.
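A quick way to see the difference, as a sketch using only NumPy. The variable names are illustrative.

```python
import numpy as np

# linspace returns floating-point values, even when they are whole numbers
depths_float = np.linspace(10, 50, 11)

# Option 1: cast the linspace result to integers
depths_cast = np.linspace(10, 50, 11).astype(int)

# Option 2: build the same values directly as integers (stop is exclusive)
depths_range = np.arange(10, 51, 4)

print(depths_float.dtype)      # a floating-point dtype
print(depths_cast.tolist())
print(depths_range.tolist())
```

Either integer array can then be used as the 'clf__max_depth' entry of the grid.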
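You should likely also fix your test set, which currently has no relation to the training set: it comes from a separate make_classification call with n_features=num_testsamples (50 features instead of 240), so a model fitted on X_train cannot even be applied to it. Last but not least, PCA is not guaranteed to be useful for classification (see e.g. https://www.csd.uwo.ca/~oveksler/Courses/CS434a_541a/Lecture8.pdf).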
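One possible way to get a test set that matches the training data is to generate a single data set and split it. This is only a sketch: using train_test_split with stratification is an assumption about what you want, and the 500/50 split simply mirrors the sizes in the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate all 550 samples from the same distribution
X, y = make_classification(n_samples=550, n_features=240, n_informative=9,
                           n_redundant=0, n_repeated=0, n_classes=10,
                           n_clusters_per_class=1, class_sep=9, flip_y=0.2,
                           random_state=17)

# Hold out 50 samples for testing, keeping class proportions similar
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, stratify=y, random_state=17)

print(X_train.shape, X_test.shape)
```

Both splits now share the same 240 features and the same class structure, so scores on X_test are meaningful for a model fitted on X_train.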