Pipeline with SMOTE and Imputer Errors

I am trying to create a pipeline that first imputes missing data, then oversamples with SMOTE, and then fits the model.

My code worked perfectly before I added SMOTE, and now I can't find a solution.

Here is the code without SMOTE:

from sklearn.impute import SimpleImputer
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline

scoring = ['balanced_accuracy', 'f1_macro']
imputer = SimpleImputer(strategy='most_frequent')
pipeline = Pipeline(steps=[('i', imputer), ('m', model)])
# define model evaluation
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
scores = cross_validate(pipeline, X, y, scoring=scoring, cv=cv, n_jobs=-1)

And here is the code after adding SMOTE. Note: I also tried importing make_pipeline from imblearn.

imputer = SimpleImputer(strategy='most_frequent')
pipeline = Pipeline(steps=[('i', imputer),('over', SMOTE()),('m', model)])
# define model evaluation
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
scores = cross_validate(pipeline, X, y, scoring=scoring, cv=cv, n_jobs=-1)

When I import Pipeline from sklearn, I get this error:

All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough' 'SMOTE()' (type <class 'imblearn.over_sampling._smote.base.SMOTE'>) doesn't

When I tried importing make_pipeline from imblearn, I got this error:

Last step of Pipeline should implement fit or be the string 'passthrough'. '[('i', SimpleImputer(strategy='most_frequent')), ('over', SMOTE()), ('m', RandomForestClassifier())]' (type <class 'list'>) doesn't

Answer

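Use the imblearn pipeline. scikit-learn's Pipeline requires every intermediate step to implement transform, but SMOTE is a sampler that implements fit_resample instead. imblearn's Pipeline accepts samplers as intermediate steps: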

from imblearn.pipeline import Pipeline 
pipeline = Pipeline([('i', imputer),('over', SMOTE()),('m', model)])

Answer