
How to write code for a 5-fold Cross Validation?

I have code that splits a data set dfXa of size 351 by 14 into 10 folds, choosing one fold for validation (dfX_val, of size 35 by 14) and the remaining 9 folds for training (dfX_train, of size 316 by 14).

But how do I do this for 5-fold CV? I want to implement 5-fold CV without using sklearn.

Answer

You can use cross_val_score from the scikit-learn library, as mentioned here.

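from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score

# m (the number of clusters) is assumed to be defined earlier
estimator = KMeans(n_clusters=m, random_state=0)
scores = cross_val_score(estimator, X_train, y_train, scoring='accuracy', cv=5)  # one score per fold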

To get the features and labels, i.e., the X_train and y_train values, you can do:

X_train = df.iloc[:, 1:].values
y_train = df.iloc[:, 0].values

where df is your dataframe of size 351 by 14. I am assuming here that the first column of your data frame holds the labels, as is usual in such tasks.
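
Since the question asks for a version without sklearn, the split can also be done by hand. What follows is only a minimal sketch under a few assumptions: the helper name k_fold_indices and the fixed seed are made up for illustration, and the rows are shuffled once before being cut into 5 nearly equal folds. It uses only the Python standard library and produces row positions rather than touching the data itself.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, val_idx) lists of row positions for k-fold CV.

    Hypothetical helper for illustration; not part of any library.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly

    # The first (n_rows % k) folds get one extra row,
    # so fold sizes differ by at most one.
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


# 351 rows, as in the question
folds = list(k_fold_indices(351, k=5))
fold_sizes = [len(val_idx) for _, val_idx in folds]
print(fold_sizes)  # [71, 70, 70, 70, 70]
```

With the dataframe from the question, each pair of index lists could then be applied as dfX_train = dfXa.iloc[train_idx] and dfX_val = dfXa.iloc[val_idx] inside a loop over the folds, fitting the model on the training part and scoring it on the validation part. Because 351 is not divisible by 5, one fold has 71 rows and the other four have 70.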

User contributions licensed under: CC BY-SA