SVM working well on test subset fails on whole dataset

I trained an SVM iteratively on large chunks of data using sklearn. Each CSV file is a part of an image; I created them with a sliding-window approach. I used partial_fit() for fitting the SVM as well as …

Training on multiple data sets with scikit.mlpregressor

I’m currently training my first neural network on a larger dataset. I have split my training data into several .npy binary files, each containing batches of 20k training samples. I’m loading the …

I’m trying to learn scikit but am stuck on code that fails with “encoders require their input to be uniformly string or number”

I have been learning Python from YouTube videos and am just a beginner. I saw this code in a video and tried it, but I’m getting an error that I don’t know how to solve. This is the following …

Why does Python’s scikit-learn K-Means text clustering algorithm always give a different result?

I have a list of documents and this class to perform actions on that list. So, basically, morphed_documents is a list of strings. And at the end, the algorithm returns the cluster for each document. …

K-Fold cross validation for Lasso and Ridge models

I’m working with the Boston housing dataset from sklearn.datasets and have run ridge and lasso regressions on my data (post train/test split). I’m now trying to perform k-fold cross validation to find …

How to change plot legends with roc_auc_score?

I’m plotting a ROC curve with scikit-learn’s plot_roc_curve, which prints the plot legend automatically. Is there a way to change it? metrics.plot_roc_curve(classifier, X_test, y_test, ax=plt.gca())

AttributeError: ‘numpy.ndarray’ object has no attribute ‘score’ error

I have looked for the problem but can’t see anything wrong here. What could it be? This is an attempt at binary classification with an SVM on the Fashion-MNIST dataset, but only classifying 5 …

Isolation forest with multiple features detecting everything as an anomaly

I have an isolation forest implementation where I take the features (all are numerical) and scale them to be between 0 and 1: from sklearn.preprocessing import MinMaxScaler scaler = MinMaxScaler() data = …

Do machine learning algorithms read data top-down or bottom-up?

I’m new to machine learning and I’m a bit confused about how data is read during the training/testing process. Assuming my data is date-based and I want the model to read the later dates first …

xlearn predictions give a different MSE than the one output by the function

The xlearn predict function reports a different MSE than the one you get by taking the predictions and calculating it yourself. Here is code to do this; you can run it by cloning the xlearn repository …