Tag: scikit-learn

Sklearn NearestNeighbors (Mahalanobis) – too many arguments?

I’m using scikit-learn’s NearestNeighbors with the Mahalanobis distance. d1 and d2 are both NumPy arrays of 2-element lists of numbers. I’ve used almost this exact code in the past, but today I’m getting an error. Any tips on how to resolve this would be greatly appreciated! Thanks! Answer: Change ‘V’ to ‘VI’; maybe this helps.
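The question's code is not included in the excerpt, so the following is only a minimal sketch of what passing ‘VI’ can look like. The arrays are random stand-ins for d1 and d2, and the choice of algorithm="brute" and n_neighbors=3 is an assumption made for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-ins for d1 and d2: arrays of 2-element rows.
rng = np.random.default_rng(0)
d1 = rng.normal(size=(50, 2))
d2 = rng.normal(size=(5, 2))

# 'VI' is the inverse of the covariance matrix of the reference data.
VI = np.linalg.inv(np.cov(d1, rowvar=False))

nn = NearestNeighbors(
    n_neighbors=3,
    algorithm="brute",
    metric="mahalanobis",
    metric_params={"VI": VI},
)
nn.fit(d1)

# For each row of d2, the 3 closest rows of d1 and their distances.
distances, indices = nn.kneighbors(d2)
```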

Modifying a data frame containing NaN values so that I don’t get a not-a-number error on division

machine-learning python scikit-learn

I have a pandas data frame with some NaN values, which I have replaced with “”. Now one of my functions performs a division on this data. Since I have replaced the NaN values with “”, I am getting an error. What would be the right way to fill the NaN values so that I can get rid of this particular error? Answer: You could filter
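Both the error and the answer are cut off in the excerpt, so this is an assumed illustration rather than the original answer. It supposes the division fails because the column holds “” strings, and shows two alternatives: filtering out missing rows before dividing, or leaving NaN in place. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical columns; the original frame is not shown in the excerpt.
df = pd.DataFrame({"total": [10.0, 20.0, 30.0], "count": [2.0, np.nan, 5.0]})

# Option 1: filter out rows whose divisor is missing, then divide.
valid = df[df["count"].notna()]
ratio_filtered = valid["total"] / valid["count"]

# Option 2: keep NaN as NaN; the division yields NaN for that row
# instead of failing on an empty string.
ratio_nan = df["total"] / df["count"]
```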

How do I load a dataframe in Python sklearn?

dataframe pandas python scikit-learn

I did some computations in an IPython Notebook and ended up with a dataframe df which isn’t saved anywhere yet. In the same IPython Notebook, I want to work with this dataframe using sklearn. df is a dataframe with 4 columns: id (string), value (int), rated (bool), score (float). I am trying to determine what influences the score the most just like in
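An in-memory dataframe does not need to be saved or loaded; its columns can be passed to an estimator directly. The sketch below uses made-up rows with the four columns described, and RandomForestRegressor is an assumed choice for ranking influence, since the example the question refers to is cut off.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Made-up rows matching the described columns.
df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e", "f"],
    "value": [1, 5, 3, 8, 2, 7],
    "rated": [True, False, True, True, False, False],
    "score": [0.1, 0.6, 0.3, 0.9, 0.2, 0.8],
})

# 'id' is an identifier, not a feature, so it is left out.
X = df[["value", "rated"]].astype(float)
y = df["score"]

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# One importance per feature column, labelled by column name.
importances = pd.Series(model.feature_importances_, index=X.columns)
```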

Keras: Does model.predict() require normalized data if I train the model with normalized data?

keras machine-learning python scikit-learn tensorflow

After completing model training using Keras, I am trying to use Keras’ model.predict() to test the model on novel inputs. When I trained the model, I normalized my training data with scikit-learn’s MinMaxScaler(). Do I need to normalize the data as well when using model.predict()? If so, how do I do it? Answer: Yes, you do. Because
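A minimal sketch of the idea: the scaler is fitted once on the training data and the same fitted scaler is reused to transform new inputs before prediction. The arrays are made up, and the Keras call is shown only as a comment because no model is defined here.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up training data and one new sample.
X_train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
X_new = np.array([[2.5, 25.0]])

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only
X_new_scaled = scaler.transform(X_new)          # reuse the training min/max

# With a trained Keras model (not defined here) one would then call:
# predictions = model.predict(X_new_scaled)
```

Calling fit_transform again on the new data would rescale it with different minima and maxima, so the model would see inputs on a different scale than it was trained on.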

How to print KMeans initial parameters?

k-means python python-3.x scikit-learn

I am using PyCharm to run KMeans on the Iris data. When I run this, it simply prints KMeans(), but I would like it to print the full list of parameters. How can this be accomplished? Answer: Simply call kmeans.get_params() and print the result. It returns the parameters (default or custom) used when instantiating the estimator, in dictionary format.
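A short sketch of both routes, assuming a KMeans instance with n_clusters=3 (the question's own code is not shown). get_params() returns a dictionary, and sklearn.set_config(print_changed_only=False) makes the estimator's printed form list every parameter rather than only the non-default ones.

```python
from sklearn import set_config
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3)

# Dictionary of all constructor parameters, defaults included.
params = kmeans.get_params()
print(params)

# Alternatively, make the repr show every parameter.
set_config(print_changed_only=False)
print(kmeans)
```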

Standardizing a set of columns in a pandas dataframe with sklearn

pandas python scikit-learn standardized

I have a table with four columns: CustomerID, Recency, Frequency and Revenue. I need to standardize (scale) the columns Recency, Frequency and Revenue and keep the column CustomerID. I used some code for this, but the result is a table without the column CustomerID. Is there any way to get a table with the corresponding CustomerID and the scaled columns? Answer: fit_transform
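One way to keep CustomerID is to scale only the three numeric columns and assign the result back into a copy of the frame. This is a sketch with made-up rows, not the code from the truncated answer.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up rows with the four columns from the question.
df = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104],
    "Recency": [10.0, 20.0, 30.0, 40.0],
    "Frequency": [1.0, 3.0, 5.0, 7.0],
    "Revenue": [100.0, 250.0, 400.0, 550.0],
})

cols = ["Recency", "Frequency", "Revenue"]

# Scale only the selected columns; CustomerID is left untouched.
scaled = df.copy()
scaled[cols] = StandardScaler().fit_transform(df[cols])
```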

Sklearn ROC AUC Score : ValueError: y should be a 1d array, got an array of shape (15, 2) instead

python scikit-learn

I have a dataset with the target LULUS; it’s an imbalanced dataset. I’m trying to print the ROC AUC score for each fold of my data, but every fold raises ValueError: y should be a 1d array, got an array of shape (15, 2) instead. I’m kind of confused about which part I did
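The excerpt does not show the code, so the cause cannot be confirmed. One frequent source of this message in a binary problem is passing the full two-column output of predict_proba to roc_auc_score; the shape (15, 2) would be consistent with that. The sketch below assumes this cause and uses synthetic data and LogisticRegression as stand-ins.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic imbalanced binary data standing in for the real dataset.
X, y = make_classification(n_samples=150, weights=[0.8, 0.2], random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    proba = clf.predict_proba(X[test_idx])  # shape (n_samples, 2)
    # Pass only the positive-class column, a 1-D array of scores.
    scores.append(roc_auc_score(y[test_idx], proba[:, 1]))

print(scores)
```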

SVM working well on test subset fails on whole dataset

machine-learning python scikit-learn

I trained an SVM iteratively on large chunks of data using sklearn. Each CSV file is a part of an image; I made those with a sliding-window approach. I used partial_fit() to fit the SVM as well as the scaler. The features are the RGBN values of an image, and I want to classify the image into two different groups
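The question does not say which estimator was used. scikit-learn's SVC has no partial_fit, so this sketch assumes SGDClassifier with hinge loss as an incrementally trained linear SVM. The random chunks stand in for the CSV files. Fitting the scaler on every chunk before transforming any of them is a design choice shown for illustration, not a diagnosis of the reported problem.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def make_chunk(n=200):
    """Hypothetical stand-in for one CSV chunk of RGBN features and labels."""
    X = rng.uniform(0, 255, size=(n, 4))
    y = (X[:, 3] > X[:, 0]).astype(int)
    return X, y

chunks = [make_chunk() for _ in range(5)]

# Pass 1: let the scaler see every chunk before anything is transformed.
scaler = StandardScaler()
for X, _ in chunks:
    scaler.partial_fit(X)

# Pass 2: train the linear SVM (hinge loss) incrementally on scaled chunks.
clf = SGDClassifier(loss="hinge", random_state=0)
for X, y in chunks:
    clf.partial_fit(scaler.transform(X), y, classes=np.array([0, 1]))

# Evaluate on a fresh chunk, scaled with the same fitted scaler.
X_test, y_test = make_chunk()
accuracy = clf.score(scaler.transform(X_test), y_test)
```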

Training on multiple data sets with scikit.mlpregressor

machine-learning python scikit-learn

I’m currently training my first neural network on a larger dataset. I have split my training data into several .npy binary files, each containing batches of 20k training samples. I load the data from the .npy files, apply some simple pre-processing operations, and then train my network by calling the partial_fit method several times in a loop.
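The loop described above might look roughly like the following sketch. The load_batch helper is hypothetical and generates random data where the real code would load a .npy file; the network size and number of passes are arbitrary assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def load_batch(n=500):
    """Hypothetical stand-in for loading and pre-processing one .npy file."""
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    return X, y

batches = [load_batch() for _ in range(4)]

model = MLPRegressor(hidden_layer_sizes=(16,), random_state=0)

# Several passes over all batches; each partial_fit call updates the
# existing weights instead of starting from scratch.
for epoch in range(3):
    for X, y in batches:
        model.partial_fit(X, y)

X_check, _ = load_batch(10)
predictions = model.predict(X_check)
```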

Plot confusion matrix with Keras data generator using sklearn

keras python scikit-learn tensorflow2.0

Sklearn clearly defines how to plot a confusion matrix using its own classification model with plot_confusion_matrix. But what about using it with a Keras model using data generators? Let’s have a look at some example code. First we need to train the model. Once the model is trained, let’s build a confusion matrix. This works fine so far. But
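Since the example code is missing from the excerpt, here is a minimal sketch of the general approach: take the true labels and the predicted class probabilities, reduce the probabilities to class indices, and pass both to sklearn's confusion_matrix. The arrays are made-up stand-ins for what a Keras generator and model.predict() would supply, and the generator is assumed not to shuffle so that labels and predictions stay aligned.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up stand-ins: true labels from the generator and the per-class
# probabilities a Keras model.predict(...) call would return.
y_true = np.array([0, 0, 1, 1, 2, 2])
probabilities = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.2, 0.2],
])

# Most probable class per sample.
y_pred = probabilities.argmax(axis=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

To draw the matrix, sklearn.metrics.ConfusionMatrixDisplay(cm).plot() can be used when matplotlib is installed; plot_confusion_matrix itself was removed in scikit-learn 1.2.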