
Tag: scikit-learn

How to use `Dirichlet Process Gaussian Mixture Model` in Scikit-learn? (n_components?)

My understanding of “an infinite mixture model with the Dirichlet Process as a prior distribution on the number of clusters” is that the number of clusters is determined by the data, as the model converges to a certain number of clusters. This R implementation, https://github.com/jacobian1980/ecostates, decides on the number of clusters in this way. Although, the R implementation uses a Gibbs
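A minimal sketch of the idea in the question, assuming a scikit-learn version that provides `BayesianGaussianMixture` (swapped in here for the `DPGMM` class the title refers to). With a Dirichlet process prior, `n_components` acts only as an upper bound, and the fit can shrink the weights of components it does not need. The toy data and the 0.01 weight threshold are illustrative assumptions, not part of the original question.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import BayesianGaussianMixture

# Toy data: three blobs (an assumption for illustration).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

# n_components is only an upper bound under the Dirichlet process prior.
model = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
)
model.fit(X)

# Count components that carry non-negligible weight (threshold is arbitrary).
effective = int(np.sum(model.weights_ > 0.01))
print("components with weight > 0.01:", effective)
```

Note that this uses variational inference, not a Gibbs sampler, so results need not match a sampler-based implementation.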

label-encoder encoding missing values

I am using the label encoder to convert categorical data into numeric values. How does LabelEncoder handle missing values? Output: For the above example, label encoder changed NaN values to a category. How would I know which category represents missing values? Answer: Don’t use LabelEncoder with missing values. I don’t know which version of scikit-learn you’re using, but in 0.17.1
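One way to keep missing values identifiable, sketched here with `pandas.factorize` instead of `LabelEncoder` (a swapped-in technique, not what the answer above necessarily goes on to recommend): by default it assigns the sentinel code -1 to NaN, so the missing entries never get mixed in with the real categories. The `colors` series is made-up example data.

```python
import numpy as np
import pandas as pd

# Made-up categorical column with one missing value.
colors = pd.Series(["red", np.nan, "blue", "red"])

# factorize returns integer codes plus the unique non-missing values;
# NaN is encoded with the sentinel -1 by default.
codes, uniques = pd.factorize(colors)

print(codes.tolist())   # [0, -1, 1, 0]
print(list(uniques))    # ['red', 'blue']
```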

Backpropagation with Momentum using Scikit-Learn

I’m trying to use Scikit-Learn’s neural network to classify my dataset using backpropagation with momentum. I need to specify these parameters: hidden neurons, hidden layers, training set, learning rate and momentum. I found MLPClassifier in the sklearn.neural_network package. The problem is that this package is part of Scikit-learn v0.18, which is a dev version. Is there a way I could
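A rough sketch of how the parameters listed in the question map onto `MLPClassifier`, assuming a scikit-learn release that ships it (0.18 or later). The synthetic dataset and the specific values (layer sizes, learning rate, momentum) are placeholders for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic training set (placeholder for the asker's own data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = MLPClassifier(
    hidden_layer_sizes=(8, 8),  # two hidden layers with 8 neurons each
    solver="sgd",               # the momentum setting only applies to SGD
    learning_rate_init=0.01,    # initial learning rate
    momentum=0.9,               # momentum for the gradient descent update
    max_iter=500,
    random_state=0,
)
clf.fit(X, y)

print("training accuracy:", clf.score(X, y))
```

The number of hidden layers is implied by the length of the `hidden_layer_sizes` tuple, so there is no separate parameter for it.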

How to use sklearn fit_transform with pandas and return dataframe instead of numpy array?

I want to apply scaling (using StandardScaler() from sklearn.preprocessing) to a pandas dataframe. The following code returns a numpy array, so I lose all the column names and indices. This is not what I want. A “solution” I found online is: It appears to work, but leads to a DeprecationWarning: /usr/lib/python3.5/site-packages/sklearn/preprocessing/data.py:583: DeprecationWarning: Passing 1d arrays as data is deprecated in
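A small sketch of one common approach, which may differ from the answers on the original question: scale the whole frame at once (a 2-D input, so no 1-D array is passed) and wrap the resulting array back into a DataFrame with the original columns and index. The frame `df` is made-up example data.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up frame with named columns and a non-default index.
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]},
    index=["x", "y", "z"],
)

scaler = StandardScaler()

# fit_transform returns a numpy array; rebuild the DataFrame around it
# so the column names and index labels are preserved.
scaled = pd.DataFrame(
    scaler.fit_transform(df),
    columns=df.columns,
    index=df.index,
)
print(scaled)
```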
