I’m trying to figure out how to use RFE for regression problems, and I was reading some tutorials. I found an example on how to use RFECV to automatically select the ideal number of features, and it goes something like: which I find pretty straightforward. However, I was checking how to do the same thin…
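A minimal sketch of RFECV on a regression problem, for orientation only. The synthetic data from `make_regression`, the `LinearRegression` estimator, and the `neg_mean_squared_error` scoring are assumptions chosen for illustration; they are not taken from the tutorial the question refers to.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

# Synthetic regression data (illustrative assumption): 10 features, 3 informative
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

# RFECV drops one feature per step and keeps the feature count
# with the best cross-validated score
selector = RFECV(
    estimator=LinearRegression(),
    step=1,
    cv=5,
    scoring="neg_mean_squared_error",
)
selector.fit(X, y)

n_selected = selector.n_features_  # number of features kept
mask = selector.support_           # boolean mask over the original columns
X_reduced = selector.transform(X)  # data restricted to the kept features
```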
Tag: scikit-learn
Pipeline with count and tfidf vectorizer produces TypeError: expected string or bytes-like object
I have a corpus like the following ‘C C C 0 0 0 X 0 1 0 0 0 0’, ‘C C C 0 0 0 X 0 1 0 0 0 0’, ‘C C C 0 0 0 X 0 1 0 0 0 0’, ‘X X X’, ‘X X X’, ‘X X X’, I would like to use
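One common cause of this error, assuming the pipeline chains `CountVectorizer` directly into `TfidfVectorizer`: the second vectorizer expects raw strings but receives a sparse count matrix. A possible fix is to use `TfidfTransformer` as the second step. The `token_pattern` below is also an assumption, added because the default pattern ignores single-character tokens such as `C`, `0` and `X`.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

corpus = [
    "C C C 0 0 0 X 0 1 0 0 0 0",
    "C C C 0 0 0 X 0 1 0 0 0 0",
    "C C C 0 0 0 X 0 1 0 0 0 0",
    "X X X",
    "X X X",
    "X X X",
]

pipe = Pipeline([
    # Keep single-character tokens and preserve case (illustrative choices)
    ("count", CountVectorizer(token_pattern=r"\S+", lowercase=False)),
    # TfidfTransformer accepts the count matrix produced by the previous step
    ("tfidf", TfidfTransformer()),
])

tfidf_matrix = pipe.fit_transform(corpus)
vocabulary = sorted(pipe.named_steps["count"].vocabulary_)
```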
How can I add Gaussian noise to my moons dataset with a standard deviation of 0.2 in Python?
I have a make_moons dataset, generated by scikit-learn: X, y = make_moons(n_samples=120) How can I add Gaussian noise with a standard deviation of 0.2 to this dataset in Python? Answer You can just pass that value to the make_moons function as noise. noise : double or None (default=None) Standard deviation of …
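The answer above can be sketched as follows. `shuffle=False` and `random_state=0` are assumptions added here only so that the clean and noisy samples line up and the run is reproducible; they are not part of the original question.

```python
import numpy as np
from sklearn.datasets import make_moons

# Noise-free moons, kept in generation order so the two calls are comparable
X_clean, y_clean = make_moons(n_samples=120, noise=None, shuffle=False)

# Same points with Gaussian noise of standard deviation 0.2 added
X_noisy, y_noisy = make_moons(
    n_samples=120, noise=0.2, shuffle=False, random_state=0
)

# Empirical standard deviation of the added noise, which should be close to 0.2
noise_std = np.std(X_noisy - X_clean)
```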
SGDRegressor() constantly not increasing validation performance
The fit of my SGDRegressor won’t improve or degrade its performance on the validation set (test) after around 20’000 training records. Even if I try to switch penalty, early_stopping (True/False) or set alpha, eta0 to extremely high or low levels, there is no change in the behavior of the “stuck…
Why does sklearn MinMaxScaler() return an out-of-range value instead of an error?
When I use sklearn MinMaxScaler(), I noticed some interesting behavior, which is shown in the following code. When I transform the test_data with the fitted MinMaxScaler(), it returns values beyond the defined range (0 – 1). Now, I intentionally made the test_data fall outside the value range of…
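A small sketch of the behavior described, using made-up numbers. `MinMaxScaler` learns the minimum and maximum from the data it was fitted on and then applies the same linear map to any input, so values outside the training range land outside [0, 1]. The `clip=True` option shown at the end is available in scikit-learn 0.24 and later.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative data: the training range is [0, 10]
train_data = np.array([[0.0], [5.0], [10.0]])
test_data = np.array([[-5.0], [15.0]])  # deliberately outside the training range

scaler = MinMaxScaler().fit(train_data)

# transform() applies (x - train_min) / (train_max - train_min) without checks
scaled = scaler.transform(test_data)

# Optional: clip transformed values to the feature range
clipped = MinMaxScaler(clip=True).fit(train_data).transform(test_data)
```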
plotting a 3d graph of a regressor made with sklearn
I have been using this tutorial to learn decision tree learning, and am now trying to understand how it works with higher dimensional datasets. Currently my regressor predicts a Z value for an (x,y) pair that you pass to it. I want to use a 3d graph to visualise it, but I have struggled with the way regressor…
I can’t find why `.read_csv` cannot make a dataframe for `.shape` to recognize
Following a machine learning guide here: https://www.pluralsight.com/guides/scikit-machine-learning/ Running Python 3.8. I have a hunch that I need to run it in IPython, but I think that opens up a new can of worms. All of the imported libraries are installed. I left %matplotlib inline as a comment be…
Fix parameters of Gaussian mixture model, instead of learning
Let us say I have a dataset data that I use to fit a Gaussian mixture model: I now store the learnt covariances fit_model.covariances_, means fit_model.means_ and weights fit_model.weights_. From a different script, I want to read in the learnt parameters and define a Gaussian mixture model using them. How do…
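One possible approach, sketched under assumptions: create an unfitted `GaussianMixture` and assign the stored parameters to its fitted attributes. This relies on the fitted attributes being writable and on `covariance_type="full"`; the attribute `precisions_cholesky_` is also needed for prediction and is recomputed here from the covariances. The synthetic `data` array stands in for the question's dataset.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in data (illustrative): two well-separated 2-D blobs
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(-3, 1, (100, 2)), rng.normal(3, 1, (100, 2))])

fit_model = GaussianMixture(
    n_components=2, covariance_type="full", random_state=0
).fit(data)

# Parameters as they would be saved and reloaded in a different script
weights = fit_model.weights_
means = fit_model.means_
covariances = fit_model.covariances_

# Rebuild a model from the stored parameters without calling fit()
fixed_model = GaussianMixture(n_components=len(weights), covariance_type="full")
fixed_model.weights_ = weights
fixed_model.means_ = means
fixed_model.covariances_ = covariances
# A Cholesky factor of each precision matrix, required by predict/score_samples
fixed_model.precisions_cholesky_ = np.linalg.cholesky(np.linalg.inv(covariances))

labels = fixed_model.predict(data)
log_probs = fixed_model.score_samples(data)
```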
How many epochs does scikit learn use when cross validating?
I’m doing some model cross validation with scikit learn on time series data where a Multi Layer Perceptron is trained with Keras. (We are able to use cross_val_score from scikit learn thanks to the keras wrapper). Basically using: The issue is I don’t understand how many epochs it’s using on each t…
Does it make sense? If yes then how to handle in MSE?
Can we apply a log transform to one variable and a square-root transform to another for LinearRegression? If yes, then what should we do when computing the MSE? Should I exp or square the y_test and prediction? Answer If you transform variables in training and test sets you don’t need to care about your evaluation metric. In case you transform yo…
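A hedged sketch of one way to handle this, using synthetic data. Different transforms on different features need no special treatment at evaluation time, as long as the same transforms are applied to training and test inputs. The example also assumes the target is log-transformed, which goes beyond the visible part of the answer: in that case predictions are mapped back with `exp` before computing the MSE, so the error is reported in the original units.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): positive features and target
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(200, 2))
y = np.exp(
    0.3 * np.log(X[:, 0]) + 0.2 * np.sqrt(X[:, 1]) + rng.normal(0, 0.05, 200)
)

# Log transform on the first feature, square root on the second
X_t = np.column_stack([np.log(X[:, 0]), np.sqrt(X[:, 1])])
X_train, X_test, y_train, y_test = train_test_split(X_t, y, random_state=0)

# Fit on the log of the target
model = LinearRegression().fit(X_train, np.log(y_train))

# Invert the target transform before scoring against the untransformed y_test
pred = np.exp(model.predict(X_test))
mse = mean_squared_error(y_test, pred)
```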