
Tag: scikit-learn

Add features to the “numeric” dataset whose categorical value must be mapped using a conversion formula

I have this dataset. This is the request: “Add the Mjob and Fjob attributes to the “numeric” dataset, whose categorical values must be mapped using a conversion formula of your choice.” Does anyone know how to do it? For example: if the ‘at_home’ value becomes ‘1’ in Mjob, I want the same result in the Fjob column. Same categorical values must …
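A minimal sketch of one way to do this with pandas: build a single mapping dictionary and apply it to both columns, so identical categories always receive identical codes. The category names come from the usual student-performance dataset, and the numeric codes are an arbitrary illustrative choice of “conversion formula”.

```python
import pandas as pd

# Small illustrative frame; in practice this would be the original dataset.
df = pd.DataFrame({
    "Mjob": ["at_home", "teacher", "services"],
    "Fjob": ["teacher", "at_home", "other"],
})

# One shared dictionary guarantees the same category gets the same code
# in both columns, e.g. "at_home" -> 1 in Mjob and in Fjob alike.
# The codes themselves are an arbitrary choice.
job_mapping = {"at_home": 1, "health": 2, "other": 3, "services": 4, "teacher": 5}
for col in ("Mjob", "Fjob"):
    df[col] = df[col].map(job_mapping)

print(df)
```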

Missing categorical data should be encoded with an all-zero one-hot vector

I am working on a machine learning project with very sparsely labeled data. There are several categorical features, resulting in roughly one hundred different classes across the features. For example: after I put these through scikit-learn’s OneHotEncoder, I am expecting the missing data to be encoded as 00, since the docs state that handle_unknown='ignore' causes the encoder to return an …
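A minimal sketch of the documented behaviour: with handle_unknown='ignore', any category not seen during fit is transformed to an all-zero row. Note this only covers missing values if the missing marker was absent from the training data the encoder was fitted on.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Fit on two known categories, "a" and "b".
X_train = np.array([["a"], ["b"]])
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(X_train)

# "c" was never seen during fit, so it maps to the all-zero vector [0, 0].
print(enc.transform(np.array([["a"], ["c"]])).toarray())
# [[1. 0.]
#  [0. 0.]]
```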

How to specify Search Space in Auto-Sklearn

I know how to specify feature-selection methods and the list of algorithms used in Auto-Sklearn 2.0. I know that Auto-Sklearn uses Bayesian optimisation (SMAC), but I would like to specify the hyperparameters in Auto-Sklearn myself. For example, I want to specify random_forest with Estimator = 1000 only, or MLP with HiddenLayerSize = 100 only. How do I do that?
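A hedged sketch of the part of this that Auto-Sklearn supports directly: the include argument restricts the search space to chosen algorithms and preprocessors (this dict form assumes a reasonably recent auto-sklearn release; older ones used include_estimators/include_preprocessors instead). Pinning an individual hyperparameter such as the number of estimators generally requires registering a custom component and is not shown here.

```python
from autosklearn.classification import AutoSklearnClassifier

# Restrict the search space to random forests with no feature preprocessing.
# SMAC still tunes the remaining hyperparameters within that space.
automl = AutoSklearnClassifier(
    time_left_for_this_task=300,  # seconds for the whole search
    include={
        "classifier": ["random_forest"],
        "feature_preprocessor": ["no_preprocessing"],
    },
)
# automl.fit(X_train, y_train)
```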

Gaussian Process Regression: tune hyperparameters based on validation set

In the standard scikit-learn implementation of Gaussian process regression (GPR), the hyperparameters (of the kernel) are chosen based on the training set. Is there an easy-to-use implementation of GPR (in Python) where the kernel hyperparameters are chosen based on a separate validation set? Cross-validation would also be a nice alternative to find suitable hyperparameters (that are …
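A sketch of one way to get this with plain scikit-learn: pass optimizer=None so the GPR does not maximise the marginal likelihood on the training set, then select the kernel hyperparameters by grid search. The kernels and grid values below are illustrative. For a single fixed validation set rather than cross-validation, PredefinedSplit can replace cv=5.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import GridSearchCV, PredefinedSplit

# Toy data; in practice this would be the real training set.
rng = np.random.default_rng(0)
X = rng.random((100, 1))
y = np.sin(6 * X).ravel() + 0.1 * rng.standard_normal(100)

# optimizer=None keeps each candidate kernel fixed, so the search below
# is the only place hyperparameters get tuned.
gpr = GaussianProcessRegressor(optimizer=None)
param_grid = {
    "kernel": [RBF(length_scale=l) for l in (0.1, 1.0, 10.0)],
    "alpha": [1e-10, 1e-2],
}

# cv=5 gives plain cross-validation; to score on one fixed validation set
# instead, pass cv=PredefinedSplit(test_fold) with -1 marking training rows.
search = GridSearchCV(gpr, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```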
