
Tag: scikit-learn

How to build a custom scaler based on StandardScaler?

I am trying to build a custom scaler that scales only the continuous variables in a dataset (the US Adult Income: https://www.kaggle.com/uciml/adult-census-income), using StandardScaler as a base. Here is the Python code I used: However, when I tried to run the scaler, I ran into this problem: So what is the error in how I built the scaler? And
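The asker's code and error message are not shown in this excerpt, so the following is only a minimal sketch of one common way to wrap StandardScaler so that it touches a chosen subset of columns. The class name ContinuousScaler, the column names, and the toy data are all hypothetical and are not taken from the question.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler


class ContinuousScaler(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper: standardize only the listed columns."""

    def __init__(self, columns):
        # Store the argument unchanged so get_params/clone keep working.
        self.columns = columns

    def fit(self, X, y=None):
        # Learn mean and std from the selected columns only.
        self.scaler_ = StandardScaler().fit(X[self.columns])
        return self

    def transform(self, X):
        X = X.copy()
        scaled = self.scaler_.transform(X[self.columns])
        # Write the scaled values back column by column; other columns stay as-is.
        for i, col in enumerate(self.columns):
            X[col] = scaled[:, i]
        return X


# Toy data (made up); "sex" stands in for a categorical column.
df = pd.DataFrame({
    "age": [25.0, 38.0, 52.0, 46.0],
    "hours_per_week": [40.0, 50.0, 60.0, 30.0],
    "sex": ["Male", "Female", "Female", "Male"],
})
scaled_df = ContinuousScaler(columns=["age", "hours_per_week"]).fit_transform(df)
print(scaled_df)
```

If a custom class is not required, sklearn.compose.ColumnTransformer can route only the continuous columns through a plain StandardScaler.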

Cross-validation with time series data in sklearn

I have a question about cross-validation of time series data in general. The problem is macro forecasting, e.g. forecasting the 1-month-ahead price of the S&P500 using different monthly macro variables. Now I read about the following approach: One should/could use a rolling cross-validation approach. I.e. always drop an old monthly value and add a new one (=
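One way to sketch the rolling window described above is scikit-learn's TimeSeriesSplit with max_train_size, which caps the training window so the oldest observation drops out as each new one is added. The 24 placeholder observations and the 12-month window are assumptions for illustration, and the test_size parameter assumes scikit-learn 0.24 or later.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 24 placeholder monthly observations (made-up data, one feature).
X = np.arange(24).reshape(-1, 1)

# Train on at most the 12 most recent months, test on the following month.
tscv = TimeSeriesSplit(n_splits=5, max_train_size=12, test_size=1)
splits = list(tscv.split(X))

for train_idx, test_idx in splits:
    # Each fold shifts the window forward by one month.
    print("train:", train_idx[0], "to", train_idx[-1], "| test:", test_idx)
```

Every training index precedes its test index, so no future information leaks into the fit.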

__init__() got an unexpected keyword argument ‘handle_unknown’

I’m trying to ordinal-encode my categorical features using sklearn, but I get the error __init__() got an unexpected keyword argument ‘handle_unknown’ when I run the code below: Sample data to reproduce the error: Could someone please tell me what’s wrong in my code? Answer: You are most likely not using an appropriate version of scikit-learn. handle_unknown and unknown_value
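The asker's code is not shown here, so this is a minimal sketch with made-up category values. OrdinalEncoder accepts handle_unknown and unknown_value from scikit-learn 0.24 onward, so checking the installed version is a reasonable first step.

```python
import numpy as np
import sklearn
from sklearn.preprocessing import OrdinalEncoder

# Older releases raise the "unexpected keyword argument" error shown above.
print(sklearn.__version__)

# Unknown categories seen at transform time are mapped to -1.
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
enc.fit(np.array([["red"], ["green"], ["blue"]], dtype=object))

# "purple" was not seen during fit.
encoded = enc.transform(np.array([["green"], ["purple"]], dtype=object))
print(encoded)
```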

Micro metrics vs macro metrics

To test the results of my multi-label classification model, I measured the Precision, Recall and F1 scores. I wanted to compare two different results, Micro and Macro. I have a dataset with few rows, but my label count is around 1700. Why is the macro so low even though I get a high result in micro, which one would be
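A small made-up example can show how the two averages diverge. Micro averaging pools true and false positives over all labels, so frequent labels dominate. Macro averaging takes the unweighted mean of per-label scores, so every rare label that is never predicted correctly pulls the result down. The matrices below are toy data, not the asker's.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multi-label indicator matrices: 6 samples, 4 labels (made up).
# Label 0 is frequent; labels 1 to 3 each appear once.
y_true = np.array([
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
])
# The model predicts only the frequent label.
y_pred = np.array([[1, 0, 0, 0]] * 6)

# zero_division=0 scores labels with no predictions as 0 without warnings.
micro = f1_score(y_true, y_pred, average="micro", zero_division=0)
macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(micro, macro)  # micro = 0.8, macro = 0.25
```

With around 1700 labels and few rows, many labels are likely to have little or no support, which is the situation where macro scores drop well below micro scores.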

scikit preprocessing across entire dataframe

I have a dataframe: The data is the average response to the same question asked across 4 quarters. I am trying to create a benchmark index from this data. To do so, I wanted to preprocess it first by either standardizing or normalizing it. How would I standardize/normalize across the entire dataframe? What is the best way to go about this?
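The dataframe itself is not shown in this excerpt, so the sketch below uses placeholder values and column names. It assumes "across the entire dataframe" means a single global mean and standard deviation (or a single global min and max) for all cells, rather than the per-column statistics that scikit-learn's scalers compute.

```python
import pandas as pd

# Placeholder quarterly averages (made-up values and column names).
df = pd.DataFrame({
    "Q1": [3.2, 4.1, 2.8],
    "Q2": [3.5, 4.0, 3.0],
    "Q3": [3.1, 4.3, 2.9],
    "Q4": [3.6, 4.2, 3.3],
})

values = df.to_numpy()

# Standardize: one mean and one (population) std for every cell.
standardized = (df - values.mean()) / values.std()

# Normalize: rescale every cell to [0, 1] using the global min and max.
normalized = (df - values.min()) / (values.max() - values.min())

print(standardized)
print(normalized)
```

A global transform keeps differences between quarters comparable, which a per-column scaler would remove.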

Installing scipy and scikit-learn on apple m1

Installing the following packages on the M1 chip works fine for me: Numpy 1.21.1, pandas 1.3.0, torch 1.9.0 and a few others. They also seem to work properly when I test them. However, when I try to install scipy or scikit-learn via pip this error appears: ERROR: Failed building wheel for numpy Failed to build numpy ERROR: Could

cannot import name ‘stop_words’ from ‘sklearn.feature_extraction’

I’ve been trying to follow an NLP notebook, and they use: However, this is throwing the following error: My guess is that stop_words is not (or maybe no longer) part of the ‘feature_extraction’ part of sklearn, but I might be wrong. I have seen some articles that used sklearn.feature_extraction.stop_words, but at the same time I see places which have used
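The notebook's import line and the error text are not shown in this excerpt, so this is only a sketch of an alternative that works in current scikit-learn releases: the built-in English stop word list is exposed as ENGLISH_STOP_WORDS in sklearn.feature_extraction.text, and vectorizers accept stop_words="english". The sample sentence is made up.

```python
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, CountVectorizer

# The stop word list is a frozenset of lowercase strings.
print("the" in ENGLISH_STOP_WORDS)

# Vectorizers can apply the same list by name.
vec = CountVectorizer(stop_words="english")
vec.fit(["the cat sat on the mat"])
vocab = sorted(vec.vocabulary_)
print(vocab)
```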
