I am trying to create a pipeline that first imputes missing data, then does oversampling with SMOTE, and then fits the model. My code worked perfectly before I tried SMOTE; now I can't find any solution. Here is the code without SMOTE, and here's the code after adding SMOTE. Note: I tried importing make_pipeline from imblearn when I import
Tag: scikit-learn
How to create a for loop with checking appended models
I have a list of models that I iterate through in a for loop, getting their performances. I've added CatBoost to my model list, but when I try to add its best estimator to a dictionary, I get an error that no other model gives me (TypeError: unhashable type: 'CatBoostRegressor'). Googling hasn't turned up a clear way around this.
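The error appears when the estimator object itself is used as a dictionary key; CatBoostRegressor defines equality without hashing, so it cannot be a key. The common workaround is to key the dictionary by a string name and store the estimator as the value. A minimal sketch (plain sklearn models stand in for CatBoost so the example is self-contained):

```python
# Key results by the model's class name (a hashable string), not by the
# estimator instance, which may be unhashable (as CatBoostRegressor is).
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

models = [LinearRegression(), DecisionTreeRegressor(random_state=0)]

best_estimators = {}
for model in models:
    name = type(model).__name__    # e.g. "LinearRegression" -- hashable
    best_estimators[name] = model  # the estimator goes in the value, not the key

print(sorted(best_estimators))
```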
Why isn’t this Linear Regression line a straight line?
I have points with x and y coordinates that I want to fit a straight line to with Linear Regression, but I get a jagged-looking line. I am attempting to use LinearRegression from sklearn. To create the points, I run a for loop that randomly creates one hundred points into an array that is 100 x 2 in shape. I slice
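The usual cause of a jagged regression line is plotting predictions against x values that are in random order, so the line doubles back on itself. Sorting x before predicting (or plotting) restores the straight line. A minimal sketch under that assumption, with the 100 x 2 random array generated directly rather than in a loop:

```python
# The fit itself is a straight line; the plot looks jagged only because the
# x values are unsorted. Sort by x before drawing the prediction line.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
pts = rng.random((100, 2))        # 100 x 2 array of random points
X, y = pts[:, [0]], pts[:, 1]     # first column as feature, second as target

model = LinearRegression().fit(X, y)

order = np.argsort(X[:, 0])       # indices that sort x left to right
X_sorted = X[order]
y_line = model.predict(X_sorted)  # now plt.plot(X_sorted, y_line) is straight

print(y_line[:3])
```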
Is RandomOverSampler Causing my Model to Overfit?
I am attempting to see how well I can classify books according to genre using TfidfVectorizer. I am using five moderately imbalanced genre labels, and I want to use multilabel classification to assign each document one or more genres. Initially my performance was middling, so I tried to fix this by re-balancing the classes with RandomOverSampler, and my cross-validated
Fit/transform separate sklearn transformers to partitions of single column
Use case: I have time series data for multiple assets (e.g. AAPL, MSFT) and multiple features (e.g. MACD, volatility, etc.). I am building an ML model to make classification predictions on a subset of this data. Problem: for each asset and feature, I want to fit and apply a transformation. For example: for volatility, I want to fit a
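One straightforward pattern is to fit one transformer per partition with pandas `groupby` and keep the fitted objects in a dict so the same transform can be reapplied to test data. A minimal sketch; the column and asset names are illustrative, not from the original post:

```python
# Fit a separate StandardScaler per asset on one column, keeping each
# fitted scaler keyed by asset for later reuse on test data.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "asset": ["AAPL"] * 3 + ["MSFT"] * 3,
    "volatility": [0.1, 0.2, 0.3, 1.0, 2.0, 3.0],
})

scalers = {}
parts = []
for asset, grp in df.groupby("asset"):
    scaler = StandardScaler().fit(grp[["volatility"]])
    scalers[asset] = scaler              # reuse with scalers[asset].transform(...) later
    out = grp.copy()
    out["volatility"] = scaler.transform(grp[["volatility"]])
    parts.append(out)

df_scaled = pd.concat(parts).sort_index()  # restore original row order
print(df_scaled)
```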
Is there a way to use mutual information as part of a pipeline in scikit learn?
I’m creating a model with scikit-learn. The pipeline that seems to be working best is: mutual_info_classif with a threshold (i.e. only include fields whose mutual information score is above a given threshold), then PCA, then LogisticRegression. I’d like to do them all using sklearn’s Pipeline object, but I’m not sure how to get the mutual information classification in. For the second
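Mutual information slots into a pipeline via `SelectKBest` (or `SelectPercentile`) with `mutual_info_classif` as the score function; an exact score threshold is not built in, but `k`/`percentile` approximate one, and `functools.partial` can fix the scorer's `random_state`. A minimal sketch on synthetic data:

```python
# Mutual-information feature selection as a Pipeline step, followed by
# PCA and LogisticRegression, matching the described flow.
from functools import partial
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

pipe = Pipeline([
    # keep the 10 features with the highest mutual information scores
    ("mi", SelectKBest(partial(mutual_info_classif, random_state=0), k=10)),
    ("pca", PCA(n_components=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
print(pipe.score(X, y))
```

For a hard threshold rather than a top-k cut, a small custom transformer wrapping `mutual_info_classif` would be needed.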
Constrained Multi-Linear Regression using Gekko
I have a multilinear regression problem where I have prior information about the range of the output (dependent variable y): the prediction must always lie in that range. I want to find the coefficients (upper and lower bound) of each feature (independent variable) so that the linear regression model is restricted to the desired output range.
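In Gekko the same idea is expressed with `m.FV` coefficients and `m.Equation` bounds on the predictions; to keep this example self-contained it is sketched with SciPy instead: least-squares coefficients subject to every training prediction staying inside [y_lo, y_hi]. The data and bounds are illustrative:

```python
# Constrained least squares: minimize squared error subject to the
# linear constraints y_lo <= X @ beta <= y_hi for every training row.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.random((50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.05 * rng.standard_normal(50)
y_lo, y_hi = 0.0, 1.5              # required output range (illustrative)

def sse(beta):
    return np.sum((X @ beta - y) ** 2)

cons = [
    {"type": "ineq", "fun": lambda b: X @ b - y_lo},  # predictions >= y_lo
    {"type": "ineq", "fun": lambda b: y_hi - X @ b},  # predictions <= y_hi
]
res = minimize(sse, x0=np.zeros(3), constraints=cons)  # SLSQP by default
pred = X @ res.x
print(pred.min(), pred.max())
```

Note this only guarantees the range on the training rows; for a hard guarantee on unseen inputs, the constraint must hold over the whole feasible input domain (e.g. at its corner points if inputs are bounded).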
Linear regression prediction based on group of data in test set
I have a simple dataset which looks like this: I created a simple LR model to train and predict the target variable “sales”, and I used MAE to evaluate the model. My code works well, but what I want to do is predict the sales in X_test grouped by hour of the day. In the above dataset example,
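Per-group evaluation usually comes down to attaching the predictions to the test frame and letting pandas `groupby` compute the MAE per hour. A minimal sketch; the column names ("hour", "promo", "sales") and tiny dataset are illustrative:

```python
# Fit once, then compute MAE separately for each hour of the day
# by grouping the absolute errors on the hour column.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "hour": [0, 0, 1, 1, 2, 2],
    "promo": [0, 1, 0, 1, 0, 1],
    "sales": [10.0, 12.0, 20.0, 24.0, 30.0, 33.0],
})
X, y = df[["hour", "promo"]], df["sales"]
model = LinearRegression().fit(X, y)

eval_df = df.assign(pred=model.predict(X))
abs_err = (eval_df["sales"] - eval_df["pred"]).abs()
mae_by_hour = abs_err.groupby(eval_df["hour"]).mean()  # one MAE per hour
print(mae_by_hour)
```

In practice `eval_df` would be built from `X_test`, `y_test`, and `model.predict(X_test)` rather than the training data.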
Extracting feature names from sklearn column transformer
I’m using sklearn.pipeline to transform my features and fit a model, so my general flow looks like this: column transformer –> general pipeline –> model. I would like to be able to extract feature names from the column transformer (since the following step, the general pipeline, applies the same transformation to all columns, e.g. nan_to_zero) and use them for model explainability
Eli5.Sklearn PermutationImportance() — TypeError: check_cv() takes from 0 to 2 positional arguments but 3 were given
I am running permutation importance from eli5.sklearn. I keep getting this error: I am unsure how to go about this, as I am only passing 2 arguments into perm.fit(). Any advice would be appreciated. Thank you. link to error message image Answer This is a known error, fixed in the master branch of the spinoff repo, but not yet
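Until a fixed eli5 release is available, scikit-learn's own `sklearn.inspection.permutation_importance` covers the same use case without the `check_cv` incompatibility. A minimal sketch on synthetic data:

```python
# Built-in alternative to eli5's PermutationImportance: shuffle each
# feature n_repeats times and measure the drop in score.
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
print(result.importances_mean)  # one mean importance per feature
```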