
Cannot seem to pass pandas DataFrame into feature_engine.selection.DropHighPSIFeatures fit method correctly

I could not get my code for calculating PSI values to work, and I am not very familiar with the feature_engine library or with ML-related operations in general.

The code I am currently trying to run is:

JavaScript

The error message returned is:

JavaScript

The output of the DataFrame print statement in the previous code snippet is:

JavaScript

So I assumed there is nothing problematic in the DataFrame itself (apart from, maybe, the Unnamed: 0_y column).

However, just in case, this is how I create the DataFrame from three long-list CSV files and a key-mapping CSV file:

JavaScript


Answer

It turns out that the problem was caused either by the data in the DataFrame (long_list) being too sparse (too many NaN values) or by the DataFrame being too large. I haven’t run the experiment to figure out which one, but the problem was resolved when I dropped the columns with a lot of NaN values.
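For illustration, here is a minimal sketch of that kind of clean-up using only pandas. The helper name `drop_sparse_columns`, the 50% threshold, and the toy data are assumptions made for this example; the original code is not shown, so this is not necessarily how the columns were actually dropped.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, max_nan_frac: float = 0.5) -> pd.DataFrame:
    """Return a copy of df without the columns whose NaN fraction exceeds max_nan_frac.

    Hypothetical helper: the 0.5 default is an arbitrary choice, not a recommendation.
    """
    nan_frac = df.isna().mean()  # per-column fraction of missing values
    keep = nan_frac[nan_frac <= max_nan_frac].index
    return df.loc[:, keep].copy()


# Toy stand-in for the long_list DataFrame from the question (made-up values).
long_list = pd.DataFrame(
    {
        "dense": [1.0, 2.0, 3.0, 4.0],
        "half_missing": [1.0, np.nan, 3.0, np.nan],
        "mostly_missing": [np.nan, np.nan, np.nan, 4.0],
    }
)

cleaned = drop_sparse_columns(long_list, max_nan_frac=0.5)
print(list(cleaned.columns))
```

The cleaned DataFrame would then be the one passed to the `fit` method of `DropHighPSIFeatures`. As far as I can tell, the transformer also accepts a `missing_values` argument (`"raise"` by default, or `"ignore"`), which may be worth trying on the remaining NaN values; check the documentation for your installed version.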

User contributions licensed under: CC BY-SA