Determine whether the Columns of a Dataset are invariant under any given Scikit-Learn Transformer

Question

Given an sklearn transformer t, is there a way to determine whether t changes the columns or the column order of any given input dataset X, without applying it to the data? For example, with t = sklearn.preprocessing.StandardScaler there is a 1-to-1 mapping between the columns of X and t.transform(X), namely X[:, i] -> t.transform(X)[:, i], whereas this is obviously not the case for

Accepted Answer

Not all of your "transformers" have the .get_feature_names_out method. Its implementation is discussed in the sklearn github. In the same link, you can see there is, to quote @thomasjpfan, a _OneToOneFeatureMixin class used by transformers with a simple one-to-one correspondence between input and output features.

Restricted to sklearn, we can check whether the transformer or estimator is a subclass of _OneToOneFeatureMixin, for example:

    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler
    from sklearn.feature_selection import SelectKBest
    from sklearn.base import _OneToOneFeatureMixin

    # Instances to inspect; none of them is fitted or applied to data
    tf = {'pca': PCA(), 'standardscaler': StandardScaler(), 'kbest': SelectKBest()}

    # Check each instance's class against the one-to-one mixin
    [i + ":" + str(issubclass(type(tf[i]), _OneToOneFeatureMixin)) for i in tf.keys()]

    ['pca:False', 'standardscaler:True', 'kbest:False']

This would be the source code for _OneToOneFeatureMixin

Answer