Standardizing a set of columns in a pandas dataframe with sklearn

Question

I have a table with four columns: CustomerID, Recency, Frequency and Revenue. I need to standardize (scale) the columns Recency, Frequency and Revenue while keeping the column CustomerID. I used this code: But the result is a table without the column CustomerID. Is there any way to get a table with the corresponding CustomerID and the scaled columns?

Accepted Answer

fit_transform returns an ndarray with no index, so you lose the index you set with df.set_index('CustomerID', inplace=True).

Instead of doing this, you can simply take the subset of columns you need to transform, pass them to a StandardScaler instance, and overwrite the original columns.

    from sklearn.preprocessing import StandardScaler

    # Subset of columns to transform
    cols = ['Recency', 'Frequency', 'Revenue']

    # Overwrite old columns with transformed columns
    # (StandardScaler must be instantiated before calling fit_transform)
    df[cols] = StandardScaler().fit_transform(df[cols])

This way, you leave CustomerID completely unchanged.
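To make the approach concrete, here is a minimal, self-contained sketch. The sample values are hypothetical (the question does not show its data), and the column names are taken from the question. It scales only the three numeric columns and checks that CustomerID is still there afterwards.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample data; the values are made up for illustration only.
df = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104],
    "Recency": [10.0, 20.0, 30.0, 40.0],
    "Frequency": [1.0, 2.0, 3.0, 4.0],
    "Revenue": [100.0, 200.0, 300.0, 400.0],
})

# Columns to standardize; CustomerID is deliberately left out.
cols = ["Recency", "Frequency", "Revenue"]

# Instantiate the scaler, then fit and transform only the selected columns.
scaler = StandardScaler()
df[cols] = scaler.fit_transform(df[cols])

# CustomerID is untouched; the other columns now have mean 0 and unit
# (population) standard deviation.
print(df)
```

Keeping the scaler in a variable, rather than calling StandardScaler().fit_transform(...) inline, is a small design choice: it lets you reuse the fitted means and variances later, for example with scaler.transform on new rows.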


Answer