
Apply transformation only on string columns with Pandas, ignoring numeric data

I have a fairly large DataFrame with 85 columns and almost 90,000 rows, and I want to apply str.lower() to all of its columns. However, several columns contain numerical data. Is there an easy way to lowercase only the string columns?

> df

    A     B       C
0  10  John     Dog
1  12  Jack     Cat
2  54  Mary  Monkey
3  23   Bob   Horse

Then, after using something like df.applymap(str.lower), I would like to get:

> df

    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse

Currently it’s showing this error message:

TypeError: descriptor 'lower' requires a 'str' object but received a 'int'


Answer

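Since pandas 1.0 you can select string-only columns with select_dtypes("string"). In pandas 1.x and 2.x, text columns are stored as object by default, so call convert_dtypes() first to convert them to the dedicated string dtype: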

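# Convert text columns to the "string" dtype, then keep only those columns
string_dtypes = df.convert_dtypes().select_dtypes("string")
# Lowercase each selected column and write it back into the original frame
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())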

df
    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse

df.dtypes

A     int64
B    string
C    string
dtype: object

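Because only the string columns are selected and assigned back, numeric columns such as A are never passed to str.lower() and keep their original dtype.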
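
If you would rather not convert dtypes at all, a variation is to test each column with pandas.api.types.is_string_dtype and lowercase only the values that are actual str objects. The following is a minimal sketch rather than the answer's method: the helper name lowercase_text_columns is hypothetical, and the sample frame is copied from the question.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Sample frame copied from the question.
df = pd.DataFrame({
    "A": [10, 12, 54, 23],
    "B": ["John", "Jack", "Mary", "Bob"],
    "C": ["Dog", "Cat", "Monkey", "Horse"],
})


def lowercase_text_columns(frame):
    """Hypothetical helper: return a copy with text columns lowercased."""
    out = frame.copy()
    for col in out.columns:
        # Skip numeric (and other non-text) columns entirely.
        if not is_string_dtype(out[col]):
            continue
        # Guard each value so stray non-string cells (e.g. NaN) pass through.
        out[col] = out[col].map(
            lambda v: v.lower() if isinstance(v, str) else v
        )
    return out


result = lowercase_text_columns(df)
print(result)
```

This version skips the convert_dtypes() step and returns a new frame instead of modifying df in place. The isinstance check is there because, depending on the pandas version, an object column may be reported as a string column even when it holds a few non-string values.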

User contributions licensed under: CC BY-SA