So, I have a pretty large dataframe with 85 columns and almost 90,000 rows, and I want to apply str.lower() to every column. However, several columns contain numerical data. Is there an easy solution for this?
> df
A B C
0 10 John Dog
1 12 Jack Cat
2 54 Mary Monkey
3 23 Bob Horse
Then, after using something like df.applymap(str.lower), I would get:
> df
A B C
0 10 john dog
1 12 jack cat
2 54 mary monkey
3 23 bob horse
Currently it’s showing this error message:
TypeError: descriptor 'lower' requires a 'str' object but received a 'int'
Answer
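From pandas 1.0 onward you can efficiently select string-only columns: convert_dtypes() infers the dedicated string dtype for text columns, and select_dtypes("string") then picks out just those columns: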
string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())
df
A B C
0 10 john dog
1 12 jack cat
2 54 mary monkey
3 23 bob horse
df.dtypes
A int64
B string
C string
dtype: object
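This avoids operating on non-string data: the numeric column A is never passed to str.lower(), so the TypeError does not occur. Note that B and C end up with the string dtype rather than object.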
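For reference, here is the same idea as a minimal, self-contained sketch using the sample data from the question. The helper name lowercase_string_columns is hypothetical (it is not part of pandas), and returning a copy instead of modifying df in place is an assumption made for illustration:

```python
import pandas as pd


def lowercase_string_columns(df):
    """Return a copy of df with every string column lower-cased.

    Hypothetical helper for illustration; not a pandas API.
    """
    out = df.copy()
    # convert_dtypes() infers the dedicated "string" dtype for text columns,
    # so select_dtypes("string") keeps only those and skips numeric ones.
    string_cols = out.convert_dtypes().select_dtypes("string")
    if len(string_cols.columns) > 0:
        out[string_cols.columns] = string_cols.apply(lambda col: col.str.lower())
    return out


# Sample data from the question.
df = pd.DataFrame(
    {
        "A": [10, 12, 54, 23],
        "B": ["John", "Jack", "Mary", "Bob"],
        "C": ["Dog", "Cat", "Monkey", "Horse"],
    }
)

result = lowercase_string_columns(df)
print(result)
```

Columns that mix strings with other Python objects are generally not inferred as string by convert_dtypes(), so they would be left untouched by this sketch.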