I have a fairly large DataFrame with 85 columns and almost 90,000 rows, and I want to apply str.lower() to all of them. However, several columns contain numerical data. Is there an easy solution for this?
> df
    A     B       C
0  10  John     Dog
1  12  Jack     Cat
2  54  Mary  Monkey
3  23   Bob   Horse
Then, after using something like df.applymap(str.lower), I would get:
> df
    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse
Currently it’s showing this error message:
TypeError: descriptor 'lower' requires a 'str' object but received a 'int'
Answer
From pandas 1.X, you can efficiently select string-only columns by calling convert_dtypes() and then select_dtypes("string"):
string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())

df
    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse

df.dtypes
A     int64
B    string
C    string
dtype: object
This avoids operating on non-string data.
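For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The helper name lowercase_string_columns is made up for illustration (it is not a pandas function), and the sample frame is just the one from the question. It works on a copy so the original frame stays untouched.

```python
import pandas as pd


def lowercase_string_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with its string columns lower-cased.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    # convert_dtypes() infers the dedicated "string" dtype for text columns,
    # which lets select_dtypes("string") pick out only those columns.
    string_cols = out.convert_dtypes().select_dtypes("string")
    # Lower-case each selected column and write it back; numeric columns
    # are never touched.
    out[string_cols.columns] = string_cols.apply(lambda col: col.str.lower())
    return out


# Sample data from the question.
df = pd.DataFrame(
    {
        "A": [10, 12, 54, 23],
        "B": ["John", "Jack", "Mary", "Bob"],
        "C": ["Dog", "Cat", "Monkey", "Horse"],
    }
)

result = lowercase_string_columns(df)
print(result)
```

Note that the lower-cased columns come back with the "string" extension dtype rather than object, as in the dtypes output above. If you need plain object columns, you would have to cast them back yourself.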