I have a fairly large DataFrame with 85 columns and almost 90,000 rows, and I want to apply str.lower() to all of them. However, several columns contain numerical data, which str.lower() cannot handle. Is there an easy solution for this?
    > df

        A     B       C
    0  10  John     Dog
    1  12  Jack     Cat
    2  54  Mary  Monkey
    3  23   Bob   Horse
Then, after using something like df.applymap(str.lower), I would get:
    > df

        A     B       C
    0  10  john     dog
    1  12  jack     cat
    2  54  mary  monkey
    3  23   bob   horse
Currently it’s showing this error message:
    TypeError: descriptor 'lower' requires a 'str' object but received a 'int'
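The error comes from calling str.lower directly on a non-string cell. A minimal sketch of that behaviour, without pandas, might look like the following; try_lower is a hypothetical helper introduced only for illustration, and the exact wording of the TypeError message varies between Python versions.

```python
# str.lower is a method of the str type, so calling it through the class
# with a non-str argument (such as an int cell) raises a TypeError.
def try_lower(value):
    """Hypothetical helper: return the lowercased value or the error text."""
    try:
        return str.lower(value)
    except TypeError as exc:
        return f"TypeError: {exc}"

print(try_lower("John"))  # a str is lowercased as expected
print(try_lower(10))      # an int triggers the TypeError
```

This is why applying str.lower to every cell fails as soon as it reaches a numeric column.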
Answer
From pandas 1.0 onward, you can efficiently select string-only columns by calling convert_dtypes() and then select_dtypes("string"):
    string_dtypes = df.convert_dtypes().select_dtypes("string")
    df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())

    df
        A     B       C
    0  10  john     dog
    1  12  jack     cat
    2  54  mary  monkey
    3  23   bob   horse

    df.dtypes

    A     int64
    B    string
    C    string
    dtype: object
This avoids operating on non-string data.
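For a self-contained version of the same idea, here is a minimal sketch that rebuilds the example frame from the question and wraps the two lines in a function. The name lowercase_string_columns is an assumption made for illustration, not part of pandas, and the sketch assumes a pandas version where convert_dtypes() yields columns that select_dtypes("string") picks up, as in the answer.

```python
import pandas as pd

def lowercase_string_columns(df):
    """Hypothetical helper: return a copy of df with string columns lowercased.

    Non-string columns (for example integers) are left untouched.
    """
    out = df.copy()
    # convert_dtypes() infers the dedicated string dtype for text columns,
    # so select_dtypes("string") keeps only those columns.
    string_cols = out.convert_dtypes().select_dtypes("string")
    for col in string_cols.columns:
        out[col] = string_cols[col].str.lower()
    return out

# Example data taken from the question.
df = pd.DataFrame({
    "A": [10, 12, 54, 23],
    "B": ["John", "Jack", "Mary", "Bob"],
    "C": ["Dog", "Cat", "Monkey", "Horse"],
})

result = lowercase_string_columns(df)
print(result)
```

Working on a copy keeps the original frame unchanged, which can make it easier to compare the result before and after on a large dataset.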