I want to change the 'not available' value in a DataFrame column to 0 and convert the rest of the values to integers.
Unique values in the column are:
['30', 'not available', '45', '60', '40', '90', '21', '5','75','29', '8', '10']
I run the following code to change values to integers:
df[col] = np.where(df[col] == 'not available',0,df[col].astype(int))
I expected the above to turn all values into integers, yet I get a ValueError:
ValueError: invalid literal for int() with base 10: 'not available'
Any suggestions as to why the code does not work?
Answer
Before doing
df[col] = np.where(df[col] == 'not available',0,df[col].astype(int))
can run, it is necessary to compute each of its arguments:
df[col] == 'not available'
0
df[col].astype(int)
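The last of these converts every value in the column to int before np.where ever looks at the condition. That conversion fails, because 'not available' does not make sense as an integer. You can avoid this problem by using pandas.Series.apply combined with a lambda that holds a conditional (ternary) expression, as follows: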
import pandas as pd

df = pd.DataFrame({"col1": ['30', 'not available', '45', '60', '40', '90', '21', '5', '75', '29', '8', '10']})
col = "col1"
# int() is only called on values that are not the placeholder string
df[col] = df[col].apply(lambda x: 0 if x == 'not available' else int(x))
print(df)
output
    col1
0     30
1      0
2     45
3     60
4     40
5     90
6     21
7      5
8     75
9     29
10     8
11    10
This way, int is applied only to the records that are not equal to 'not available'.
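
Since the error comes from astype(int) being evaluated on the whole column, another option is to deal with the placeholder before converting. The sketch below is not part of the answer above; it assumes the same sample data and shows two vectorized variants. The names cleaned and coerced are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": ['30', 'not available', '45', '60', '40', '90',
                            '21', '5', '75', '29', '8', '10']})
col = "col1"

# Variant 1: swap the placeholder for the string '0' first,
# so that every remaining value can be parsed as an integer.
cleaned = df[col].replace('not available', '0').astype(int)

# Variant 2: turn anything that is not numeric into NaN, then fill with 0.
# Note that this maps *every* unparseable value to 0, not just 'not available'.
coerced = pd.to_numeric(df[col], errors='coerce').fillna(0).astype(int)

print(cleaned.tolist())
print(coerced.tolist())
```

The first variant is the closer match to the original intent, because only the exact string 'not available' becomes 0 and any other unexpected value still raises an error. The second is more forgiving, which may or may not be what you want.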