How can I convert np.nan into the new pd.NA format, given that the pd.DataFrame contains floats?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])
df.iloc[0, 1] = 1.5
df.iloc[3, 0] = 4.7

df = df.convert_dtypes()

type(df.iloc[0, 0])  # numpy.float64 - I am expecting pd.NA
Calling df.convert_dtypes() doesn't seem to work when df contains floats. The conversion works fine, however, when df contains ints.
Answer
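Since pandas v1.2, convert_dtypes() works with floats by default: float columns are converted to the nullable Float64 dtype, which represents missing values as pd.NA. If you want integers instead, pass the convert_floating=False parameter.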
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])
df.iloc[0, 1] = 1.5
df.iloc[3, 0] = 4.7

df = df.convert_dtypes()
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   A       1 non-null      Float64
 1   B       1 non-null      Float64
dtypes: Float64(2)
memory usage: 104.0 bytes
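To check the exact thing the question asks about, the sketch below looks at a missing cell after the conversion instead of reading the dtype from df.info(). It assumes pandas 1.2 or later; the names converted, missing_is_na and dtype_names are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same frame as in the question: all NaN except two float entries.
df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=["A", "B"])
df.iloc[0, 1] = 1.5
df.iloc[3, 0] = 4.7

# Assumes pandas >= 1.2, where float columns become nullable Float64.
converted = df.convert_dtypes()

# A missing entry in a Float64 column is the pd.NA singleton,
# so an identity check is enough.
missing_is_na = converted.iloc[0, 0] is pd.NA
dtype_names = [str(t) for t in converted.dtypes]

print(missing_is_na)
print(dtype_names)
```

If missing_is_na is False, the installed pandas is most likely older than 1.2 and the columns are still plain float64.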
Working with ints
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=['A', 'B'])
df.iloc[0, 1] = 1
df.iloc[3, 0] = 4

df = df.convert_dtypes(convert_floating=False)
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   A       1 non-null      Int64
 1   B       1 non-null      Int64
dtypes: Int64(2)
memory usage: 104.0 bytes
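One related detail, offered as a hedged aside rather than as part of the original answer: as far as the pandas documentation describes it, convert_dtypes() prefers an integer dtype whenever every non-missing float is a whole number, so the default call may already return Int64 for this data. If a nullable float column is wanted regardless of the values, an explicit astype("Float64") is one option. The sketch assumes pandas 1.2 or later; by_default and forced_float are illustrative names.

```python
import numpy as np
import pandas as pd

# Float columns that happen to hold only whole numbers and NaN.
df = pd.DataFrame(np.nan, index=[0, 1, 2, 3], columns=["A", "B"])
df.iloc[0, 1] = 1
df.iloc[3, 0] = 4

# Default conversion: whole-number floats are cast to nullable integers.
by_default = df.convert_dtypes()

# Explicit cast: keeps the columns as nullable floats (assumes pandas >= 1.2).
forced_float = df.astype("Float64")

print(by_default.dtypes)
print(forced_float.dtypes)
```

The explicit cast avoids depending on the values in the column, which can matter when the same code runs on data that only sometimes contains fractional numbers.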