In some circumstances the format (int, float, etc.) of a cell is lost when accessing it via its row.
In the example below the first column holds integers and the second holds floats, but the 111 is converted into 111.0.
dfA = pandas.DataFrame({
    'A': [111, 222, 333],
    'B': [1.3, 2.4, 3.5],
})

# A    111.0
# B      1.3
# Name: 0, dtype: float64
print(dfA.loc[0])

# <class 'numpy.float64'>
print(type(dfA.loc[0].A))
The output I would expect is like this:
A    111
B    1.3
<class 'numpy.int64'>
I have an idea why this happens, but IMHO this isn't user friendly. Can I solve this somehow? The goal is to access (e.g. read) each cell's value without losing its format.
In the full code below you can also see that the integer is preserved when one of the columns is of type string. Weird.
Minimal Working Example
#!/usr/bin/env python3
import pandas

dfA = pandas.DataFrame({
    'A': [111, 222, 333],
    'B': [1.3, 2.4, 3.5],
})
print(dfA)

dfB = pandas.DataFrame({
    'A': [111, 222, 333],
    'B': [1.3, 2.4, 3.5],
    'C': ['one', 'two', 'three']
})
print(dfB)

print(dfA.loc[0])
print(type(dfA.loc[0].A))

print(dfB.loc[0])
print(type(dfB.loc[0].A))
Output
     A    B
0  111  1.3
1  222  2.4
2  333  3.5
     A    B      C
0  111  1.3    one
1  222  2.4    two
2  333  3.5  three
A    111.0
B      1.3
Name: 0, dtype: float64
<class 'numpy.float64'>
A    111
B    1.3
C    one
Name: 0, dtype: object
<class 'numpy.int64'>
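The difference between the two rows can be reproduced in isolation. A row returned by `.loc[0]` is a Series, and a Series has a single dtype, so the values from all columns have to fit into one type. The sketch below uses `numpy.result_type` as a stand-in for that promotion rule; this is an assumption for illustration, and pandas's internal logic may differ in details. The names `df_num`, `df_mixed`, and `common` are made up for this example.

```python
import numpy as np
import pandas as pd

# Same data as dfA and dfB above, under hypothetical names.
df_num = pd.DataFrame({"A": [111, 222, 333], "B": [1.3, 2.4, 3.5]})
df_mixed = df_num.assign(C=["one", "two", "three"])

# Each column keeps its own dtype inside the DataFrame.
print(df_num.dtypes)

# An integer and a float column share a common numeric type, so a row
# built from both can be stored as floats.
common = np.result_type(df_num["A"].dtype, df_num["B"].dtype)
print(common)  # float64

# With a string column in the mix there is no common numeric type, so the
# row falls back to object dtype and each value keeps its own type.
row_num = df_num.loc[0]
row_mixed = df_mixed.loc[0]
print(row_num.dtype)    # float64
print(row_mixed.dtype)  # object
```

This would explain why adding the string column appears to "fix" the problem: the row becomes an object Series, which stores the integer as it is.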
Answer
Since you only need to read rows, here is one way to retain the formatting of a Series with mixed types using Pandas astype and object:
dfA = dfA.astype(object)
Then:
print(dfA.loc[0, :])

# Output
A    111
B    1.3
Name: 0, dtype: object
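If converting the whole DataFrame to object is not desirable, two other read-only approaches may be worth trying. This is a sketch of alternatives, not part of the approach above: `DataFrame.at` reads a single cell by row and column label, and `DataFrame.itertuples` yields each row as a namedtuple built from the individual columns. Neither goes through a single-dtype row Series, so the integer should not be turned into a float. The variable names `df`, `cell`, and `row` are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [111, 222, 333], "B": [1.3, 2.4, 3.5]})

# Scalar access by row label and column label: the value is read from
# column "A" directly, so it stays an integer.
cell = df.at[0, "A"]
print(type(cell), cell)

# itertuples() yields one namedtuple per row; each field is taken from its
# own column, so the integer and the float are not promoted to one type.
row = next(df.itertuples(index=False))
print(row.A, row.B)

# The DataFrame itself is left unchanged: column "A" is still an integer
# column and column "B" is still a float column.
print(df.dtypes)
```

The exact integer class returned may be a NumPy integer or a plain Python `int` depending on the access path, so code that checks types should accept both.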