I have a DataFrame with two columns: a column of int and a column of str.
- I understand that if I insert NaN into the int column, Pandas will convert all the int into float because there is no NaN value for an int.
- However, when I insert None into the str column, Pandas converts all my int to float as well. This doesn’t make sense to me: why does the value I put in column 2 affect column 1?
Here’s a simple working example:
    import pandas as pd

    df = pd.DataFrame()
    df["int"] = pd.Series([], dtype=int)
    df["str"] = pd.Series([], dtype=str)

    df.loc[0] = [0, "zero"]
    print(df)
    print()

    df.loc[1] = [1, None]
    print(df)
The output is:
       int   str
    0    0  zero

       int   str
    0  0.0  zero
    1  1.0   NaN
Is there any way to make the output the following:
       int   str
    0    0  zero

       int   str
    0    0  zero
    1    1   NaN
without recasting the first column to int?
- I prefer using int instead of float because the actual data in that column are integers. If there’s no workaround, I’ll just use float though.
- I prefer not having to recast because in my actual code, I don’t store the actual dtype.
- I also need the data inserted row-by-row. 
Answer
If you set dtype=object, your series will be able to contain arbitrary data types:
    df["int"] = pd.Series([], dtype=object)
    df["str"] = pd.Series([], dtype=str)

    df.loc[0] = [0, "zero"]
    print(df)
    print()

    df.loc[1] = [1, None]
    print(df)

       int   str
    0    0  zero
    1  NaN   NaN

       int   str
    0    0  zero
    1    1  None
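To see why dtype=object helps, here is a minimal sketch that compares how an integer column and an object column react to a missing value. It builds each Series directly instead of inserting row by row, so it only illustrates the dtype behaviour and is not a drop-in replacement for the code above. The variable names are made up for the example. The last part uses the nullable "Int64" extension dtype, which is not part of the answer above; it is shown only as a possible alternative, and I have not checked how it behaves with row-by-row df.loc insertion.

```python
import pandas as pd

# A NumPy-backed int64 column has no way to represent a missing value,
# so introducing one (here through reindex) upcasts the column to float64.
ints = pd.Series([0, 1], dtype="int64")
upcast = ints.reindex([0, 1, 2])  # label 2 does not exist, so it becomes NaN
print(upcast.dtype)

# An object column stores arbitrary Python objects, so the integers
# stay Python ints and None can sit next to them.
obj = pd.Series([0, 1, None], dtype=object)
print(obj.dtype)
print(type(obj.iloc[0]))

# Assumption: the nullable "Int64" extension dtype (capital I) is an
# alternative that keeps integers and marks missing entries as <NA>.
nullable = pd.Series([0, 1, None], dtype="Int64")
print(nullable.dtype)
```

Keep in mind that an object column gives up the speed of vectorized numeric operations, so it is mostly worth it when preserving the original values matters more than performance.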