Is there a more efficient way to find and downgrade int64 columns with to_numeric() in Python Pandas?

Question

tl;dr: Need help cleaning up my downcast_int(df) function below. Hello, I'm trying to write my own downcasting functions to save memory usage. I am curious about alternatives to my (frankly, quite messy, but functioning) code, to make it more readable and, perhaps, faster. The downcasting functi…

Accepted Answer

Just apply to_numeric() twice: once to get each numeric column down to its smallest signed type, then a second time to reduce it to an unsigned type where the values allow.

    df2 = df.select_dtypes(include=[np.number]).apply(pd.to_numeric, downcast='signed')
    df2 = df2.select_dtypes(include=[np.number]).apply(pd.to_numeric, downcast='unsigned')
    df[df2.columns] = df2

Same output as your method:

     #   Column  Non-Null Count  Dtype
    ---  ------  --------------  -----
     0   first   2 non-null      uint32
     1   second  2 non-null      int32
     2   third   2 non-null      object
     3   fourth  2 non-null      float64
     4   fifth   2 non-null      int8
    dtypes: float64(1), int32(1), int8(1), object(1), uint32(1)
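For reference, here is a minimal, self-contained sketch of the same two-pass idea wrapped in a helper. The function name downcast_numeric and the sample values are hypothetical (the original example df is not reproduced here); only select_dtypes() and pd.to_numeric(..., downcast=...) come from the answer.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df):
    """Return a copy of df with numeric columns downcast in two passes.

    Hypothetical helper name; the two passes follow the answer above.
    """
    out = df.copy()
    numeric = out.select_dtypes(include=[np.number])
    # Pass 1: shrink to the smallest signed integer type that holds the values.
    numeric = numeric.apply(pd.to_numeric, downcast="signed")
    # Pass 2: switch to the smallest unsigned type when no value is negative.
    numeric = numeric.apply(pd.to_numeric, downcast="unsigned")
    # Assign column by column so each column takes its new dtype.
    for col in numeric.columns:
        out[col] = numeric[col]
    return out


# Made-up sample data, chosen only to exercise each case.
sample = pd.DataFrame(
    {
        "first": [3_000_000_000, 70_000],  # too big for int32, non-negative
        "second": [-70_000, 5],            # has a negative value
        "third": ["a", "b"],               # non-numeric, left untouched
        "fourth": [1.5, 2.5],              # non-integer floats, left as float64
        "fifth": [-1, 100],                # small values with a negative
    }
)
result = downcast_numeric(sample)
print(result.dtypes)
```

Working on a copy keeps the caller's DataFrame unchanged; drop the copy() call if in-place modification is preferred, as in the answer.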


Example df

df.info()

Downcasting function

df.info() after downcasting

Answer