Why do Pandas dataframe’s data types change after exporting into a CSV file

I exported the following dataframe in Google Colab. Whichever method I use, when I import the file later, the column comes back as a pandas.core.series.Series, not as an array.

[screenshot: the dataframe before export]

from google.colab import drive

drive.mount('/content/drive')
path = '/content/drive/My Drive/output.csv'

with open(path, 'w', encoding = 'utf-8-sig') as f:
  df_protein_final.to_csv(f)

After importing, the dataframe looks like this:

[screenshot: the dataframe after import]

pandas.core.series.Series

Note: the numbers in the two images may differ (they can look like different datasets). Please don't get hung up on that; the images are just examples.

Why does a column that was originally an array before exporting turn into a Series after importing?

The code below gives the same result; the original structure is not preserved on export.

from google.colab import files
df.to_csv('filename.csv') 
files.download('filename.csv')
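The behavior described above can be reproduced without Colab at all. The sketch below (a hypothetical dataframe with a column of NumPy arrays, round-tripped through an in-memory CSV) shows that CSV stores only text: the array cells are written as their string representation and come back as plain strings, and the column itself is, as always, a Series.

```python
import numpy as np
import pandas as pd
from io import StringIO

# Hypothetical dataframe: one column whose cells are numpy arrays
df = pd.DataFrame({
    "protein": ["A", "B"],
    "embedding": [np.array([1, 2]), np.array([3, 4])],
})

# Round-trip through CSV (CSV is a plain-text format)
buf = StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
df2 = pd.read_csv(buf)

print(type(df2["embedding"]))          # a pandas Series, as for any column
print(type(df2["embedding"].iloc[0]))  # str: the array structure is lost
```

This is why the dtype appears to "change": the CSV file never contained arrays in the first place, only their textual form.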

Edit: I am looking for a way to keep the original structure (e.g. an array) while exporting.


Answer

Actually, that is how pandas works. When you insert a list or a NumPy array into a pandas DataFrame, the column is always stored as a Series. If you want to turn the Series back into an array or list, use Series.to_numpy(), Series.array, or Series.values.
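A minimal sketch of those conversions, using a throwaway Series in place of a real dataframe column:

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30])  # what a DataFrame column really is

arr = s.to_numpy()  # back to a numpy.ndarray
lst = s.tolist()    # or to a plain Python list

print(type(arr))  # <class 'numpy.ndarray'>
print(type(lst))  # <class 'list'>
```

`to_numpy()` is the recommended modern spelling; `.values` does the same job but is discouraged in newer pandas code.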

Edit:

I got an idea from your comments. You want to save the dataframe to a file while preserving all of its properties. What you are actually asking (intentionally or unintentionally) is how to SERIALIZE the dataframe. You should use pickle for this.

Note: pandas has built-in pickle support, so you can export the dataframe directly to a pickle file:

df.to_pickle(file_name)
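A short round-trip sketch, again using a hypothetical dataframe with array-valued cells and a temporary file path, showing that pickling preserves the structure that CSV loses:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical dataframe whose cells hold numpy arrays
df = pd.DataFrame({"embedding": [np.array([1, 2]), np.array([3, 4])]})

path = os.path.join(tempfile.mkdtemp(), "df.pkl")
df.to_pickle(path)          # serialize: dtypes and cell objects are preserved
df2 = pd.read_pickle(path)  # deserialize

print(type(df2["embedding"].iloc[0]))  # <class 'numpy.ndarray'>
```

Unlike the CSV round trip, the cells come back as real NumPy arrays. The usual caveat applies: only unpickle files you trust, since pickle can execute arbitrary code on load.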