
Faster way to read Excel files to pandas dataframe

I have a 14 MB Excel file with five worksheets that I'm reading into a pandas DataFrame, and although the code below works, it takes 9 minutes!

Does anyone have suggestions for speeding it up?



Answer

As others have suggested, reading CSV is faster. So if you are on Windows and have Excel installed, you could call a VBScript to convert the Excel file to CSV and then read the CSV. I tried the script below and it took about 30 seconds.
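The convert-then-read flow described above could be sketched roughly as follows. This is not the original script: it assumes Windows with Excel installed, a converter named `ExcelToCsv.vbs` that takes a source workbook, a destination CSV, and a 1-based worksheet number (that argument order is an assumption), and the helper names are hypothetical.

```python
import subprocess


def build_convert_command(vbs_path, xlsx_path, csv_path, sheet_number):
    """Build the cscript.exe command line for one worksheet.

    The argument order (source, destination, sheet number) is an assumption
    about how ExcelToCsv.vbs parses its arguments.
    """
    return [
        "cscript.exe",
        "//NoLogo",  # suppress the Windows Script Host banner
        str(vbs_path),
        str(xlsx_path),
        str(csv_path),
        str(sheet_number),
    ]


def read_sheet_via_csv(vbs_path, xlsx_path, csv_path, sheet_number=1):
    """Convert one worksheet to CSV with Excel, then load it with pandas."""
    import pandas as pd  # imported lazily so the helper above needs no pandas

    command = build_convert_command(vbs_path, xlsx_path, csv_path, sheet_number)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return pd.read_csv(csv_path)
```

For a five-sheet workbook like the one in the question, one would call `read_sheet_via_csv` once per sheet number from 1 to 5, writing each sheet to its own CSV file. Passing absolute paths is safer, since Excel does not necessarily share the working directory of the calling process.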


Here's a little snippet of Python to create the ExcelToCsv.vbs script:


This answer benefited from "Convert XLS to CSV on command line" and "csv & xlsx files import to pandas data frame: speed issue".
