I want to read a .xlsx file using the Pandas Library of python and port the data to a postgreSQL table.
All I could do up until now is:
import pandas as pd data = pd.ExcelFile("*File Name*")
Now I know that the step got executed successfully, but I want to know how i can parse the excel file that has been read so that I can understand how the data in the excel maps to the data in the variable data.
I learnt that data is a Dataframe object if I’m not wrong. So How do i parse this dataframe object to extract each line row by row.
Advertisement
Answer
I usually create a dictionary containing a DataFrame
for every sheet:
xl_file = pd.ExcelFile(file_name) dfs = {sheet_name: xl_file.parse(sheet_name) for sheet_name in xl_file.sheet_names}
Update: In pandas version 0.21.0+ you will get this behavior more cleanly by passing sheet_name=None
to read_excel
:
dfs = pd.read_excel(file_name, sheet_name=None)
In 0.20 and prior, this was sheetname
rather than sheet_name
(this is now deprecated in favor of the above):
dfs = pd.read_excel(file_name, sheetname=None)