
Openpyxl to create dataframe with sheet name and specific cell values?

What I need to do:

  • Open Excel Spreadsheet in Python/Pandas
  • Create df with [name, balance]

Example:

name            balance
Jones Ministry  45,408.83
Smith Ministry  38,596.20
Doe Ministry    28,596.20

What I have done so far…


Then…


I viewed all the sheet names by…


And created a dataframe with the ‘name’ column


My spreadsheet looks like this…

[Image: sample spreadsheet]

I now need to loop through each sheet and add the ‘Ending Fund Balance’ label and its corresponding value.

  • The “Ending Fund Balance” line falls on a different row number in each sheet, but it is always the final row. The value is always in column G.

How do I go about doing this?

I have read through examples in:

  • Automate the Boring Stuff
  • Openpyxl documentation
  • PBPython.com examples
  • Stack Overflow questions

I appreciate your help!

Working samples on GitHub: JohnMillstead: Balance_Study


Answer

To get a cell's value rather than its formula, pass data_only=True to load_workbook; otherwise a cell that contains a formula returns the formula text. The last row of a worksheet is available as ws.max_row. Combining the two with the dataframe you already created, apply a function to each worksheet name that reads the last value in column G of that worksheet: wb[x][f'G{wb[x].max_row}'].value.
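The approach described above could be sketched roughly as follows. This is a minimal illustration, not the original answer's code: the helper names `build_sample_workbook` and `ending_balances` are hypothetical, and the in-memory sample workbook only imitates the layout from the question (balance in column G, "Ending Fund Balance" on the final row) so that the example runs without the real spreadsheet.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook


def build_sample_workbook() -> BytesIO:
    """Hypothetical helper: build a small in-memory workbook that mimics the question's layout."""
    wb = Workbook()
    wb.remove(wb.active)  # drop the default empty sheet
    samples = {
        "Jones Ministry": [100.00, 45408.83],
        "Smith Ministry": [250.00, 300.00, 38596.20],
    }
    for sheet_name, values in samples.items():
        ws = wb.create_sheet(sheet_name)
        for row, value in enumerate(values, start=1):
            ws.cell(row=row, column=1, value=f"line {row}")
            ws.cell(row=row, column=7, value=value)  # column 7 is column G
        # Relabel the final row, as in the question's spreadsheet.
        ws.cell(row=len(values), column=1, value="Ending Fund Balance")
    buffer = BytesIO()
    wb.save(buffer)
    buffer.seek(0)
    return buffer


def ending_balances(source) -> pd.DataFrame:
    """Hypothetical helper: one row per sheet, holding the value in column G of the last row."""
    # data_only=True returns cached values instead of formula text.
    wb = load_workbook(source, data_only=True)
    df = pd.DataFrame({"name": wb.sheetnames})
    df["balance"] = df["name"].apply(
        lambda name: wb[name][f"G{wb[name].max_row}"].value
    )
    return df


df = ending_balances(build_sample_workbook())
print(df)
```

With a real file, `ending_balances("your_file.xlsx")` would replace the in-memory buffer (the file name here is a placeholder). One caveat under the same assumptions: `max_row` is the last row that openpyxl knows about, so if a sheet has trailing rows below "Ending Fund Balance", this reads the wrong cell and it may be safer to scan upward for the label instead.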


Output from df

User contributions licensed under: CC BY-SA