How can I read a range ('A5:B10') and place these values into a DataFrame using openpyxl?

Being able to define ranges the way Excel does, e.g. 'A5:B10', is important for what I need, so reading the entire sheet into a DataFrame isn't very useful.

So what I need to do is read the values from multiple ranges in the Excel sheet to multiple different dataframes.

I have searched, but either I have done a very poor job of searching or everyone else has gotten around this problem, and I really can't.

Thanks.

Answer

Using openpyxl

Since you indicated that you are looking for a user-friendly way to specify the range (like the Excel syntax), and as Charlie Clark already suggested, you can use openpyxl.

A small utility function can take a worksheet and a column/row range and return a pandas DataFrame.
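As a minimal sketch of such a helper, assuming openpyxl 2.6 or later (for `values_only`) and pandas are installed: `range_to_dataframe` is a hypothetical name, and the in-memory workbook with made-up numbers stands in for a real file that you would open with `openpyxl.load_workbook`.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils import get_column_letter, range_boundaries


def range_to_dataframe(ws, cell_range):
    """Return the values of an Excel-style range (e.g. 'A5:B10') as a DataFrame.

    Hypothetical helper: columns are labeled with the sheet's column letters
    and the index carries the sheet's row numbers.
    """
    # range_boundaries('A5:B10') gives 1-based (min_col, min_row, max_col, max_row)
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    rows = ws.iter_rows(
        min_row=min_row, max_row=max_row,
        min_col=min_col, max_col=max_col,
        values_only=True,  # yield plain values instead of Cell objects
    )
    columns = [get_column_letter(i) for i in range(min_col, max_col + 1)]
    index = range(min_row, max_row + 1)
    return pd.DataFrame(list(rows), columns=columns, index=index)


# Demo data (made up): 12 rows, 3 columns, built in memory.
wb = Workbook()
ws = wb.active
for r in range(1, 13):
    ws.append([r, r * 10, r * 100])

df = range_to_dataframe(ws, "A5:B10")
print(df)
```

Calling the helper once per range gives one DataFrame per range, for example `range_to_dataframe(ws, "A5:B10")` and `range_to_dataframe(ws, "B2:C3")` on the same worksheet.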

Pandas-only solution

Given data in an Excel sheet, you can load all of it with the following command: pd.read_excel('excel-sheet.xlsx')

To limit the data being read, pandas.read_excel offers a number of options. Use usecols, skiprows and skipfooter (named parse_cols and skip_footer in older pandas releases) to select the specific subset that you want to load.
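A rough sketch of reading only the block A5:B10 this way, assuming a recent pandas with openpyxl available as the .xlsx engine. `read_block` is a hypothetical wrapper, the workbook contents are made up, and `nrows` is swapped in for the footer option so the total sheet length does not need to be known.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook


def read_block(path, columns, first_row, last_row, sheet_name=0):
    """Read rows first_row..last_row (1-based, inclusive) of the given columns."""
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        header=None,                      # treat every row of the block as data
        usecols=columns,                  # Excel-style letters, e.g. "A:B"
        skiprows=first_row - 1,           # skip everything above the block
        nrows=last_row - first_row + 1,   # stop after the last row of the block
    )


# Demo with made-up data: 12 rows, 3 columns, written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "excel-sheet.xlsx")
    wb = Workbook()
    for r in range(1, 13):
        wb.active.append([r, r * 10, r * 100])
    wb.save(path)
    df = read_block(path, "A:B", 5, 10)  # the block A5:B10

print(df)
```

Because `header=None` is passed, the resulting columns are labeled by position rather than by a header row; pass `names=[...]` to `read_excel` if you want your own labels.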

Some notes:

The read_excel API is not meant to support more complex selections. If you need a complex filter, it is much easier (and cleaner) to load all the data into a DataFrame and use the excellent slicing and indexing mechanisms provided by pandas.
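One way to combine that advice with Excel-style ranges is sketched below. It assumes the sheet was loaded without a header (for example with `header=None`) so that sheet row 1 and column A sit at position 0 of the DataFrame. `slice_range` is a hypothetical helper, and the DataFrame built here is a made-up stand-in for the loaded sheet.

```python
import pandas as pd
from openpyxl.utils import range_boundaries


def slice_range(df, cell_range):
    """Cut an Excel-style range such as 'A5:B10' out of a header-less DataFrame."""
    # 1-based, inclusive sheet coordinates
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    # iloc is 0-based with an exclusive end, hence the -1 on the start only
    return df.iloc[min_row - 1:max_row, min_col - 1:max_col]


# Stand-in for pd.read_excel('excel-sheet.xlsx', header=None)
sheet = pd.DataFrame([[r, r * 10, r * 100] for r in range(1, 13)])

first = slice_range(sheet, "A5:B10")
second = slice_range(sheet, "B2:C3")
print(first)
print(second)
```

Loading once and slicing several times keeps the file I/O to a single read when many ranges come from the same sheet.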

User contributions licensed under: CC BY-SA