Say I have the following Excel file:
       A      B   C
    0  -      -   -
    1  Start  -   -
    2  3      2   4
    3  7      8   4
    4  11     2   17
I want to read the file into a dataframe, starting below the row that contains the Start value.
Attention: the Start value is not always located in the same row, so if I were to use:
    import pandas as pd

    xls = pd.ExcelFile(r'C:\Users\MyFolder\MyFile.xlsx')
    df = xls.parse('Sheet1', skiprows=4, index_col=None)
this would fail, because skiprows needs a fixed row count. Is there any workaround so that xls.parse locates the row by its string value instead of by a row number?
Answer
    import pandas as pd

    # Read the whole sheet first; the marker row is located afterwards.
    df = pd.read_excel('your/path/filename')
The loop below finds the position of the row that contains 'Start' in df:
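    row_start = None
    for row in range(df.shape[0]):
        for col in range(df.shape[1]):
            # The comparison is case-sensitive: the sheet holds 'Start'.
            if df.iat[row, col] == 'Start':
                row_start = row
                break
        if row_start is not None:
            break  # stop scanning once the first match is found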
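The same search can also be written without explicit loops. The following is only a sketch: the in-memory frame stands in for the sheet from the question, and the helper name find_marker_row is made up for illustration and is not part of pandas.

```python
import pandas as pd

# In-memory stand-in for the sheet in the question (assumed layout).
df = pd.DataFrame({
    "A": ["-", "Start", 3, 7, 11],
    "B": ["-", "-", 2, 8, 2],
    "C": ["-", "-", 4, 4, 17],
})

def find_marker_row(frame, marker="Start"):
    """Return the position of the first row containing `marker`, or None."""
    # True for every row in which at least one cell equals the marker.
    hits = frame.eq(marker).any(axis=1).to_numpy()
    # argmax returns the position of the first True value.
    return int(hits.argmax()) if hits.any() else None

row_start = find_marker_row(df)
```

Returning None when the marker is missing makes it possible to detect that case before slicing.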
Once you have row_start, you can slice the dataframe from that row onward:
    # row_start is a position (it came from iat), so slice by position.
    df_required = df.iloc[row_start:]
And if you don't need the row containing 'Start', just increment row_start by 1:
    df_required = df.iloc[row_start + 1:]
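Putting the steps together, one possible end-to-end sketch looks like this. The function name rows_below_marker and the sample frame are assumptions for illustration. Resetting the index and converting the columns to numbers are optional extras that the answer above does not perform.

```python
import pandas as pd

def rows_below_marker(frame, marker="Start"):
    """Return the rows located after the first row containing `marker`."""
    hits = frame.eq(marker).any(axis=1).to_numpy()
    if not hits.any():
        raise ValueError(f"marker {marker!r} not found")
    start = int(hits.argmax())  # position of the first matching row
    # Skip the marker row itself and renumber the remaining rows from 0.
    below = frame.iloc[start + 1:].reset_index(drop=True)
    # The '-' placeholders force object dtype; convert the data back to numbers.
    return below.apply(pd.to_numeric, errors="coerce")

# In-memory stand-in for the sheet in the question (assumed layout).
sample = pd.DataFrame({
    "A": ["-", "Start", 3, 7, 11],
    "B": ["-", "-", 2, 8, 2],
    "C": ["-", "-", 4, 4, 17],
})

df_required = rows_below_marker(sample)
```

For a real workbook, the frame would come from pd.read_excel as in the answer; the search itself does not depend on where the Start row sits.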