I have this excel file that I would like to “normalize” with the below code by dropping the A and B columns and then dropping the Rows 1,2,3,5 and making the row 4 as header.
JavaScript
x
6
1
df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly")
2
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
3
df_weekly = df_weekly.drop(df_weekly.index[[0,1,2,4]])
4
5
print(df_weekly.head(8))
6
But I keep getting this weird df that I dont understand the reason why
JavaScript
1
10
10
1
Unnamed: 2 Unnamed: 3 Unnamed: 53 Unnamed: 54
2
3 NaN NaN NaN NaN
3
5 EU01 10 5400537629432 NaN
4
6 EU01 10 5400599118677 NaN
5
7 EU01 10 5400599027382 NaN
6
8 EU01 10 5400599028730 NaN
7
9 EU01 10 5400599028761 NaN
8
10 EU01 10 5400599613554 NaN
9
11 EU01 10 5400599028945
10
Also:
JavaScript
1
5
1
top= df_weekly.columns.values.tolist()
2
print(top)
3
4
['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8', .]
5
Advertisement
Answer
You must explicitely set the column names:
JavaScript
1
8
1
df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly",
2
header=None)
3
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
4
df_weekly.columns = df_weekly.loc[3].rename(None)
5
df_weekly = df_weekly.drop(range(5))
6
7
print(df_weekly.head(8))
8