Skip to content
Advertisement

df.drop does not work properly, pandas python

I have this excel file that I would like to “normalize” with the below code by dropping the A and B columns and then dropping the Rows 1,2,3,5 and making the row 4 as header.

enter image description here

df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly")
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
df_weekly = df_weekly.drop(df_weekly.index[[0,1,2,4]])

print(df_weekly.head(8))

But I keep getting this weird df that I dont understand the reason why

Unnamed: 2 Unnamed: 3  ...    Unnamed: 53 Unnamed: 54
3         NaN        NaN  ...            NaN         NaN
5        EU01         10  ...  5400537629432         NaN
6        EU01         10  ...  5400599118677         NaN
7        EU01         10  ...  5400599027382         NaN
8        EU01         10  ...  5400599028730         NaN
9        EU01         10  ...  5400599028761         NaN
10       EU01         10  ...  5400599613554         NaN
11       EU01         10  ...  5400599028945       

Also:

top= df_weekly.columns.values.tolist()
print(top)

['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4', 'Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 8',....]

Advertisement

Answer

You must explicitely set the column names:

df_weekly = pd.read_excel('Run_report.xlsm', sheet_name= "weekly",
                          header=None)
df_weekly = df_weekly.drop(df_weekly.columns[[0,1]], axis=1)
df_weekly.columns = df_weekly.loc[3].rename(None)
df_weekly = df_weekly.drop(range(5))

print(df_weekly.head(8))
User contributions licensed under: CC BY-SA
7 People found this is helpful
Advertisement