I have tried the following code and it works; however, the output shows extra columns that I don’t need:
import pandas as pd
df = pd.read_csv("data.csv")
df = df.groupby(['City1', 'City2']).sum('PassengerTrips')
df['Vacancy'] = 1-df['PassengerTrips'] / df['Seats']
df = df.groupby(['City1','City2']).max('Vacancy')
df = df.sort_values('Vacancy', ascending =False)
print('The 10 routes with the highest proportion of vacant seats:')
print(df[:11])
I tried adding the following line after sorting by Vacancy, but it gives me an error:
df = df[['City1', 'City2', 'Vacancy']]
Answer
City1 and City2 are in the index because you applied a groupby on them. You can move them back into columns using reset_index to get the expected result:
df = df.reset_index(drop=False)
df = df[['City1', 'City2', 'Vacancy']]
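To see why the column selection fails before reset_index, here is a minimal sketch on a small made-up DataFrame. The sample values and the names raw, grouped and result are assumptions for illustration only; they are not the asker's data.csv.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
raw = pd.DataFrame({
    "City1": ["A", "A", "B"],
    "City2": ["X", "X", "Y"],
    "PassengerTrips": [30, 50, 20],
    "Seats": [50, 50, 80],
})

# groupby moves the grouping keys into the index of the result
grouped = raw.groupby(["City1", "City2"]).sum()
grouped["Vacancy"] = 1 - grouped["PassengerTrips"] / grouped["Seats"]

print(list(grouped.columns))      # City1 and City2 are not among the columns
print(list(grouped.index.names))  # they are index levels instead

# reset_index turns the index levels back into ordinary columns,
# so selecting them by name no longer raises a KeyError
result = grouped.reset_index()[["City1", "City2", "Vacancy"]]
print(result)
```

Selecting grouped[['City1', 'City2', 'Vacancy']] directly would raise a KeyError here, because only PassengerTrips, Seats and Vacancy are columns at that point.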
Or, if you want to keep City1 and City2 in the index, you can do as @Corralien said in his comment: df = df['Vacancy']
You can even use df = df['Vacancy'].to_frame() to get a DataFrame instead of a Series.
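As a rough illustration of the difference between the two index-preserving options, the sketch below builds a small grouped-style DataFrame by hand. The values and the names as_series, as_frame and also_frame are assumptions, not taken from the question.

```python
import pandas as pd

# Hypothetical grouped result with City1/City2 as a MultiIndex
idx = pd.MultiIndex.from_tuples(
    [("A", "X"), ("B", "Y")], names=["City1", "City2"]
)
df = pd.DataFrame({"Seats": [100, 80], "Vacancy": [0.2, 0.75]}, index=idx)

as_series = df["Vacancy"]            # single brackets give a Series
as_frame = df["Vacancy"].to_frame()  # one-column DataFrame, same index
also_frame = df[["Vacancy"]]         # double brackets also give a DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)

# The city names stay visible as index labels when printing
print(as_frame.sort_values("Vacancy", ascending=False).head(10))
```

Both forms keep City1 and City2 as index labels, so the route names still appear in the printed output without the extra columns.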