I have tried the following code. It works, but the output includes extra columns that I don't need:
import pandas as pd
df = pd.read_csv("data.csv")
df = df.groupby(['City1', 'City2']).sum('PassengerTrips')
df['Vacancy'] = 1 - df['PassengerTrips'] / df['Seats']
df = df.groupby(['City1', 'City2']).max('Vacancy')
df = df.sort_values('Vacancy', ascending=False)
print('The 10 routes with the highest proportion of vacant seats:')
print(df[:11])
I tried adding the following line after sorting by Vacancy, but it raises an error:
df = df[['City1', 'City2', 'Vacancy']]
Answer
City1 and City2 are in the index because you applied a groupby on them, so they can no longer be selected as columns. You can move them back into columns with reset_index to get the expected result:
df = df.reset_index(drop=False)
df = df[['City1', 'City2', 'Vacancy']]
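To see why the column selection fails and how reset_index fixes it, here is a minimal sketch. The small DataFrame is made-up sample data standing in for data.csv (the real file's contents are not shown in the question), and the names grouped and result are only illustrative.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
df = pd.DataFrame({
    "City1": ["A", "A", "B"],
    "City2": ["B", "B", "C"],
    "PassengerTrips": [50, 30, 20],
    "Seats": [100, 100, 80],
})

grouped = df.groupby(["City1", "City2"]).sum()

# After groupby, the grouping keys are index levels, not columns
print(list(grouped.index.names))  # ['City1', 'City2']
print(list(grouped.columns))      # ['PassengerTrips', 'Seats']

grouped["Vacancy"] = 1 - grouped["PassengerTrips"] / grouped["Seats"]

# reset_index moves the index levels back into ordinary columns,
# so they can be selected by name again
result = grouped.reset_index()[["City1", "City2", "Vacancy"]]
print(result)
```

Another option, if the keys are never needed in the index, is to pass as_index=False to groupby so they stay as columns from the start.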
Or, if you want to keep City1 and City2 in the index, you can do as @Corralien said in his comment: df = df['Vacancy'], or even df = df['Vacancy'].to_frame() to get a DataFrame instead of a Series.
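The difference between the two index-preserving variants can be sketched as follows. The grouped frame here is built by hand with made-up values, purely as an assumption for illustration; in the question it would come from the groupby steps.

```python
import pandas as pd

# Hypothetical grouped result with City1/City2 as index levels
grouped = pd.DataFrame(
    {"Seats": [200, 80], "Vacancy": [0.6, 0.75]},
    index=pd.MultiIndex.from_tuples(
        [("A", "B"), ("B", "C")], names=["City1", "City2"]
    ),
)

as_series = grouped["Vacancy"]            # Series, index kept
as_frame = grouped["Vacancy"].to_frame()  # one-column DataFrame, index kept

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame

# Sorting and slicing work the same way on the one-column frame
print(as_frame.sort_values("Vacancy", ascending=False).head(10))
```

Selecting with double brackets, grouped[['Vacancy']], also yields a one-column DataFrame with the index intact.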