How can I show only some columns using Python Pandas?

I have tried the following code, and it works, but it shows extra columns that I don't need. The output (originally attached as a screenshot) includes those extra columns.

import pandas as pd
df = pd.read_csv("data.csv")
df = df.groupby(['City1', 'City2']).sum('PassengerTrips')
df['Vacancy'] = 1-df['PassengerTrips'] / df['Seats']
df = df.groupby(['City1','City2']).max('Vacancy')
df = df.sort_values('Vacancy', ascending =False)
print('The 10 routes with the highest proportion of vacant seats:')
print(df[:10])

I tried adding the following line after sorting by vacancy, but it raises an error:

df = df[['City1', 'City2', 'Vacancy']]

Answer

City1 and City2 are in the index rather than the columns, because you grouped by them. That is why selecting them as columns raises an error. You can move them back into columns with reset_index to get the expected result:

df = df.reset_index(drop=False)
df = df[['City1', 'City2', 'Vacancy']]
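
To make the index-versus-columns point concrete, here is a minimal sketch. The small DataFrame is made-up sample data standing in for data.csv (its values are an assumption, not the asker's data), and the variable names `raw`, `grouped`, `flat` and `result` are only for illustration:

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv (values are invented).
raw = pd.DataFrame({
    "City1": ["A", "A", "B", "B"],
    "City2": ["X", "X", "Y", "Z"],
    "PassengerTrips": [50, 50, 25, 75],
    "Seats": [100, 100, 100, 100],
})

# Grouping moves City1 and City2 out of the columns and into the index.
grouped = raw.groupby(["City1", "City2"]).sum(numeric_only=True)
grouped["Vacancy"] = 1 - grouped["PassengerTrips"] / grouped["Seats"]

index_names = list(grouped.index.names)   # the group keys live here
column_names = list(grouped.columns)      # City1/City2 are absent here

# reset_index turns the index levels back into ordinary columns,
# after which selecting them by name works.
flat = grouped.reset_index()
result = flat[["City1", "City2", "Vacancy"]].sort_values("Vacancy", ascending=False)

print(index_names)
print(column_names)
print(result)
```

Printing `index_names` and `column_names` before calling reset_index is a quick way to check where the group keys ended up.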

Or, if you want to keep City1 and City2 in the index, you can do as @Corralien suggested in a comment: df = df['Vacancy']

You can also use df = df['Vacancy'].to_frame() to get a DataFrame instead of a Series.
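
The difference between the two index-preserving options can be sketched as follows. The grouped frame is built by hand with invented values so the example is self-contained, and the double-bracket selection at the end is an equivalent alternative that the answer itself does not mention:

```python
import pandas as pd

# Hypothetical grouped result with City1/City2 as a MultiIndex (values invented).
grouped = pd.DataFrame(
    {"PassengerTrips": [100, 25], "Seats": [200, 100], "Vacancy": [0.5, 0.75]},
    index=pd.MultiIndex.from_tuples(
        [("A", "X"), ("B", "Y")], names=["City1", "City2"]
    ),
)

# Single brackets return a Series; the index still carries City1/City2.
as_series = grouped["Vacancy"]

# to_frame() wraps that Series in a one-column DataFrame.
as_frame = grouped["Vacancy"].to_frame()

# Double brackets select a list of columns and give a DataFrame directly.
also_frame = grouped[["Vacancy"]]

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame)
```

In all three cases City1 and City2 remain visible when printing, because they are part of the index.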

User contributions licensed under: CC BY-SA