What is as_index in groupby in pandas?

Question

What exactly is the function of as_index in groupby in pandas?

Accepted Answer

print() is your friend when you don't understand something. It often clears up doubts. Take a look:

    import pandas as pd

    df = pd.DataFrame(data={'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
                            'price': [12, 12, 12, 15, 15, 17]})

    print(df)
    print(df.groupby('books', as_index=True).sum())   # group key becomes the index
    print(df.groupby('books', as_index=False).sum())  # group key stays a regular column

Output:

      books  price
    0   bk1     12
    1   bk1     12
    2   bk1     12
    3   bk2     15
    4   bk2     15
    5   bk3     17
           price
    books
    bk1       36
    bk2       30
    bk3       17
      books  price
    0   bk1     36
    1   bk2     30
    2   bk3     17

When as_index=True (the default), the key(s) you use in groupby() become the index of the new dataframe. With as_index=False, the keys stay as ordinary columns and the result gets a default integer index.

The benefits you get when you set the column as the index are:

Speed. When you select rows by index label, e.g. df.loc['bk1'] on the grouped result, the lookup is faster because the index is hashed. pandas doesn't have to traverse the entire books column to find 'bk1'; it computes the hash of 'bk1' and finds the row directly.

Ease. With as_index=True you can write df.loc['bk1'], which is shorter and faster than df.loc[df.books=='bk1'].
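As a small sketch of the two lookup styles compared above, the snippet below reuses the same sample data and reads the total for 'bk1' from both results. The variable names (indexed, flat, roundtrip) are only illustrative, and the last step assumes you want to convert an indexed result back into the flat form, which reset_index() does.

```python
import pandas as pd

df = pd.DataFrame({
    "books": ["bk1", "bk1", "bk1", "bk2", "bk2", "bk3"],
    "price": [12, 12, 12, 15, 15, 17],
})

indexed = df.groupby("books", as_index=True).sum()   # 'books' is the index
flat = df.groupby("books", as_index=False).sum()     # 'books' is a regular column

# Label-based lookup on the index
price_by_label = indexed.loc["bk1", "price"]          # 36

# Boolean-mask lookup on the column (compares every row of 'books')
price_by_mask = flat.loc[flat["books"] == "bk1", "price"].iloc[0]  # 36

# Moving the index back into a column gives the same layout as as_index=False
roundtrip = indexed.reset_index()
```

Which form is more convenient depends on what comes next: the indexed result suits repeated lookups by key, while the flat result is often easier to merge or export.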


Answer