How to get last group in Pandas’ groupBy?

I wish to get the last group of my group by:

df.groupby(pd.TimeGrouper(freq='M')).groups[-1]

but that gives the error:

KeyError: -1

Using get_group doesn't help, because I don't know the last group's key (unless there's a specific way to get that value?). I might also want to get the last 2 groups, and so on.

How do I do this?

Answer

You can call last, which computes the last values for each group, then use iloc to select rows of that result by position and read the group key from each row's name attribute. There is probably a better way, but I haven't been able to figure it out yet:

In [170]:
import numpy as np
import pandas as pd

# dummy data (column b is random, so its values differ from run to run)
df = pd.DataFrame({'a':['1','2','2','4','5','2'], 'b':np.random.randn(6)})
df
Out[170]:
   a         b
0  1  0.097176
1  2 -1.400536
2  2  0.352093
3  4 -0.696436
4  5 -0.308680
5  2 -0.217767
In [179]:

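# sort=False keeps the groups in order of first appearance
gp = df.groupby('a', sort=False)
# last() has one row per group; the name of its final row is the last group's key
gp.get_group(gp.last().iloc[-1].name)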
Out[179]:
   a         b
4  5  0.608724
In [180]:

df.groupby('a').last().iloc[-2:]
Out[180]:
          b
a          
4  0.390451
5  0.608724
In [181]:

# keys of the last two groups, taken from the index of the per-group result
mult_groups = gp.last().iloc[-2:].index
In [182]:

for gp_val in mult_groups:
    print(gp.get_group(gp_val))
   a         b
3  4  0.390451
   a         b
4  5  0.608724
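
A possibly simpler route, sketched here under assumptions that are not in the original post. `.groups` is a dict keyed by group label, which is why indexing it with -1 raises KeyError. Iterating over the GroupBy object yields (key, sub-frame) pairs in group order, so a list of those pairs can be sliced by position. The data below is made up, `pd.Grouper` stands in for the older `pd.TimeGrouper`, and `freq="MS"` (month start) is used instead of `'M'` only because the month-end alias is spelled differently across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 90 daily rows covering January through March 2021.
idx = pd.date_range("2021-01-01", periods=90, freq="D")
df = pd.DataFrame({"b": np.arange(90)}, index=idx)

# Group the rows by calendar month; "MS" labels each bin by its month start.
grouped = df.groupby(pd.Grouper(freq="MS"))

# Materialise the (key, sub-frame) pairs so they can be indexed by position.
pairs = list(grouped)

# Last group only.
last_key, last_group = pairs[-1]

# Last two groups, still in chronological order.
last_two = pairs[-2:]

print(last_key)
print(last_group.tail())
for key, frame in last_two:
    print(key, len(frame))
```

Building the list walks every group once, so for a very large number of groups it may be cheaper to take only the keys first and then call get_group on the ones you need, as in the answer above.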
User contributions licensed under: CC BY-SA