Pandas split group into first and last values

I have multiple dataframes whose Type column starts with a run of 10s and then goes from 0 to 10:

Type  Value
 10  0.7666
 10  0.6566
 10  0.7666
 0   0.7666
 0   0.5446
 1   0.7866
 2   0.7695
 2   0.1642
 3   0.1646
  .....
 9   0.1476
 9   0.4224
 10  0.5446
 10  0.6566

So far I’ve been using this code to group the dataframe by type:

grouped = df.groupby(['Type'])
result = grouped.get_group(10)

It works fine for types 0-9, but I’d also like to split type 10 into 2 groups to distinguish first and last values instead of having it all in a single dataframe like this:

Type  Value
 10  0.7666
 10  0.6566
 10  0.7666
 10  0.5446
 10  0.6566

Answer

Number each consecutive run of equal Type values, rank those runs within each Type, and then select a group with a (Type, run number) tuple:

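# label each consecutive run of equal Type values: 1, 2, 3, ...
g = df['Type'].ne(df['Type'].shift()).cumsum()
# dense-rank the run labels within each Type: the first run of 10 gets 1, the second gets 2
g = g.groupby(df['Type']).rank(method='dense')

grouped = df.groupby(['Type', g])
result = grouped.get_group((10, 1))
print(result)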
   Type   Value
0    10  0.7666
1    10  0.6566
2    10  0.7666

result = grouped.get_group((10, 2))
print(result)
    Type   Value
11    10  0.5446
12    10  0.6566
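
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample dataframe is an assumption: it only contains the rows visible in the question (the elided rows are left out), and the names `run_id`, `occurrence`, `first_tens` and `last_tens` are made up for illustration. The helper series is also renamed so it does not share the name `Type` with the column.

```python
import pandas as pd

# Hypothetical sample data: only the rows shown in the question, elided rows omitted
df = pd.DataFrame({
    "Type":  [10, 10, 10, 0, 0, 1, 2, 2, 3, 9, 9, 10, 10],
    "Value": [0.7666, 0.6566, 0.7666, 0.7666, 0.5446, 0.7866, 0.7695,
              0.1642, 0.1646, 0.1476, 0.4224, 0.5446, 0.6566],
})

# A new run starts wherever Type differs from the previous row;
# the cumulative sum of those change flags labels each run
run_id = df["Type"].ne(df["Type"].shift()).cumsum()

# Within each Type, dense-rank the run labels so the first run is 1, the next is 2, ...
occurrence = (
    run_id.groupby(df["Type"])
    .rank(method="dense")
    .astype(int)
    .rename("occurrence")
)

grouped = df.groupby(["Type", occurrence])
first_tens = grouped.get_group((10, 1))  # leading block of 10s
last_tens = grouped.get_group((10, 2))   # trailing block of 10s

print(first_tens)
print(last_tens)
```

Types 0-9 appear in a single run each in this sample, so they are available as `(type, 1)`, for example `grouped.get_group((0, 1))`.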
User contributions licensed under: CC BY-SA