Pandas get topmost n records within each group

Suppose I have a pandas DataFrame like this:
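A minimal sketch of such a frame, assuming two integer columns named `id` and `value`. The column name `value` and all the numbers are illustrative assumptions, not the poster's actual data:

```python
import pandas as pd

# Hypothetical sample data: "value" and the numbers below are assumptions
# chosen only to give each id a different number of rows.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 2, 3, 4],
    "value": [1, 2, 3, 1, 2, 3, 4, 1, 1],
})
print(df)
```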

which looks like:

I want to get a new DataFrame with the top 2 records for each id, like this:

I can do it by numbering the records within each group after a groupby:

which looks like:

then for the desired output:

Output:

But is there a more efficient or elegant way to do this? And is there a more elegant way to number the records within each group, like the SQL window function row_number()?
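For the numbering part, one option that may fit is `GroupBy.cumcount()`, which numbers rows within each group starting at 0. The sketch below is not the approach from the question. It assumes the same hypothetical `id`/`value` frame and a helper column named `row_number`, and it numbers rows in their current order:

```python
import pandas as pd

# Hypothetical sample data (column names and values are assumptions).
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 2, 3, 4],
    "value": [1, 2, 3, 1, 2, 3, 4, 1, 1],
})

# cumcount() counts rows within each group from 0, in current row order;
# adding 1 gives 1-based numbers similar to SQL's row_number().
df["row_number"] = df.groupby("id").cumcount() + 1

# Keep the first two rows of every id, then drop the helper column.
top2 = df[df["row_number"] <= 2].drop(columns="row_number")
print(top2)
```

If the numbering should follow a column rather than the existing row order, sorting with `sort_values` before the `groupby` is one way to get that.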

Answer

Did you try

Output generated:

(Keep in mind that you might need to sort the data first, depending on how it is ordered.)
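The sorting caveat can be sketched as follows, assuming the call meant here is `GroupBy.head(n)` and that "top" means the largest `value` per `id`. The deliberately unsorted sample data and the column name `value` are assumptions:

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample data.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 2, 3, 4],
    "value": [2, 3, 1, 4, 1, 3, 2, 1, 1],
})

# Sort so that, within each id, the largest values come first.
ordered = df.sort_values(["id", "value"], ascending=[True, False])

# head(2) then keeps the first two rows of every group in that order.
top2 = ordered.groupby("id").head(2)
print(top2)
```

Without the sort, `head(2)` would simply return the first two rows of each group as they appear in the frame.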

EDIT: As mentioned by the questioner, the MultiIndex can be removed and the results flattened:

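A minimal sketch of the flattening step, assuming the call meant is `reset_index(drop=True)` and using the same hypothetical `id`/`value` frame. In recent pandas versions `GroupBy.head` keeps the original row labels instead of building a MultiIndex, so here the reset just renumbers the rows:

```python
import pandas as pd

# Hypothetical sample data (column names and values are assumptions).
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 2, 3, 4],
    "value": [1, 2, 3, 1, 2, 3, 4, 1, 1],
})

# First two rows per id; the result keeps the original, gappy row labels.
raw = df.groupby("id").head(2)

# drop=True discards the old labels instead of adding them as a column.
flat = raw.reset_index(drop=True)
print(flat)
```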
User contributions licensed under: CC BY-SA