
How can I find the top three highest rows based on a column in a csv?

I’m trying to iterate through each line of a csv and bring back the top 3 highscores. There are only 2 columns, one named ‘users’ and the other ‘highscores’. I know what I have so far isn’t much, but I’m completely stumped. I feel like I could get the highest score by storing a value, iterating over each line, and replacing the stored value whenever a row’s score is higher, but I’m not sure what to do if I want the top three lines.

This is how I began:

import csv
a = open('highscoreslist.csv')
spreadsheet = csv.DictReader(a)
names = []
scores = []

for row in spreadsheet:
  names.append(row['users'])
  scores.append(row['highscores'])
  

And now I just don’t know what direction to take. I was going to put them in two lists and then find the highest that way, but they’re already in a dictionary so that may be pointless. I’m also trying to learn this concept, so I would prefer not to do it in Pandas.


Answer

This answer illustrates a Python solution to your problem and how things roll with Python.

# csv fields are read as strings, so int() makes the scores compare numerically
sorted([(row['users'], int(row['highscores'])) for row in csv.DictReader(a)], key=lambda t: t[1], reverse=True)[:3]

You are asking for the three highest scores, and presumably you also want to know which user earned each one; otherwise this becomes trivially easy.

Given your code, the primary issue you are having is that you have placed names and scores in two independent data structures.

So get the data into the same data structure. For that, you can loop over the dict rows of the CSV from the DictReader and change those dicts into tuples. And don’t use an explicit for loop; use a list comprehension.

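[(row['users'], int(row['highscores'])) for row in csv.DictReader(a)]

You will get a list of (user, score) tuples. The csv module reads every field as a string, so int() converts each score to a number; otherwise '9' would sort above '10'. Using numbers as stand-ins for the users too, the list has the shape [(1,2),(6,5),(6,7)] etc.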
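
As a minimal sketch of that step, here is the comprehension run against a small in-memory CSV. The sample rows and the name `sample` are made up for illustration and stand in for the opened highscoreslist.csv; it also assumes every score is a whole number, so int() is safe.

```python
import csv
import io

# Hypothetical sample data standing in for the real highscoreslist.csv
sample = io.StringIO("users,highscores\nann,12\nbob,7\ncy,103\n")

# One (user, score) tuple per row; int() turns the score text into a number
pairs = [(row['users'], int(row['highscores'])) for row in csv.DictReader(sample)]
print(pairs)  # [('ann', 12), ('bob', 7), ('cy', 103)]
```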

Then use a ‘lambda’: a small anonymous function that can be passed into other functions, like sorted().

sorted() works on a collection that is passed in, such as the above list of tuples. And, it takes some very useful arguments, not the least of which is a key to sort by. So, we can get fancy and use the lambda to specify which key we wish sorting to occur by.

In sorted([(1,2),(6,5),(6,7)], key=lambda t: t[1]), the lambda t: t[1] says to return the second element of each tuple (the user’s high score). So sorted() sorts by high score.
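
To make the key argument concrete, this small sketch sorts the example list twice: once with the lambda and once with an ordinary named function that does the same thing. The helper name `second_item` is hypothetical and only there to show that the lambda is a function like any other.

```python
pairs = [(1, 2), (6, 5), (6, 7)]

def second_item(t):
    # Same job as `lambda t: t[1]`: pick the score out of a (user, score) tuple
    return t[1]

by_lambda = sorted(pairs, key=lambda t: t[1], reverse=True)
by_named = sorted(pairs, key=second_item, reverse=True)

print(by_lambda)  # [(6, 7), (6, 5), (1, 2)]
print(by_lambda == by_named)  # True
```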

Then you slice the result with list slicing, another super cool Python thing. [:3] says give me the first three elements, and because you sorted from high to low with sorted(..., reverse=True), those are the top three scores and their users.
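
Putting the steps together, one possible sketch looks like this. The function name `top_three` and the sample rows are assumptions for illustration, and the scores are assumed to be whole numbers. The sample includes 103 and 9 on purpose: compared as strings, '9' would sort above '103', which is why the scores are converted with int().

```python
import csv
import io

def top_three(csv_file):
    """Return the three (user, score) tuples with the highest scores."""
    pairs = [(row['users'], int(row['highscores'])) for row in csv.DictReader(csv_file)]
    # Sort by score, highest first, then keep the first three tuples
    return sorted(pairs, key=lambda t: t[1], reverse=True)[:3]

# Hypothetical sample data standing in for highscoreslist.csv
sample = io.StringIO("users,highscores\nann,12\nbob,7\ncy,103\ndee,9\n")
print(top_three(sample))  # [('cy', 103), ('ann', 12), ('dee', 9)]
```

With the real file, the same function could be called as top_three(a) on the file object opened in the question.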

You can then access the resulting list of tuples to show the high scores of a game you are hoping to make/automate!

User contributions licensed under: CC BY-SA