I’m a beginner and I’m sorry if this is completely wrong. So far, I’ve been able to present the required fields (author, subreddit, date created, number of comments, score, submission title, submission description) and save them into a dataframe. But I get lost once the more complicated questions begin, such as this one and which day of the week has the most submissions. This is what I have right now for getting the submission with the highest score:
subreddit = pd.read_csv('subreddit.csv', delimiter = ',')
subreddit.count()
score = "score"
h_score = subreddit.score.max()
best_submission = subreddit.score(h_score) #it comes out as TypeError: 'Series' object is not callable here
bsubmission_title = title[best_submission]
print("Submission with the highest score:", bsubmission_title)
Answer
subreddit.score.max() returns the highest value in the score column. But you want the title that is on the same row as that score. For that you do not need the score value itself, but the index of the row with the highest score. The TypeError appears because subreddit.score(h_score) tries to call the score column (a Series) as if it were a function. You can get the index with idxmax, and then use it to get the matching title:
h_score_index = subreddit.score.idxmax()
bsubmission_title = subreddit.title[h_score_index]
print("Submission with the highest score:", bsubmission_title)
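If you want to try this without the CSV file, here is a minimal self-contained sketch. The sample rows are made up for illustration and only stand in for subreddit.csv; the column names title and score are taken from the question. It uses .loc to look up the title by the index label that idxmax returns.

```python
import pandas as pd

# Hypothetical sample data standing in for subreddit.csv
subreddit = pd.DataFrame({
    "title": ["First post", "Second post", "Third post"],
    "score": [12, 87, 40],
})

# Highest value in the score column
h_score = subreddit["score"].max()

# Index label of the row that holds the highest score
h_score_index = subreddit["score"].idxmax()

# Look up the title on that same row
best_title = subreddit.loc[h_score_index, "title"]

print("Submission with the highest score:", best_title)
```

Note that if several submissions share the highest score, idxmax returns the index of the first one.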