I have a dataframe of about 750 rows, which is obviously too many for a traditional bar chart. What I would like to do is display the top 5 entries, one specific entry that I'm interested in, and the bottom 5 entries.
The dataframe is something like this:
    Animal  Score
0      Dog      1
1      Pig      2
2  Chicken      3
3      Cat      4
4      Fox      5
I want the bar graph to show the top 5 entries (where a score of 1 is highest: Dog, Pig, Chicken, Cat, Fox), then a specific entry from the list (Beetle), then the bottom 5 entries (Tiger, Lion, Zebra, Gorilla, Seahorse), each with its corresponding score.
Is this even possible?
Thanks in advance for your help!
Update: The original answer gave me additional entries on the right side of the graph, no matter how many times I modified the concat call. I suspected the 750 entries in my df, but the actual cause was NaN values, which were showing up as blank entries. I have since accounted for them.
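For anyone hitting the same NaN problem, here is a minimal sketch of one way to filter those rows out before plotting. The sample data is hypothetical, and it assumes the missing values sit in the 'Animal' or 'Score' columns; adjust the `subset` list if yours are elsewhere.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing label and a missing score
df = pd.DataFrame({
    'Animal': ['Dog', 'Pig', None, 'Cat', 'Fox'],
    'Score': [1, 2, 3, np.nan, 5],
})

# Keep only rows where both the label and the score are present
clean_df = df.dropna(subset=['Animal', 'Score']).reset_index(drop=True)
```

Slicing the top and bottom rows from `clean_df` instead of `df` should keep the blank entries out of the chart.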
Answer
# Importing Dependencies
import pandas as pd
import matplotlib.pyplot as plt
# Creating DataFrame
data = {'Animal': ['Dog', 'Pig', 'Chicken','Cat','Fox','Hawk','Dolphin','Squirrel','Lizard','Shark','Ant','Beetle','Termite',
'Kangaroo','Opossum','Whale','Jellyfish','Seahorse','Gorilla','Zebra','Lion','Tiger'], 'Score': list(range(1, 23))}
df = pd.DataFrame(data)
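# Add custom entries here (the question only needs 'Beetle'; 'Ant' shows that several can be listed)
custom_df = df.loc[df['Animal'].isin(['Beetle', 'Ant'])]
# Create new df from the first 5 rows, the last 5 rows, and the custom rows
# (this assumes df is already sorted by Score)
# drop_duplicates avoids a repeated row when a custom entry is also in the top or bottom 5
new_df = pd.concat([df[:5], df[-5:], custom_df]).drop_duplicates()
new_df = new_df.sort_values(by='Score').reset_index(drop=True)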
# Creating the bar plot
plt.bar(new_df['Animal'], new_df['Score'])
plt.xticks(rotation=90)
plt.xlabel("Animals")
plt.ylabel("Score")
plt.show()
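
The slicing above relies on the dataframe already being sorted by Score. If the real 750-row data is unsorted or contains NaN values, a variation using pandas' `nsmallest` and `nlargest` may be more robust. This is only a sketch: the helper name `select_rows` and its parameters are made up for illustration, and it assumes a score of 1 is the best rank, as in the question.

```python
import pandas as pd

def select_rows(df, custom, n=5, score_col='Score', name_col='Animal'):
    """Return the n best rows, the custom rows, and the n worst rows, sorted by score."""
    # Drop rows with a missing label or score so they do not become blank bars
    clean = df.dropna(subset=[name_col, score_col])
    top = clean.nsmallest(n, score_col)      # lowest scores, i.e. rank 1 first
    bottom = clean.nlargest(n, score_col)    # highest scores, i.e. worst ranks
    picked = clean[clean[name_col].isin(custom)]
    out = pd.concat([top, picked, bottom])
    # Remove repeats in case a custom entry is already in the top or bottom n
    out = out[~out.index.duplicated()]
    return out.sort_values(score_col).reset_index(drop=True)

animals = ['Dog', 'Pig', 'Chicken', 'Cat', 'Fox', 'Hawk', 'Dolphin', 'Squirrel',
           'Lizard', 'Shark', 'Ant', 'Beetle', 'Termite', 'Kangaroo', 'Opossum',
           'Whale', 'Jellyfish', 'Seahorse', 'Gorilla', 'Zebra', 'Lion', 'Tiger']
# Shuffle the rows to show that the helper does not depend on the input order
df = pd.DataFrame({'Animal': animals, 'Score': list(range(1, 23))})
df = df.sample(frac=1, random_state=0)

subset = select_rows(df, ['Beetle'])
```

`subset` can then be passed to `plt.bar(subset['Animal'], subset['Score'])` in the same way as `new_df` above.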