I am new to data visualization with pandas (Python). My task is to create a stacked chart of the median WeekHrs and CodeRevHrs for the age group 30 to 35.
Below is the code I used to extract the data by filtering on the age column, followed by the first five rows of the filtered dataset.
age_filter = agework[(agework["age"] >= 30) & (agework["age"] <= 35)]
median_weekhrs = age_filter["Weekhrs"].median()
median_coderev = age_filter["CodeRevHrs"].median()
age_filter.head()
    CodeRevHrs  Weekhrs   age
5          3.0      8.0  31.0
11         2.0     40.0  34.0
12         2.0     40.0  32.0
18        15.0     42.0  34.0
22         2.0     40.0  33.0
How can I plot a stacked bar chart with a median?
Please help
Answer
First, to filter for age (and also convert age to int, as it makes for cleaner labels):
df = agework.query('30 <= age <= 35').copy()  # copy, so the assignment below does not warn
df['age'] = df['age'].astype(int)
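As a quick check, the query-based filter selects the same rows as the boolean mask in the question. The following is a minimal sketch, assuming a frame built from the five rows shown in the question plus one made-up row (index 30, age 41) so that the filter has something to drop. The .copy() call is an addition here; it makes the filtered frame independent so the later assignment to df['age'] does not raise SettingWithCopyWarning.

```python
import pandas as pd

# The five rows shown in the question, plus one hypothetical row (index 30)
# outside the 30-35 range, added only so the filter has something to drop.
agework = pd.DataFrame(
    {
        "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0, 4.0],
        "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0, 45.0],
        "age": [31.0, 34.0, 32.0, 34.0, 33.0, 41.0],
    },
    index=[5, 11, 12, 18, 22, 30],
)

# Filter with query; .copy() yields an independent frame.
df = agework.query("30 <= age <= 35").copy()
df["age"] = df["age"].astype(int)

# The same selection written as a boolean mask, as in the question.
mask = (agework["age"] >= 30) & (agework["age"] <= 35)
same_rows = df.index.equals(agework[mask].index)
```

Here same_rows is True and the made-up row with age 41 is gone, so either way of filtering can be used.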
Then, you could plot a bar chart of the median of the two quantities in each age group:
import matplotlib.pyplot as plt
df.groupby('age').median().plot.bar(stacked=True)
plt.title('Median hours, by age')
BTW, you can impose an arbitrary order in how the values are stacked. For example, if you’d rather have 'Weekhrs' at the bottom, you can say:
order = ['Weekhrs', 'CodeRevHrs']
df.groupby('age')[order].median().plot.bar(stacked=True)
plt.title('Median hours, by age')
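To see what the stacked bars are built from, it can help to look at the grouped table before plotting it. This is a sketch that assumes only the five rows shown in the question (the real dataset will give different numbers); each row of by_age becomes one bar, and the column order is the stacking order from the bottom up.

```python
import pandas as pd

# Only the five rows shown in the question; the full dataset is not available.
df = pd.DataFrame(
    {
        "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0],
        "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0],
        "age": [31, 34, 32, 34, 33],
    },
    index=[5, 11, 12, 18, 22],
)

# Selecting columns in this order puts 'Weekhrs' at the bottom of each bar.
order = ["Weekhrs", "CodeRevHrs"]
by_age = df.groupby("age")[order].median()

# Age 34 has two rows, so its medians are the midpoints of the two values.
weekhrs_34 = by_age.loc[34, "Weekhrs"]        # median of 40.0 and 42.0 -> 41.0
coderev_34 = by_age.loc[34, "CodeRevHrs"]     # median of 2.0 and 15.0 -> 8.5
```

Calling by_age.plot.bar(stacked=True) on this table then draws one bar per age, as in the snippet above.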
Now, if you’d rather plot the overall median of these quantities for the entire filtered age range (as you say: a single number for each quantity), then one way (among many) would be:
label = f"{df['age'].min()}-{df['age'].max()}"
df.median().drop('age').to_frame(label).T.plot.bar(stacked=True)
plt.title(f'Median hours for age {label}')
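The chain above is easier to follow if the one-row frame is built first and inspected before plotting. A minimal sketch, again assuming only the five sample rows from the question; the name overall is introduced here for illustration.

```python
import pandas as pd

# Only the five rows shown in the question, with age already converted to int.
df = pd.DataFrame(
    {
        "CodeRevHrs": [3.0, 2.0, 2.0, 15.0, 2.0],
        "Weekhrs": [8.0, 40.0, 40.0, 42.0, 40.0],
        "age": [31, 34, 32, 34, 33],
    },
    index=[5, 11, 12, 18, 22],
)

# The label comes from the ages actually present, so it reads "31-34" here.
label = f"{df['age'].min()}-{df['age'].max()}"

# median() gives one value per column; drop the age median, turn the Series
# into a one-column frame named after the label, then transpose so the label
# becomes the single row (one bar) and the quantities become the columns.
overall = df.median().drop("age").to_frame(label).T
```

For these five rows, overall has a single row labelled 31-34 with CodeRevHrs 2.0 and Weekhrs 40.0. Calling overall.plot.bar(stacked=True) draws that row as one stacked bar.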