I’m currently learning about data visualization using seaborn, and I came across a problem that I couldn’t find a solution to.
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    %matplotlib inline
So I have this data
index | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 |
---|---|---|---|---|---|---|---|---|
1990 | 0 | 4 | 7 | 3 | 7 | 0 | 6 | 6 |
1991 | 1 | 7 | 5 | 0 | 8 | 1 | 8 | 4 |
1992 | 0 | 5 | 0 | 1 | 9 | 1 | 7 | 2 |
1993 | 2 | 7 | 0 | 0 | 6 | 1 | 2 | 7 |
1994 | 4 | 1 | 5 | 5 | 8 | 1 | 6 | 3 |
1995 | 7 | 0 | 6 | 4 | 8 | 0 | 5 | 7 |
1996 | 5 | 1 | 1 | 4 | 6 | 1 | 7 | 4 |
1997 | 0 | 4 | 7 | 5 | 5 | 1 | 8 | 5 |
1998 | 1 | 3 | 7 | 0 | 7 | 0 | 7 | 1 |
1999 | 5 | 7 | 1 | 1 | 6 | 0 | 8 | 5 |
2000 | 3 | 8 | 5 | 0 | 3 | 0 | 6 | 3 |
2001 | 6 | 0 | 4 | 1 | 7 | 1 | 2 | 7 |
I want to make bar plots or histograms where col1, col2, ..., col8 form one column and the values for a given year form the other. For example, for 1990:
col? | val |
---|---|
col1 | 0 |
col2 | 4 |
col3 | 7 |
col4 | 3 |
col5 | 7 |
col6 | 0 |
col7 | 6 |
col8 | 6 |
and plot them for each year from 1990 to 2001.
    g = sns.FacetGrid(df, col=df.index.value_counts())
    g.map(sns.histplot, df.columns)

This is the code I’ve written. I looked at FacetGrid but couldn’t get it working for my case. Any feedback is appreciated.
Answer
Imports and Test DataFrame
- Tested with `pandas 1.3.0`, `matplotlib 3.4.2`, and `seaborn 0.11.1`
    import pandas as pd
    import seaborn as sns

    # sample dataframe
    data = {1990: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 3, 'col5': 7, 'col6': 0, 'col7': 6, 'col8': 6},
            1991: {'col1': 1, 'col2': 7, 'col3': 5, 'col4': 0, 'col5': 8, 'col6': 1, 'col7': 8, 'col8': 4},
            1992: {'col1': 0, 'col2': 5, 'col3': 0, 'col4': 1, 'col5': 9, 'col6': 1, 'col7': 7, 'col8': 2},
            1993: {'col1': 2, 'col2': 7, 'col3': 0, 'col4': 0, 'col5': 6, 'col6': 1, 'col7': 2, 'col8': 7},
            1994: {'col1': 4, 'col2': 1, 'col3': 5, 'col4': 5, 'col5': 8, 'col6': 1, 'col7': 6, 'col8': 3},
            1995: {'col1': 7, 'col2': 0, 'col3': 6, 'col4': 4, 'col5': 8, 'col6': 0, 'col7': 5, 'col8': 7},
            1996: {'col1': 5, 'col2': 1, 'col3': 1, 'col4': 4, 'col5': 6, 'col6': 1, 'col7': 7, 'col8': 4},
            1997: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 5, 'col5': 5, 'col6': 1, 'col7': 8, 'col8': 5},
            1998: {'col1': 1, 'col2': 3, 'col3': 7, 'col4': 0, 'col5': 7, 'col6': 0, 'col7': 7, 'col8': 1},
            1999: {'col1': 5, 'col2': 7, 'col3': 1, 'col4': 1, 'col5': 6, 'col6': 0, 'col7': 8, 'col8': 5},
            2000: {'col1': 3, 'col2': 8, 'col3': 5, 'col4': 0, 'col5': 3, 'col6': 0, 'col7': 6, 'col8': 3},
            2001: {'col1': 6, 'col2': 0, 'col3': 4, 'col4': 1, 'col5': 7, 'col6': 1, 'col7': 2, 'col8': 7}}

    df = pd.DataFrame.from_dict(data, orient='index')

    # display(df.head())
          col1  col2  col3  col4  col5  col6  col7  col8
    1990     0     4     7     3     7     0     6     6
    1991     1     7     5     0     8     1     8     4
    1992     0     5     0     1     9     1     7     2
    1993     2     7     0     0     6     1     2     7
    1994     4     1     5     5     8     1     6     3
Plotting with seaborn.catplot
- Using `seaborn 0.11.1`, the correct way to create a `barplot` FacetGrid (per the documentation) is with `sns.catplot` and `kind='bar'`.
- It is required to convert the dataframe from a wide to a long form, which is easily done by resetting the index and then using `pandas.DataFrame.melt()`.
- A `catplot` is a figure-level interface for drawing categorical plots onto a FacetGrid.
- `g.set_xticklabels(rotation=90)` can be used to rotate the xticklabels.
  - See How to rotate xticklabels in a seaborn catplot or How to set rotation for seaborn FacetGrid and figure-level xtick labels
    # convert the wide dataframe to a long format with melt
    dfm = df.reset_index().melt(id_vars='index', var_name='variable', value_name='value')

    # display(dfm.head())
       index variable  value
    0   1990     col1      0
    1   1991     col1      1
    2   1992     col1      0
    3   1993     col1      2
    4   1994     col1      4

    # plot with catplot and kind='bar'
    g = sns.catplot(data=dfm, kind='bar', col='index', col_wrap=4, x='variable', y='value', height=3)

    # change the ticklabel rotation if needed
    g.set_xticklabels(rotation=90)

    # change ylim if needed
    g.set(ylim=(0, 30))
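To check what the wide-to-long reshape produces without drawing anything, here is a minimal sketch using only the first two years of the sample data. The names `df_small`, `long_small`, and `year_1990` are made up for this illustration; the filter at the end reproduces the two-column table for 1990 described in the question.

```python
import pandas as pd

# Small stand-in for the dataframe in the question (first two years only)
data = {
    1990: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 3, 'col5': 7, 'col6': 0, 'col7': 6, 'col8': 6},
    1991: {'col1': 1, 'col2': 7, 'col3': 5, 'col4': 0, 'col5': 8, 'col6': 1, 'col7': 8, 'col8': 4},
}
df_small = pd.DataFrame.from_dict(data, orient='index')

# Wide to long: the unnamed index becomes a column called 'index',
# and every (year, column) pair becomes one row
long_small = df_small.reset_index().melt(id_vars='index', var_name='variable', value_name='value')

# Keep only the rows for one year to get the col?/val table from the question
year_1990 = (
    long_small.loc[long_small['index'] == 1990, ['variable', 'value']]
    .reset_index(drop=True)
)
print(year_1990)
```

Each facet drawn by `sns.catplot(..., col='index')` corresponds to one such per-year slice of the long dataframe.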
Plotting with pandas.DataFrame.plot
- While you have asked about `seaborn`, given the dataframe in the OP with all the years in the index, the easiest way to plot the data is to transpose the dataframe with `.T`, and then use `pandas.DataFrame.plot`
    # display(df.T.head())
          1990  1991  1992  1993  1994  1995  1996  1997  1998  1999  2000  2001
    col1     0     1     0     2     4     7     5     0     1     5     3     6
    col2     4     7     5     7     1     0     1     4     3     7     8     0
    col3     7     5     0     0     5     6     1     7     7     1     5     4
    col4     3     0     1     0     5     4     4     5     0     1     0     1
    col5     7     8     9     6     8     8     6     5     7     6     3     7

    # transpose and plot
    axes = df.T.plot(kind='bar', subplots=True, layout=(3, 4), figsize=(15, 7), legend=False, rot=0)

    # to change ylim of the subplots, if needed
    for ax in axes.flatten():
        ax.set_ylim(0, 30)
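The transpose works because, with `subplots=True`, pandas draws one subplot per column, and after `.T` each column is a year. Below is a small self-contained sketch of that idea on two years only. The `Agg` backend is an assumption made so the snippet runs without a display (drop that line in a notebook), and `df_small` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import pandas as pd

# Two-year stand-in for the dataframe in the question
data = {
    1990: {'col1': 0, 'col2': 4, 'col3': 7, 'col4': 3, 'col5': 7, 'col6': 0, 'col7': 6, 'col8': 6},
    1991: {'col1': 1, 'col2': 7, 'col3': 5, 'col4': 0, 'col5': 8, 'col6': 1, 'col7': 8, 'col8': 4},
}
df_small = pd.DataFrame.from_dict(data, orient='index')

# After transposing, each column is a year, so subplots=True gives one bar chart per year
axes = df_small.T.plot(kind='bar', subplots=True, layout=(1, 2),
                       figsize=(8, 3), legend=False, rot=0)

# axes is a 2D array shaped like the layout; flatten it to loop over every subplot
for ax in axes.flatten():
    ax.set_ylim(0, 10)
```

The returned array has the same shape as `layout`, which is why `axes.flatten()` is used before looping. To view or keep the result, the figure can be shown with `plt.show()` or written out with `Figure.savefig`.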