There's surprisingly little info out there regarding Python and the pyalluvial package. I'm hoping to combine stacked bars and a corresponding alluvial plot in the same figure.
Using the code below, I have three unique groups, which are listed in Group. I want to display the proportion of each Group for each unique Point. I have the data formatted this way because I need a separate stacked bar for each Point.
The overall column (Ove) gives the overall proportion taken across all three Points: Group 1 makes up 70%, Group 2 makes up 20%, and Group 3 makes up 10%. But the proportion of each Group changes at the individual Points. I'm hoping to show this like a standard stacked bar chart, but with the alluvial flows drawn over the top.
    import pandas as pd
    import pyalluvial.alluvial as alluvial

    df = pd.DataFrame({
        'Group': [1, 2, 3],
        'Ove': [0.7, 0.2, 0.1],
        'Point 1': [0.8, 0.1, 0.1],
        'Point 2': [0.6, 0.2, 0.2],
        'Point 3': [0.7, 0.3, 0.0],
    })

    ax = alluvial.plot(
        df=df,
        xaxis_names=['Group', 'Point 1', 'Point 2', 'Point 3'],
        y_name='Ove',
        alluvium='Group',
    )
The output shows the overall group proportion (first bar) correctly, but the following stacked bars don't reflect the proportions for each Point.
If I transform the df and put the Points as a single column, then I don’t get 3 separate bars.
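For reference, the reshaping mentioned above (putting the Points into a single column) can be sketched with `DataFrame.melt`. The column names `Point` and `Proportion` are placeholders chosen for this illustration, not names that pyalluvial requires.

```python
import pandas as pd

df = pd.DataFrame({
    'Group': [1, 2, 3],
    'Ove': [0.7, 0.2, 0.1],
    'Point 1': [0.8, 0.1, 0.1],
    'Point 2': [0.6, 0.2, 0.2],
    'Point 3': [0.7, 0.3, 0.0],
})

# Reshape from wide (one column per Point) to long
# (one row per Group/Point pair).
long_df = df.melt(
    id_vars='Group',
    value_vars=['Point 1', 'Point 2', 'Point 3'],
    var_name='Point',          # placeholder column name
    value_name='Proportion',   # placeholder column name
)
print(long_df)
```

In this long form there is only one column holding the Point labels, which would explain why passing it as a single x-axis variable gives one bar rather than three.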
Answer
As correctly pointed out by @darthbaba, pyalluvial expects the dataframe to consist of frequencies matching different variable-type combinations. To give you an example of a valid input, each Point in each Group has been labelled as present (1) or absent (0):
    import pandas as pd
    import pyalluvial.alluvial as alluvial

    # One row per (Group, Point 1, Point 2, Point 3) combination,
    # with 'freq' holding how often that combination occurs.
    df = pd.DataFrame({
        'Group': [1] * 6 + [2] * 6 + [3] * 6,
        'Point 1': [1, 1, 1, 1, 0, 0] * 3,
        'Point 2': [0, 1, 0, 1, 1, 0] * 3,
        'Point 3': [0, 0, 1, 1, 1, 1] * 3,
        'freq': [23, 11, 5, 7, 10, 12, 17, 3, 6, 17, 19, 20, 28, 4, 13, 8, 14, 9],
    })

    fig = alluvial.plot(
        df=df,
        xaxis_names=['Point 1', 'Point 2', 'Point 3'],
        y_name='freq',
        alluvium='Group',
        ignore_continuity=False,
    )
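If you start from raw records rather than pre-counted combinations, a frequency table of this shape can be built with a pandas `groupby`. This is only a sketch: the `raw` dataframe below is made-up illustration data, not taken from the question.

```python
import pandas as pd

# Hypothetical raw observations: one row per record,
# with a 0/1 flag for each Point.
raw = pd.DataFrame({
    'Group':   [1, 1, 1, 2, 2, 3],
    'Point 1': [1, 1, 0, 1, 1, 0],
    'Point 2': [0, 0, 1, 1, 1, 0],
    'Point 3': [0, 0, 1, 0, 0, 1],
})

# Count how often each (Group, Point 1, Point 2, Point 3) combination occurs.
freq_df = (
    raw.groupby(['Group', 'Point 1', 'Point 2', 'Point 3'])
       .size()
       .reset_index(name='freq')
)
print(freq_df)
```

The resulting `freq_df` has one row per observed combination plus a `freq` column, which is the layout used in the example above.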
Clearly, the above code doesn't resolve the issue, since pyalluvial doesn't yet support stacked bars the way ggalluvial does (see example #5). Therefore, unless you want to use ggalluvial, your best option IMO is to add the required functionality yourself. I'd start by modifying line #85.
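If patching pyalluvial is more than you need, one possible workaround is to draw the figure with plain matplotlib instead. The sketch below is not pyalluvial's implementation and doesn't touch its source; `stacked_bars_with_flows` is a hypothetical helper written for this answer. It stacks the proportions from the question's dataframe and joins each Group's segment in neighbouring bars with a translucent straight-edged band (no curved ribbons).

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Group': [1, 2, 3],
    'Ove': [0.7, 0.2, 0.1],
    'Point 1': [0.8, 0.1, 0.1],
    'Point 2': [0.6, 0.2, 0.2],
    'Point 3': [0.7, 0.3, 0.0],
})


def stacked_bars_with_flows(df, columns, group_col='Group', bar_width=0.3):
    """Hypothetical helper: stacked bars joined by straight alluvial-style bands."""
    fig, ax = plt.subplots()
    values = df[columns].to_numpy(dtype=float)  # shape: (n_groups, n_columns)
    tops = values.cumsum(axis=0)                # upper edge of each segment
    bottoms = tops - values                     # lower edge of each segment
    x = np.arange(len(columns))

    for i, label in enumerate(df[group_col]):
        color = f'C{i}'  # default matplotlib colour cycle
        ax.bar(x, values[i], bar_width, bottom=bottoms[i],
               color=color, label=f'{group_col} {label}')
        # Join this group's segment in bar j to its segment in bar j + 1.
        for j in range(len(columns) - 1):
            xs = [x[j] + bar_width / 2, x[j + 1] - bar_width / 2]
            ax.fill_between(xs,
                            [bottoms[i, j], bottoms[i, j + 1]],
                            [tops[i, j], tops[i, j + 1]],
                            color=color, alpha=0.3, linewidth=0)

    ax.set_xticks(x)
    ax.set_xticklabels(columns)
    ax.set_ylabel('Proportion')
    ax.legend()
    return fig, ax


fig, ax = stacked_bars_with_flows(df, ['Ove', 'Point 1', 'Point 2', 'Point 3'])
```

Call `fig.savefig(...)` (or drop the `Agg` line and use `plt.show()`) to view the result. Because the input is the wide proportions table from the question, this avoids the frequency format entirely, at the cost of losing pyalluvial's handling of flows that split between categories.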