How can I create a percentage matrix based on a dataframe?

Question

I have a dataframe that looks like this:

Place A   Place B  Type  Number
New York  Paris    A     34
Oslo      London   B     42
Oslo      London   A     24

I need the percentage of each type for each route. I don't know which command to use to get a dataframe like the expected output table shown after the answer below, with the cities as both rows and columns.

Accepted Answer

IIUC, you could reshape the data, then compute the percent per group, and finally reshape again:

    # all cities appearing in either column (a set, so the order is arbitrary)
    cities = set([*df['Place A'], *df['Place B']])
    # total per route, broadcast back to every row of that route
    df['total'] = df.groupby(['Place A', 'Place B'])['Number'].transform('sum')
    df['percent'] = df['Number']/df['total']
    df['value'] = df['Type']+': '+df['percent'].round(2).astype(str)
    df2 = (df.groupby(['Place A', 'Place B'])['value']
             .agg(', '.join).unstack(1)
             .rename_axis(index=None, columns=None)
             .reindex(index=cities, columns=cities)
          )
    # make output symmetrical
    df2.update(df2.T)

output:

                          Oslo New York   Paris            London
    Oslo                   NaN      NaN     NaN  B: 0.64, A: 0.36
    New York               NaN      NaN  A: 1.0               NaN
    Paris                  NaN   A: 1.0     NaN               NaN
    London    B: 0.64, A: 0.36      NaN     NaN               NaN

If you don't need the symmetrical output, you can keep it by Place A/Place B, which will have the same information in condensed form:

    df2 = df.groupby(['Place A', 'Place B'])['value'].agg(', '.join).unstack(1)

output:

    Place B             London   Paris
    Place A
    New York               NaN  A: 1.0
    Oslo      B: 0.64, A: 0.36     NaN

Expected output:

xxx       Paris          Oslo                           London                         New York
Paris     NaN            NaN                            NaN                            type A : 100%
Oslo      NaN            NaN                            type A : 36,3% type B : 63,6%  NaN
London    NaN            type A : 36,3% type B : 63,6%  NaN                            NaN
New York  type A : 100%  NaN                            NaN                            NaN
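
For a result closer to the expected table above, here is a minimal, self-contained sketch that follows the same idea as the accepted answer (per-route totals via groupby/transform) but writes each cell as a "type X : NN.N%" string. The function name percentage_matrix, the sorted city order and the one-decimal rounding are assumptions made for illustration, not part of the original answer. Note that rounding gives 36.4% for Oslo/London, whereas the 36,3% in the question looks truncated.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Place A": ["New York", "Oslo", "Oslo"],
    "Place B": ["Paris", "London", "London"],
    "Type": ["A", "B", "A"],
    "Number": [34, 42, 24],
})


def percentage_matrix(data):
    """Hypothetical helper: symmetric city-by-city table of percentage strings."""
    # Sort by type so labels inside a cell appear as A, B, ...
    data = data.sort_values("Type")
    # Total per (Place A, Place B) route, aligned with every row of that route
    totals = data.groupby(["Place A", "Place B"])["Number"].transform("sum")
    share = 100 * data["Number"] / totals
    # Assumption: one decimal place, e.g. "type A : 36.4%"
    labels = "type " + data["Type"] + " : " + share.map("{:.1f}%".format)
    # Join the labels of all types that share a route
    per_route = labels.groupby([data["Place A"], data["Place B"]]).agg(" ".join)
    # Assumption: alphabetical city order, to keep the output deterministic
    cities = sorted(set(data["Place A"]) | set(data["Place B"]))
    # Object dtype so cells can hold strings; unfilled cells stay NaN
    matrix = pd.DataFrame(index=cities, columns=cities, dtype=object)
    for (place_a, place_b), text in per_route.items():
        matrix.loc[place_a, place_b] = text
        matrix.loc[place_b, place_a] = text  # mirror to make it symmetric
    return matrix


result = percentage_matrix(df)
print(result)
```

Filling the matrix in an explicit loop keeps the symmetry step easy to follow; for a large number of routes, the unstack/reindex approach from the answer is likely the better fit.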
