I have a DataFrame with 31 columns, which contains 3 categories “Classic”, “Premium” and “Luxe” I want to swap the way the DataFrame works to have only 3 comumns “Classic”, “Premium” and “Luxe” and 31 categories which can be listed inside.
Since I’m new I can only post a link to the picture for more clarity : Here is what I have and what I want to do
Here’s what I tried so far:
Suppose we use this DataFrame
import pandas as pd import numpy as np d = {'Name': ["Alban","Benan","Conor","Dino","Egor","Fatima" ],'Virtual visits': ["Classique", "Classique","","" ,"Premium","Premium"], 'Cars' :["","" ,"Luxe","Luxe","Luxe", "Premium"],'Portraits' :["","" ,"Classique","Classique","Luxe", "Premium"],'Animals' :["Premium","Luxe" ,"","","Luxe", ""]} df = pd.DataFrame(data=d) df Name Virtual visits Cars Portraits Animals 0 Alban Classique Premium 1 Benan Classique Luxe 2 Conor Luxe Classique 3 Dino Luxe Classique 4 Egor Premium Luxe Luxe Luxe 5 Fatima Premium Premium Premium
Using melt looked promising so I did that and a groupBy:
df = df.melt(id_vars=["Name"]) df.groupby(by=["Name"]).sum() Name variable value Alban Virtual visitsCarsPortraitsAnimals ClassiquePremium Benan Virtual visitsCarsPortraitsAnimals ClassiqueLuxe Conor Virtual visitsCarsPortraitsAnimals LuxeClassique Dino Virtual visitsCarsPortraitsAnimals LuxeClassique Egor Virtual visitsCarsPortraitsAnimals PremiumLuxeLuxeLuxe Fatima Virtual visitsCarsPortraitsAnimals PremiumPremiumPremium
So the column “Variable” is made up of all the column names, which is not bad but instead of a single string added to one another, I would like to have a list, and instead of columns Variable and Value, I would need to have 3 columns, “Premium”, “Classque”, “Luxe” and in each of them have the relevant list of categories appear.
Thanks for your help.
Advertisement
Answer
You were right with melt. After that you need a pivot table:
out = df.replace('',np.nan).melt("Name") .pivot_table(index="Name",columns="value",values="variable",aggfunc=','.join) .rename_axis(None,axis=1).reset_index()
print(out) Name Classique Luxe 0 Alban Virtual visits NaN 1 Benan Virtual visits Animals 2 Conor Portraits Cars 3 Dino Portraits Cars 4 Egor NaN Cars,Portraits,Animals 5 Fatima NaN NaN Premium 0 Animals 1 NaN 2 NaN 3 NaN 4 Virtual visits 5 Virtual visits,Cars,Portraits