I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'Name': ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7', 'name8', 'name9', 'name10', 'name11', 'name12'], 'Category': ['A', 'A/B', 'B/C', 'A/B/C', 'A/B/C', 'B/C', 'A/B', 'A/B/C', 'A/B/C', 'B', 'C', 'A/C']})
I need one dataframe per category, each containing every row whose Category includes that category. For instance, the expected output for category A is:
df_a = pd.DataFrame({'ID': [1, 2, 4, 5, 7, 8, 9, 12], 'Name': ['name1', 'name2', 'name4', 'name5', 'name7', 'name8', 'name9', 'name12'], 'Category': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']})
Answer
Split the categories on '/', explode the dataframe so that each row holds a single category, then group by Category:
df_dicts = {
    k: v
    for k, v in (
        df.assign(Category=df['Category'].str.split('/'))
          .explode('Category')
          .groupby('Category')
    )
}
For example, df_dicts['A'] gives:
    ID    Name Category
0    1   name1        A
1    2   name2        A
3    4   name4        A
4    5   name5        A
6    7   name7        A
7    8   name8        A
8    9   name9        A
11  12  name12        A
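Note that each group keeps the row labels of the original dataframe (0, 1, 3, 4, ...), whereas the df_a in the question has a fresh 0-based index. A minimal, self-contained sketch of the same split/explode/groupby approach that also resets the index might look like the following. The names `exploded` and `frames` are illustrative, and the `reset_index(drop=True)` step is an addition to the answer above, assumed only to match the expected output in the question.

```python
import pandas as pd

# Same data as in the question, built more compactly.
df = pd.DataFrame({
    'ID': list(range(1, 13)),
    'Name': [f'name{i}' for i in range(1, 13)],
    'Category': ['A', 'A/B', 'B/C', 'A/B/C', 'A/B/C', 'B/C',
                 'A/B', 'A/B/C', 'A/B/C', 'B', 'C', 'A/C'],
})

# Turn 'A/B/C' into ['A', 'B', 'C'], then emit one row per list element.
exploded = df.assign(Category=df['Category'].str.split('/')).explode('Category')

# One dataframe per category; reset_index(drop=True) is an assumption here,
# added so each frame gets a fresh 0-based index like df_a in the question.
frames = {
    category: group.reset_index(drop=True)
    for category, group in exploded.groupby('Category')
}

df_a = frames['A']
print(df_a)
```

Dropping the `reset_index` call keeps the original row labels, which can be useful if the per-category frames need to be traced back to rows of `df`.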