
Get several dataframes from an original one

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 
                   'Name': ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 
                            'name7', 'name8', 'name9', 'name10', 'name11', 'name12'],
                   'Category': ['A', 'A/B', 'B/C', 'A/B/C', 'A/B/C', 'B/C',
                                'A/B', 'A/B/C', 'A/B/C', 'B', 'C', 'A/C']})

I need to get a separate dataframe for each category, containing every row whose Category includes it. For instance, the expected output for category A:

df_a = pd.DataFrame({'ID': [1, 2, 4, 5, 7, 8, 9, 12], 
                   'Name': ['name1', 'name2', 'name4', 'name5',  
                            'name7', 'name8', 'name9', 'name12'],
                   'Category': ['A', 'A', 'A', 'A', 
                                'A', 'A', 'A', 'A']})


Answer

Split the Category strings on '/', explode the resulting lists so that each row carries a single category, then group by Category:

df_dicts = {
    k: v
    for k, v in (df.assign(Category=df['Category'].str.split('/'))
                   .explode('Category')
                   .groupby('Category'))
}

This gives, for example, df_dicts['A']:

    ID    Name Category
0    1   name1        A
1    2   name2        A
3    4   name4        A
4    5   name5        A
6    7   name7        A
7    8   name8        A
8    9   name9        A
11  12  name12        A
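
Note that each group keeps the row labels of the original frame (0, 1, 3, 4, ...), whereas the df_a in the question has a fresh 0..7 index. If that matters, one option is to reset the index of each group. The following is a minimal, self-contained sketch of that variant; the intermediate name exploded is only illustrative, and it assumes a pandas version that provides DataFrame.explode (0.25 or later):

```python
import pandas as pd

df = pd.DataFrame({
    'ID': list(range(1, 13)),
    'Name': [f'name{i}' for i in range(1, 13)],
    'Category': ['A', 'A/B', 'B/C', 'A/B/C', 'A/B/C', 'B/C',
                 'A/B', 'A/B/C', 'A/B/C', 'B', 'C', 'A/C'],
})

# Turn 'A/B/C' into ['A', 'B', 'C'], then emit one row per list element.
exploded = (df.assign(Category=df['Category'].str.split('/'))
              .explode('Category'))

# Build one frame per category; reset_index(drop=True) renumbers
# each group's rows from 0 instead of keeping the original labels.
df_dicts = {k: g.reset_index(drop=True)
            for k, g in exploded.groupby('Category')}

df_a = df_dicts['A']
print(df_a)
```

Keeping the frames in a dictionary keyed by category avoids creating a separate variable per category, so the same code works if more categories appear later.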
User contributions licensed under: CC BY-SA