
Split data frame into multiple data frames based on a group of parameters in a column

I’ve got a data frame like this:

DF

ID      A       B       C
00      X0      Y0      PARAMETER_0
01      X1      Y1      PARAMETER_1
02      X2      Y2      PARAMETER_2
03      X3      Y3      PARAMETER_3
04      X4      Y4      PARAMETER_4
05      X5      Y5      PARAMETER_0
06      X6      Y6      PARAMETER_1
07      X7      Y7      PARAMETER_2
08      X8      Y8      PARAMETER_3
09      X9      Y9      PARAMETER_4
10      XX0     YY0     PARAMETER_0
11      XX1     YY1     PARAMETER_1
12      XX2     YY2     PARAMETER_2
13      XX3     YY3     PARAMETER_3
14      XX4     YY4     PARAMETER_4

I need to split it into multiple data frames after each PARAMETER_4 row in column C, to get:

DF_1

ID      A       B       C
00      X0      Y0      PARAMETER_0
01      X1      Y1      PARAMETER_1
02      X2      Y2      PARAMETER_2
03      X3      Y3      PARAMETER_3
04      X4      Y4      PARAMETER_4

DF_2

05      X5      Y5      PARAMETER_0
06      X6      Y6      PARAMETER_1
07      X7      Y7      PARAMETER_2
08      X8      Y8      PARAMETER_3
09      X9      Y9      PARAMETER_4

DF_3

10      XX0     YY0     PARAMETER_0
11      XX1     YY1     PARAMETER_1
12      XX2     YY2     PARAMETER_2
13      XX3     YY3     PARAMETER_3
14      XX4     YY4     PARAMETER_4

I cannot find a simple function for this, something like df.split(axis=0, value='PARAMETER_4').

Any idea about an approach? Thank you in advance!


Answer

We can use groupby twice here. First, group by column C and take cumcount, which numbers the occurrences of each parameter (0 for the first, 1 for the second, and so on). Then group by that count to get the separate data frames:

# Rows that share the same occurrence number of their C value end up in the same frame
dfs = [d for _, d in df.groupby(df.groupby('C').cumcount())]

print(dfs[0], '\n')
print(dfs[1], '\n')
print(dfs[2])

Output

   ID   A   B            C
0   0  X0  Y0  PARAMETER_0
1   1  X1  Y1  PARAMETER_1
2   2  X2  Y2  PARAMETER_2
3   3  X3  Y3  PARAMETER_3
4   4  X4  Y4  PARAMETER_4 

   ID   A   B            C
5   5  X5  Y5  PARAMETER_0
6   6  X6  Y6  PARAMETER_1
7   7  X7  Y7  PARAMETER_2
8   8  X8  Y8  PARAMETER_3
9   9  X9  Y9  PARAMETER_4 

    ID    A    B            C
10  10  XX0  YY0  PARAMETER_0
11  11  XX1  YY1  PARAMETER_1
12  12  XX2  YY2  PARAMETER_2
13  13  XX3  YY3  PARAMETER_3
14  14  XX4  YY4  PARAMETER_4
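
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the frame from the question (with integer IDs, as in the output above) and compares the cumcount approach with a different technique that is not part of the answer: marking each PARAMETER_4 row, shifting that marker down one row, and taking a cumulative sum to get a block number. The names occurrence, block, dfs_cumcount and dfs_marker are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the question (IDs as integers).
df = pd.DataFrame({
    "ID": range(15),
    "A": [f"X{i}" for i in range(10)] + [f"XX{i}" for i in range(5)],
    "B": [f"Y{i}" for i in range(10)] + [f"YY{i}" for i in range(5)],
    "C": [f"PARAMETER_{i % 5}" for i in range(15)],
})

# Approach from the answer: the n-th occurrence of each value in C gets key n.
occurrence = df.groupby("C").cumcount()
dfs_cumcount = [d for _, d in df.groupby(occurrence)]

# Alternative sketch: start a new block on the row *after* each PARAMETER_4.
# shift() moves the marker down one row so the PARAMETER_4 row stays in its own block.
block = df["C"].eq("PARAMETER_4").shift(fill_value=False).cumsum()
dfs_marker = [d for _, d in df.groupby(block)]

print(occurrence.tolist())
print(block.tolist())
print(len(dfs_cumcount), len(dfs_marker))
```

On this data both lists hold the same three frames. The cumcount version relies on every parameter appearing exactly once per block, while the marker version depends only on where PARAMETER_4 occurs, so it may be the safer choice if some blocks are missing parameters.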
User contributions licensed under: CC BY-SA