I have a dataframe where column B marks the sender: 1 is the client and 2 is the admin. I need to merge consecutive client messages (rows with 1) into one, and likewise merge consecutive admin responses (rows with 2), across the whole dataframe.
import pandas as pd

df1 = pd.DataFrame({'A' : ['a', 'b', 'c', 'd', 'e', 'f', 'h', 'j', 'de', 'be'],
                    'B' : [1, 1, 2, 1, 1, 1, 2, 2, 1, 2]})

df1

    A  B
0   a  1
1   b  1
2   c  2
3   d  1
4   e  1
5   f  1
6   h  2
7   j  2
8  de  1
9  be  2

In the end I need to get this dataframe:

df2 = pd.DataFrame({'A' : ['a, b', 'd, e, f', 'de'],
                    'B' : ['c', 'h, j', 'be' ]})

Out:

       A    B
0    a,b    c
1  d,e,f  h,j
2     de   be

I do not know how to do this.
Answer
Create groups from consecutive values in B: the trick is to compare each value with the shifted value and take the cumulative sum of the changes, then aggregate with first and join. Also create a helper column C, which is used for pivoting in the next step with DataFrame.pivot.
The solution works if the 1, 2 pairs occur in sequential order (duplicates within a run are fine):
# Label each run of consecutive equal values in B, then aggregate per run
df = (df1.groupby(df1['B'].ne(df1['B'].shift()).cumsum())
         .agg(B = ('B','first'), A= ('A', ','.join))
         .assign(C = lambda x: x['B'].eq(1).cumsum()))

print (df)
   B      A  C
B
1  1    a,b  1
2  2      c  1
3  1  d,e,f  2
4  2    h,j  2
5  1     de  3
6  2     be  3

# pivot arguments are keyword-only in pandas 2.0+
df = (df.pivot(index='C', columns='B', values='A')
        .rename(columns={1:'A',2:'B'})
        .reset_index(drop=True).rename_axis(None, axis=1))
print (df)
       A    B
0    a,b    c
1  d,e,f  h,j
2     de   be
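
To see what the shift and cumulative sum trick does on its own, here is a small sketch that builds the grouping labels step by step on the same input. The names starts, run_id, first_b and pair_id are only illustrative and are not part of the solution above.

```python
import pandas as pd

# Same input as in the question.
df1 = pd.DataFrame({'A': ['a', 'b', 'c', 'd', 'e', 'f', 'h', 'j', 'de', 'be'],
                    'B': [1, 1, 2, 1, 1, 1, 2, 2, 1, 2]})

# True wherever B differs from the previous row, i.e. at the start of a run.
# The first row is compared against NaN, so it is always True.
starts = df1['B'].ne(df1['B'].shift())

# The cumulative sum of the booleans gives one label per consecutive run.
run_id = starts.cumsum()
print(run_id.tolist())
# [1, 1, 2, 3, 3, 3, 4, 4, 5, 6]

# Helper used for pivoting: every run that starts with 1 (client) opens a
# new client/admin pair, so counting those runs numbers the pairs.
first_b = df1['B'].groupby(run_id).first()
pair_id = first_b.eq(1).cumsum()
print(pair_id.tolist())
# [1, 1, 2, 2, 3, 3]
```

Each pair number appears exactly twice here, once for the client run and once for the admin run, which is what lets the pivot place them side by side in one row.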