I have a dataframe:
    c1  c2  c3  l
    1   2   3   [1,2,3,4,5,6,7]
    3   4   8   [8,9,0]
I want to explode it so that every 3 elements from each list in column l become a new row, with an extra column holding the triplet's index within the original list. So I will get:
    c1  c2  c3  l        idx
    1   2   3   [1,2,3]  0
    1   2   3   [4,5,6]  1
    3   4   8   [8,9,0]  0
What is the best way to do so?
Answer
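Break each list into chunks of three first, and then explode (elements left over after the last complete triplet, such as the 7 in the first row, are dropped):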
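    import pandas as pd

    # the dataframe from the question
    df = pd.DataFrame({'c1': [1, 3], 'c2': [2, 4], 'c3': [3, 8],
                       'l': [[1, 2, 3, 4, 5, 6, 7], [8, 9, 0]]})

    # split each list into complete triplets
    df.l = df.l.apply(lambda lst: [lst[3*i:3*(i+1)] for i in range(len(lst) // 3)])

    df
    #  c1 c2 c3                       l
    #0  1  2  3  [[1, 2, 3], [4, 5, 6]]
    #1  3  4  8             [[8, 9, 0]]

    df.explode('l')
    #  c1 c2 c3          l
    #0  1  2  3  [1, 2, 3]
    #0  1  2  3  [4, 5, 6]
    #1  3  4  8  [8, 9, 0]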
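The `len(lst) // 3` bound above silently discards a trailing partial chunk. If those leftover elements should be kept, one possible sketch is below. The helper name `chunk` and its `size` and `keep_remainder` parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd


def chunk(lst, size=3, keep_remainder=True):
    """Split lst into consecutive chunks of `size` (hypothetical helper)."""
    chunks = [lst[i:i + size] for i in range(0, len(lst), size)]
    # Optionally drop a final chunk that is shorter than `size`.
    if not keep_remainder and chunks and len(chunks[-1]) < size:
        chunks.pop()
    return chunks


# Same data as in the question.
df = pd.DataFrame({
    "c1": [1, 3], "c2": [2, 4], "c3": [3, 8],
    "l": [[1, 2, 3, 4, 5, 6, 7], [8, 9, 0]],
})

# assign() returns a new frame, so the original df is left untouched.
kept = df.assign(l=df["l"].apply(chunk)).explode("l")
dropped = df.assign(
    l=df["l"].apply(lambda lst: chunk(lst, keep_remainder=False))
).explode("l")
```

Here `kept` has an extra row holding `[7]`, while `dropped` matches the output shown above.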
If you need the index column:
    # starting from the original df (lists not yet chunked):
    # store index as second element of the tuple
    df.l = df.l.apply(lambda lst: [(lst[3*i:3*(i+1)], i) for i in range(len(lst) // 3)])

    df
    #  c1 c2 c3                                 l
    #0  1  2  3  [([1, 2, 3], 0), ([4, 5, 6], 1)]
    #1  3  4  8                  [([8, 9, 0], 0)]

    df = df.explode('l')
    df
    #  c1 c2 c3               l
    #0  1  2  3  ([1, 2, 3], 0)
    #0  1  2  3  ([4, 5, 6], 1)
    #1  3  4  8  ([8, 9, 0], 0)

    # extract list and index from the tuple column
    df['l'], df['idx'] = df.l.str[0], df.l.str[1]
    df
    #  c1 c2 c3          l idx
    #0  1  2  3  [1, 2, 3]   0
    #0  1  2  3  [4, 5, 6]   1
    #1  3  4  8  [8, 9, 0]   0
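As an alternative to carrying the index inside a tuple, the position can be derived after exploding, because `explode` repeats the original row label for every chunk. The sketch below numbers the rows within each original label using `groupby(level=0).cumcount()`. It is a different technique from the tuple approach above, and the variable names `size`, `chunked`, and `out` are chosen here only for illustration.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "c1": [1, 3], "c2": [2, 4], "c3": [3, 8],
    "l": [[1, 2, 3, 4, 5, 6, 7], [8, 9, 0]],
})

size = 3  # chunk length (assumed parameter)

# Split each list into complete chunks, as in the answer above.
chunked = df["l"].apply(
    lambda lst: [lst[size * i:size * (i + 1)] for i in range(len(lst) // size)]
)

# explode() keeps the original row label on every resulting row.
out = df.assign(l=chunked).explode("l")

# Number the rows within each original label: 0, 1, ... per source row.
# to_numpy() assigns by position, sidestepping alignment on the duplicated index.
out["idx"] = out.groupby(level=0).cumcount().to_numpy()

# Optional: replace the duplicated labels with a fresh 0..n-1 index.
out = out.reset_index(drop=True)
```

One caveat: a row whose list has fewer than `size` elements yields an empty chunk list, which `explode` turns into a single row with a missing value in `l`.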