How are you?
I have a database where some rows contain more than one product, separated by commas, as in the example below (there are other columns, but to keep it simple I show only these three).
id | produdct | value |
---|---|---|
47 | product1, product 2 | 12000.0 |
48 | product3 | 48000.0 |
49 | product4, product1, product2 | 28800.0 |
50 | product1 | 2000.0 |
51 | product5, product2 | 32000.0 |
53 | product3 | 128000.0 |
54 | product2 | 15000.0 |
55 | product4, product2, product5 | 96000.0 |
I need to separate the products, creating a copy of the row for each one. I tried functions such as explode and json_normalize, and I tried building a list of lists, but nothing worked. Can you help me?
Answer
Just use str.split and explode:
df['produdct'] = df['produdct'].str.split(', ')
new_df = df.explode('produdct')

   id   produdct     value
0  47   product1   12000.0
0  47  product 2   12000.0
1  48   product3   48000.0
2  49   product4   28800.0
2  49   product1   28800.0
2  49   product2   28800.0
3  50   product1    2000.0
4  51   product5   32000.0
4  51   product2   32000.0
5  53   product3  128000.0
6  54   product2   15000.0
7  55   product4   96000.0
7  55   product2   96000.0
7  55   product5   96000.0
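For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same split-then-explode approach. It assumes pandas is installed and rebuilds only the first three rows of the question's table by hand. The likely reason explode alone did not work is that it only expands list-like cells, while the column holds plain strings. Splitting on a bare comma and stripping whitespace afterwards, as well as the final reset_index, are my own additions, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question; the column name keeps the
# question's spelling ("produdct") so it matches the answer's code.
df = pd.DataFrame({
    "id": [47, 48, 49],
    "produdct": ["product1, product 2", "product3", "product4, product1, product2"],
    "value": [12000.0, 48000.0, 28800.0],
})

# explode only expands list-like cells, so turn each string into a list first.
df["produdct"] = df["produdct"].str.split(",")

# One row per product; the other columns are repeated for each product.
new_df = df.explode("produdct")

# Assumption: stray spaces around the commas are not wanted, so strip them.
new_df["produdct"] = new_df["produdct"].str.strip()

# Optional: replace the repeated index labels (0, 0, 1, 2, 2, 2) with 0..n-1.
new_df = new_df.reset_index(drop=True)

print(new_df)
```

Stripping after the split makes the result tolerant of inconsistent spacing such as "a,b" or "a ,  b", which a fixed ', ' separator would not handle.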