How are you?
I have a database where some rows contain more than one product, separated by commas, as in the example below (there are other columns, but to keep it simple I only included these three).
id | produdct | value |
---|---|---|
47 | product1, product 2 | 12000.0 |
48 | product3 | 48000.0 |
49 | product4, product1, product2 | 28800.0 |
50 | product1 | 2000.0 |
51 | product5, product2 | 32000.0 |
53 | product3 | 128000.0 |
54 | product2 | 15000.0 |
55 | product4, product2, product5 | 96000.0 |
I need to separate the products so that each row is copied once per product. I tried functions like explode and json_normalize, and I also tried creating a list of lists, but nothing worked. Can you help me?
Answer
Just use str.split and explode:
    df['produdct'] = df['produdct'].str.split(', ')
    new_df = df.explode('produdct')

       id   produdct     value
    0  47   product1   12000.0
    0  47  product 2   12000.0
    1  48   product3   48000.0
    2  49   product4   28800.0
    2  49   product1   28800.0
    2  49   product2   28800.0
    3  50   product1    2000.0
    4  51   product5   32000.0
    4  51   product2   32000.0
    5  53   product3  128000.0
    6  54   product2   15000.0
    7  55   product4   96000.0
    7  55   product2   96000.0
    7  55   product5   96000.0
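Note that explode only expands list-like values, so calling it on the raw comma-separated strings leaves the rows unchanged; the split has to come first. Below is a minimal, self-contained sketch of the same approach on the first three rows of the question's data. Splitting on a bare comma and then stripping whitespace, and resetting the index at the end, are my own additions (assumptions, not part of the answer above) to tolerate inconsistent spacing after commas and to avoid the repeated index labels seen in the output.

```python
import pandas as pd

# Sample rows copied from the question (the column name keeps its original spelling).
df = pd.DataFrame({
    "id": [47, 48, 49],
    "produdct": ["product1, product 2", "product3", "product4, product1, product2"],
    "value": [12000.0, 48000.0, 28800.0],
})

new_df = (
    # Turn each comma-separated string into a list of products.
    df.assign(produdct=df["produdct"].str.split(","))
      # Emit one row per list element; other columns are repeated.
      .explode("produdct")
      # Assumption: remove stray spaces left around each product name.
      .assign(produdct=lambda d: d["produdct"].str.strip())
      # Assumption: replace the repeated index labels with a fresh 0..n-1 index.
      .reset_index(drop=True)
)

print(new_df)
```

If the original index labels are useful for tracing each row back to its source, drop the reset_index call.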