if u see the screenshot in that f4,f5, and f9 columns values are separated by commas, i want to split that values into different rows, and f9 is a total number of products, so I need to split the revenue as well based on quantity, for example total number of products according to f9 is 5, so total revenue is 211.32, so each row it should be (211.32/5)*2(quantity value)
I tried this:
df3['Product_Name']=df3.Product_Name.str.split(",").explode('Product_Name') df3['Product_ID']=df3.Product_ID.str.split(",").explode('Product_ID') df3['Number_of_Products']=df3.Number_of_Products.str.split(",").explode('Number_of_Products')
but didn’t work..
Advertisement
Answer
You have to develop a specific function to do this (with or without pandas).
Let’s consider you have a classic Python list:
row = [ "10/4/2017 3:31", # f1 112, # f2 "220.239.97", # f3 "x,y,z", # f4 "x1,y1,z1", # f5 "0", # f6 "female", # f7 "cc@braint", # f8 "2,2,1", # f9 "East Lindf", # f10 "NSW", # f11 211.32, # Revenue ]
you can create a small function which reads the fields and “split” the columns. Here is an example:
def split_columns(r): f4 = r[3].split(",") f5 = r[4].split(",") f9 = list(map(int, r[8].split(","))) revenue = r[11] amount = revenue / sum(f9) for v4, v5, v9 in zip(f4, f5, f9): r[3] = v4 r[4] = v5 r[8] = v9 r[11] = amount * v9 yield r
This generator can be used like this:
for line in split_columns(row): print(line)
You get:
['10/4/2017 3:31', 112, '220.239.97', 'x', 'x1', '0', 'female', 'cc@braint', 2, 'East Lindf', 'NSW', 84.52799999999999] ['10/4/2017 3:31', 112, '220.239.97', 'y', 'y1', '0', 'female', 'cc@braint', 2, 'East Lindf', 'NSW', 84.52799999999999] ['10/4/2017 3:31', 112, '220.239.97', 'z', 'z1', '0', 'female', 'cc@braint', 1, 'East Lindf', 'NSW', 42.263999999999996]