I have this dataframe:
A B 0 [0, 1, 2] 1 1 foo 1 2 [3, 4] 1
I would like to use explode function for column “A” and then to keep right and fair proportion for each exploded row in case with column “B” . So the result should look like this:
A B 0 0 0.33 0 1 0.33 0 2 0.33 1 foo 1 2 3 0.5 2 4 0.5
Would this be possible with the explode function? I would manage to come to this result with for row in data.itertuples():
but the for loop is so slow in case with large dataframe. So do you have idea how to solve this with explode or with some other fast way?
I would be very grateful with any help.
Advertisement
Answer
You can explode
“A”; then groupby
the index and transform
count
method (to count the number of each index) and divide the elements in 'B'
by their corresponding index count.
out = df.explode('A') out['B'] /= out['B'].groupby(level=0).transform('count')
Output:
A B 0 0 0.333333 0 1 0.333333 0 2 0.333333 1 foo 1.000000 2 3 0.500000 2 4 0.500000