
All combinations of a row in a dataframe

I have the following DataFrame (df = ) with around 40 million rows.

JavaScript

I am trying to get the following output:

JavaScript

At first I thought of using itertools combinations, it.combinations(Colors["Colors"], 2), but the problem was that it gives me the combinations of the whole column and does not take the column "No" into account. My next attempt was to aggregate the whole DataFrame so that each row holds all the needed combinations in a list, leaving only about 5000 rows,

from:

JavaScript

to:

JavaScript

with: df.apply(lambda x: list(it.combinations(x, 2)), axis=1), but this doesn't work either (it gives all combinations within each row).

What is the right way to achieve the desired output (of attempt 1 or attempt 2)?

Edit 1:

If I use df.apply(lambda x: list(it.combinations(x, 2)), axis=1), I get the following column:

JavaScript

I think one problem is that I aggregate the colors into a tuple or list (the tuple is empty, []): df.groupby("No")["Color"].apply(list).agg(tuple).to_frame()

Nevertheless, itertools gives me the combinations of every column.

Edit 2: The solutions by alparslan mimaroğlu and Henry Vik both work and are (for me) astonishing. I don't understand the logic behind them yet, but I'll try! Thanks!


Answer

You can group by No and build the lists you want quite easily.

JavaScript

If you don't explode the result, you get one list of color combinations per No.
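For reference, here is a minimal sketch of how the group-then-combine idea could look. It is not necessarily the exact code from this answer: the sample data and the output column name "Pair" are made up for illustration, and the column names "No" and "Color" are taken from the groupby call in the question.

```python
import itertools as it

import pandas as pd

# Hypothetical sample data; the column names "No" and "Color" follow the question.
df = pd.DataFrame(
    {
        "No": [1, 1, 1, 2, 2],
        "Color": ["red", "green", "blue", "red", "yellow"],
    }
)

# Attempt 2 format: one row per "No", holding a list of all 2-color combinations
# built only from the colors that belong to that "No".
pairs_per_no = df.groupby("No")["Color"].apply(
    lambda colors: list(it.combinations(colors, 2))
)

# Attempt 1 format: explode the lists so that every combination gets its own row.
# "Pair" is an illustrative column name, not one from the question.
pairs_long = pairs_per_no.explode().reset_index(name="Pair")

print(pairs_per_no)
print(pairs_long)
```

The difference from the row-wise df.apply(..., axis=1) attempt is that the lambda here receives all colors of one "No" group at once, so it.combinations only pairs colors within that group. A group with a single color yields an empty list, which explode turns into a missing value, so such rows may need to be dropped afterwards.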
