
Pandas, get all possible value combinations of length k grouped by feature

I have a pandas DataFrame that looks something like this:

Feature A    Feature B    Feature C
A1           B1           C1
A2           B2           C2

Given k as input, I want every combination of values drawn from k different features, keyed by feature. For example, for k = 2 I want:

[{A:A1, B:B1},
 {A:A1, B:B2},
 {A:A1, C:C1},
 {A:A1, C:C2},
 {A:A2, B:B1},
 {A:A2, B:B2},
 {A:A2, C:C1},
 {A:A2, C:C2},
 {B:B1, C:C1},
 {B:B1, C:C2},
 {B:B2, C:C1},
 {B:B2, C:C2}]

How can I achieve that?


Answer

This is probably not very efficient, but it works at small scale.

First, determine the unique combinations of k columns.

from itertools import combinations
import pandas as pd

k = 2
# every unordered choice of k column labels from df
cols = list(combinations(df.columns, k))
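
To make this step concrete, here is a minimal sketch of what combinations yields for three column labels. The list columns is a hypothetical stand-in for df.columns, with labels taken from the table in the question.

```python
from itertools import combinations

# Hypothetical column labels standing in for df.columns
columns = ["Feature A", "Feature B", "Feature C"]

# Each element is a tuple of k labels; the order within a tuple follows the input order
pairs = list(combinations(columns, 2))
print(pairs)
# [('Feature A', 'Feature B'), ('Feature A', 'Feature C'), ('Feature B', 'Feature C')]
```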

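Then use pd.MultiIndex.from_product to get the Cartesian product of the values in each group of k columns. Each combination comes back as a tuple of values, ordered like the columns in c.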

result = []
for c in cols:
    # Cartesian product of the values in the selected columns, as a list of tuples
    result += pd.MultiIndex.from_product([df[x] for x in c]).values.tolist()
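
The loop above returns plain tuples, while the question asks for dicts keyed by feature. One possible way to get that shape is sketched below, swapping in itertools.product for MultiIndex.from_product. The sample frame, the short column labels "A", "B", "C" and the helper name value_combinations are assumptions made for illustration, and the unique() call is an addition that drops repeated values within a column.

```python
from itertools import combinations, product

import pandas as pd

# Sample frame mirroring the question; the short column labels are assumed
df = pd.DataFrame({
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
    "C": ["C1", "C2"],
})


def value_combinations(df, k):
    """Return one dict per combination of values drawn from k distinct columns."""
    result = []
    for cols in combinations(df.columns, k):
        # unique() drops repeated values while keeping first-seen order
        pools = [df[col].unique() for col in cols]
        for values in product(*pools):
            # pair each value with the column it came from
            result.append(dict(zip(cols, values)))
    return result


combos = value_combinations(df, 2)
print(len(combos))  # 12
print(combos[0])    # {'A': 'A1', 'B': 'B1'}
```

The dicts come out grouped by column pair, so the order differs from the listing in the question, but the set of combinations is the same. The number of results grows quickly with k and with the number of distinct values per column, so this is still only suitable for small inputs.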
User contributions licensed under: CC BY-SA