I have a pandas DataFrame that looks something like this:
Feature A | Feature B | Feature C |
---|---|---|
A1 | B1 | C1 |
A2 | B2 | C2 |
Given k as input, I want every combination of values drawn from k different features. For example, for k = 2 I want:

    [{A:A1, B:B1},
    {A:A1, B:B2},
    {A:A1, C:C1},
    {A:A1, C:C2},
    {A:A2, B:B1},
    {A:A2, B:B2},
    {A:A2, C:C1},
    {A:A2, C:C2},
    {B:B1, C:C1},
    {B:B1, C:C2},
    {B:B2, C:C1},
    {B:B2, C:C2}]

How can I achieve that?
Answer
This is probably not very efficient, but it works for small data.
First, determine the unique combinations of k columns.

    from itertools import combinations
    k = 2
    cols = list(combinations(df.columns, k))

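To make this step concrete, here is a small sketch that runs it on the example from the question. The DataFrame construction is an assumption reconstructed from the table in the question.

```python
from itertools import combinations

import pandas as pd

# Assumed reconstruction of the example table from the question.
df = pd.DataFrame({
    "Feature A": ["A1", "A2"],
    "Feature B": ["B1", "B2"],
    "Feature C": ["C1", "C2"],
})

k = 2
# Every unordered selection of k column names, in column order.
cols = list(combinations(df.columns, k))
print(cols)
# [('Feature A', 'Feature B'), ('Feature A', 'Feature C'), ('Feature B', 'Feature C')]
```

With three columns and k = 2 there are three column pairs, so the number of pairs grows quickly as the number of features increases.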
Then use MultiIndex.from_product to get the Cartesian product of the values in each group of k columns.

    result = []
    for c in cols:
        result += pd.MultiIndex.from_product([df[x] for x in c]).values.tolist()

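Note that this loop collects bare value tuples, whereas the question asks for records keyed by feature. One possible way to get that shape is sketched below, using itertools.product in place of MultiIndex.from_product. The helper name value_combinations and the example frame are assumptions for illustration, the keys are the full column names rather than the abbreviated A, B, C, and the records come out grouped by column pair, which differs from the ordering shown in the question.

```python
from itertools import combinations, product

import pandas as pd


def value_combinations(df, k):
    """Hypothetical helper: one dict per combination of values from k distinct columns."""
    records = []
    for cols in combinations(df.columns, k):
        # unique() drops repeated values so each combination appears only once.
        pools = [df[c].unique() for c in cols]
        for values in product(*pools):
            records.append(dict(zip(cols, values)))
    return records


# Assumed reconstruction of the example table from the question.
df = pd.DataFrame({
    "Feature A": ["A1", "A2"],
    "Feature B": ["B1", "B2"],
    "Feature C": ["C1", "C2"],
})

result = value_combinations(df, 2)
print(len(result))  # 12
print(result[0])    # {'Feature A': 'A1', 'Feature B': 'B1'}
```

Like the answer above, this materializes every combination in memory, so it is only practical for small frames.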