I have a dataframe that looks like this:
    ______________________
    | col1 | col2 | col3 |
    ----------------------
    | a    | b    | c    |
    | d    | b    | c    |
    | e    | f    | g    |
    | h    | f    | j    |
    ----------------------
I want to get a dictionary structure that looks as follows
    {
      b: { col1: [a, d], col2: b, col3: c },
      f: { col1: [e, h], col2: f, col3: [g, j] }
    }
I have seen this answer, but it seems like overkill for what I want to do, as it converts every value inside the nested dictionary into a list. I would only like to convert col1 into a list when creating the dictionary. Is this possible?
Answer
Use a custom lambda function in agg that returns the unique values as a list if a group has more than one of them, and a scalar otherwise:
    d = (df.set_index('col2', drop=False)
           .groupby(level=0)
           .agg(lambda x: list(set(x)) if len(set(x)) > 1 else list(set(x))[0])
           .to_dict('index'))
    print(d)

    {'b': {'col1': ['d', 'a'], 'col2': 'b', 'col3': 'c'},
     'f': {'col1': ['h', 'e'], 'col2': 'f', 'col3': ['j', 'g']}}
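The snippet above assumes `df` already exists. Here is a minimal sketch for reproducing it, with the frame rebuilt from the sample in the question and assuming pandas is installed. Since a set has no guaranteed order, the order of the items inside each list may differ from the output shown above.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "col1": ["a", "d", "e", "h"],
    "col2": ["b", "b", "f", "f"],
    "col3": ["c", "c", "g", "j"],
})

d = (df.set_index("col2", drop=False)   # col2 becomes the group key but stays a column
       .groupby(level=0)
       # more than one distinct value -> list, exactly one -> scalar
       .agg(lambda x: list(set(x)) if len(set(x)) > 1 else list(set(x))[0])
       .to_dict("index"))
print(d)
```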
If order is important, use dict.fromkeys to remove duplicates:
    d = (df.set_index('col2', drop=False)
           .groupby(level=0)
           .agg(lambda x: list(dict.fromkeys(x)) if len(set(x)) > 1 else list(set(x))[0])
           .to_dict('index'))
    print(d)

    {'b': {'col1': ['a', 'd'], 'col2': 'b', 'col3': 'c'},
     'f': {'col1': ['e', 'h'], 'col2': 'f', 'col3': ['g', 'j']}}
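To see what the order-preserving version does to each group, the same logic can be pulled out into a named function and tried on plain lists. This is only a sketch: the name `collapse` is made up for illustration (it is not a pandas function), and it assumes the values are hashable.

```python
def collapse(values):
    """Return a scalar for one distinct value, else a list in first-seen order.

    `collapse` is a hypothetical helper name used only for this sketch.
    """
    # dict keys are unique and keep insertion order, so this removes
    # duplicates without shuffling the remaining items.
    unique = list(dict.fromkeys(values))
    return unique if len(unique) > 1 else unique[0]


print(collapse(["a", "d"]))       # ['a', 'd']
print(collapse(["c", "c"]))       # c
print(collapse(["g", "j", "g"]))  # ['g', 'j']
```

A function like this could be passed as `.agg(collapse)` in place of the lambda; it also builds the de-duplicated list once per group instead of constructing a set twice.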