I would like to transform a dataframe with the following layout:
| image   | finding1 | finding2 | nofinding |
| ------- | -------- | -------- | --------- |
| 039.png | true     | false    | false     |
| 006.png | true     | true     | false     |
| 012.png | false    | false    | true      |
into a dictionary with the following structure:
{
"039.png" : [
"finding1"
],
"006.png" : [
"finding1",
"finding2"
],
"012.png" : [
"nofinding"
]}
Answer
IIUC, you could replace False with NA (assuming boolean False here; for strings use 'false'), then stack to drop the NA values, and use groupby.agg to aggregate the remaining column names into lists before converting to a dictionary:
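import pandas as pd

dic = (df
       .set_index('image')            # images become the row labels
       .replace({False: pd.NA})       # mark unset flags as missing
       .stack()                       # long format; NA entries are dropped
       .reset_index(1)                # column names move into 'level_1'
       .groupby(level='image', sort=False)['level_1'].agg(list)
       .to_dict()
      )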
output:
{'039.png': ['finding1'],
'006.png': ['finding1', 'finding2'],
'012.png': ['nofinding']}
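For comparison, here is a minimal self-contained sketch of an alternative that skips the replace/stack step and builds the dictionary with a plain comprehension. It assumes the flag columns hold real booleans, as in the question; the sample frame and the name `labels_by_image` are made up for illustration. One possible advantage is that it does not rely on stack dropping missing values, which may behave differently across pandas versions.

```python
import pandas as pd

# Sample data mirroring the question's table (assumes boolean flags, not strings)
df = pd.DataFrame({
    "image": ["039.png", "006.png", "012.png"],
    "finding1": [True, True, False],
    "finding2": [False, True, False],
    "nofinding": [False, False, True],
})

# Use the image name as the row label so only the flag columns remain
flags = df.set_index("image")

# For each row, keep the names of the columns whose flag is True
labels_by_image = {
    image: [col for col, flag in row.items() if flag]
    for image, row in flags.iterrows()
}

print(labels_by_image)
```

If the flags are stored as the strings 'true'/'false' instead, they would need to be converted to booleans first, for example by comparing the flag columns against 'true'.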