Suppose I have a dataset like this:
Subject_ID  DRUG                     LOS
2456        Syringe (Neonatal) *NS*  1.56
2456        Heparin                  1.56
12345       Syringe (Neonatal) *NS*  0.78
12345       ampicillin               0.78
12345       gentamicin               0.78
As output, I want the DRUG names that share the same Subject_ID to be concatenated into a single row:
Subject_ID  DRUG                                             LOS
2456        Syringe (Neonatal) *NS*, Heparin                 1.56
12345       Syringe (Neonatal) *NS*, ampicillin, gentamicin  0.78
How can I do that in Python pandas?
Answer
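Group the DataFrame by Subject_ID, then call agg with ', '.join as the aggregation for the DRUG column and 'first' as the aggregation for the LOS column. In the sample data LOS is the same on every row of a given subject, so taking the first value loses nothing.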
>>> df.groupby(['Subject_ID']).agg({'DRUG':', '.join, 'LOS':'first'})
                                                       DRUG   LOS
Subject_ID
2456                       Syringe (Neonatal) *NS*, Heparin  1.56
12345       Syringe (Neonatal) *NS*, ampicillin, gentamicin  0.78
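For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same groupby/agg call. The variable names df and result are only illustrative, and passing as_index=False and sort=False is an optional choice (not part of the answer above) that keeps Subject_ID as a regular column and preserves the order in which subjects first appear.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Subject_ID": [2456, 2456, 12345, 12345, 12345],
    "DRUG": [
        "Syringe (Neonatal) *NS*",
        "Heparin",
        "Syringe (Neonatal) *NS*",
        "ampicillin",
        "gentamicin",
    ],
    "LOS": [1.56, 1.56, 0.78, 0.78, 0.78],
})

# One row per Subject_ID: join the drug names, keep the first LOS value.
# as_index=False keeps Subject_ID as a column instead of the index;
# sort=False keeps groups in order of first appearance.
result = (
    df.groupby("Subject_ID", as_index=False, sort=False)
      .agg({"DRUG": ", ".join, "LOS": "first"})
)

print(result)
```

Note that ', '.join only works when every value in DRUG is a string. If the column can contain missing values, one option is to drop or fill them first, for example with df.dropna(subset=["DRUG"]).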