I am trying to aggregate per ptid on diag_date, calculating the max date, the min date and the visit count for each patient:
output = df.groupby("ptid")["diag_date"].agg(Max_Pt_date="max", Min_Pt_date="min", Num_Visits="count")
output["TreatDuration"] = (output["Max_Pt_date"] - output["Min_Pt_date"]).dt.days
However, even though the code above follows the rules for agg as far as I can tell, it does not work, and I get the following error:
<class 'TypeError'>: aggregate() missing 1 required positional argument: 'func_or_funcs'
Any ideas are greatly appreciated!
Answer
To answer my own question: after getting valuable input from the responders above, I changed the code environment in Dataiku to one with a newer Python and pandas version, and the code worked fine.
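For anyone who cannot upgrade: as far as I know, the keyword form agg(Max_Pt_date="max", ...) (named aggregation) was only added in pandas 0.25, and the missing 'func_or_funcs' error is what older releases raise when agg receives only keyword arguments. A minimal sketch of an alternative that should run on both older and newer versions is to pass a list of functions and rename the resulting columns afterwards. The sample data below is made up purely for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "ptid": [1, 1, 1, 2, 2],
    "diag_date": pd.to_datetime(
        ["2020-01-01", "2020-01-15", "2020-03-01", "2021-05-10", "2021-05-20"]
    ),
})

# Check which pandas version the environment actually provides.
print(pd.__version__)

# Pass the aggregations as a list, then rename the columns.
# This avoids the keyword-style named aggregation that older pandas rejects.
output = (
    df.groupby("ptid")["diag_date"]
      .agg(["max", "min", "count"])
      .rename(columns={
          "max": "Max_Pt_date",
          "min": "Min_Pt_date",
          "count": "Num_Visits",
      })
)

# Days between the first and last diagnosis date per patient.
output["TreatDuration"] = (output["Max_Pt_date"] - output["Min_Pt_date"]).dt.days
print(output)
```

On a recent pandas version the original keyword form from the question should give the same columns, so this rename approach is only needed when the environment is pinned to an old release.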