Make a customized filter on a grouped dataframe with multiple conditions

Question

Please find below my input and desired output:

INPUT

[table not preserved]

OUTPUT (desired)

[table not preserved]

The goal is first to have one line per Id in the output. The output is built from this simple statement:

[statement not preserved]

This is what I've tried so far:

[code not preserved]

Do you have any suggestions, please? Any help would be much appreciated!

Accepted Answer

The answer has been completely edited. First keep only the online rows, sort by Date, and drop duplicates so that the first (earliest) row per Id remains:

    import numpy as np   # used by the original solution further down
    import pandas as pd

    df1 = df[df['Status'].eq('online')].sort_values('Date').drop_duplicates('Id')
    print(df1)
          Id  Status       Date
    5  Id004  online 2021-10-21
    7  Id005  online 2022-02-01
    1  Id001  online 2022-06-01

Then take the Ids that were not matched, sort by Date in descending order, and again keep the first (latest) row per Id:

    df2 = df[~df['Id'].isin(df1['Id'])].sort_values('Date', ascending=False).drop_duplicates('Id')
    print(df2)
          Id   Status       Date
    2  Id002      off 2021-12-05
    4  Id003  running 2021-03-02

Finally, join both DataFrames:

    df = pd.concat([df1, df2]).sort_values('Id', ignore_index=True)
    print(df)
          Id   Status       Date
    0  Id001   online 2022-06-01
    1  Id002      off 2021-12-05
    2  Id003  running 2021-03-02
    3  Id004   online 2021-10-21
    4  Id005   online 2022-02-01

The original solution needs to be changed as follows:

    df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

    # Per (Id, is-online) group, record the index labels of the earliest and latest Date
    df1 = (df.assign(s = df['Status'].eq('online')).groupby(['Id','s'])
             .agg(Date_min=('Date','idxmin'), Date_max=('Date','idxmax')))

    # Keep the last group per Id: the online group (s=True) when there is one
    df1 = df1[~df1.index.get_level_values(0).duplicated(keep='last')].reset_index()
    print(df1)
          Id      s  Date_min  Date_max
    0  Id001   True         1         1
    1  Id002  False         3         2
    2  Id003  False         4         4
    3  Id004   True         5         5
    4  Id005   True         7         8

    # Online Ids take their earliest row, all other Ids take their latest row
    df = df.loc[np.where(df1['s'], df1['Date_min'], df1['Date_max'])]
    print(df)
          Id   Status       Date
    1  Id001   online 2022-06-01
    2  Id002      off 2021-12-05
    4  Id003  running 2021-03-02
    5  Id004   online 2021-10-21
    7  Id005   online 2022-02-01
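For anyone who wants to run the first approach end to end, here is a minimal self-contained sketch. The input table from the question is not available here, so the rows below are hypothetical, chosen only to be consistent with the output printed in the answer. The function name `one_row_per_id` is illustrative and not part of pandas. The rule it encodes is the one the answer appears to implement: an Id with at least one online row keeps its earliest online row, and any other Id keeps its latest row.

```python
import pandas as pd

# Hypothetical input: these rows are an assumption, picked to be consistent
# with the answer's printed output (the question's real table is not shown).
df = pd.DataFrame({
    "Id": ["Id001", "Id001", "Id002", "Id002", "Id003",
           "Id004", "Id004", "Id005", "Id005"],
    "Status": ["running", "online", "off", "running", "running",
               "online", "off", "online", "online"],
    "Date": pd.to_datetime([
        "2022-01-10", "2022-06-01", "2021-12-05", "2021-06-01", "2021-03-02",
        "2021-10-21", "2022-01-15", "2022-02-01", "2022-03-01",
    ]),
})


def one_row_per_id(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per Id (illustrative helper, not a pandas API)."""
    # Ids with an online row: keep the earliest online row.
    online = (frame[frame["Status"].eq("online")]
              .sort_values("Date")
              .drop_duplicates("Id"))
    # Remaining Ids: keep the latest row, whatever its status.
    rest = (frame[~frame["Id"].isin(online["Id"])]
            .sort_values("Date", ascending=False)
            .drop_duplicates("Id"))
    # Combine both parts and order by Id with a fresh 0..n-1 index.
    return pd.concat([online, rest]).sort_values("Id", ignore_index=True)


result = one_row_per_id(df)
print(result)
```

Wrapping the steps in a function leaves the original `df` untouched, which makes it easy to compare the result with the `idxmin`/`idxmax` variant on the same data.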
