Delete rows based on calculated number

Question

I have a DataFrame that defines a list of calls, (call)[List]. Each call has an answer status, (call)[Status]. I created a column to serve as a unique field, (call)[Key].

The call DataFrame appears as follows:

A second DataFrame has a calculated column, (DimsDrops)[# drop]. The join with the (call) table is done on (call)[Key].

The DimsDrops table appears as follows:

I

Accepted Answer

EDIT: First, repeat each row of DimsDrops by its number of rows to delete, using Index.repeat and DataFrame.loc, and create a counter column with GroupBy.cumcount:

    df1 = DimsDrops.loc[DimsDrops.index.repeat(DimsDrops['# drop'])]
    df1['g'] = df1.groupby('Key').cumcount(ascending=False)
    print(df1)
                    Key  # drop  g
    0  List1 2022-02-09       2  1
    0  List1 2022-02-09       2  0
    1  List2 2022-02-09       1  0
    2  List3 2022-02-09       1  0
    4  List2 2022-02-10       1  0
    5  List3 2022-02-10       1  0

Then select only the rows with status A and create the same counter in call:

    call['g'] = call[call['Status'].eq('A')].groupby('Key').cumcount(ascending=False)
    print(call)
         List Status               Key    g
    0   List1      A  List1 2022-02-09  2.0
    1   List1      A  List1 2022-02-09  1.0
    2   List1     DO  List1 2022-02-09  NaN
    3   List1      A  List1 2022-02-09  0.0
    4   List2      A  List2 2022-02-09  0.0
    5   List3      A  List3 2022-02-09  0.0
    6   List3     DO  List3 2022-02-09  NaN
    7   List1      C  List1 2022-02-10  NaN
    8   List2      A  List2 2022-02-10  1.0
    9   List2      A  List2 2022-02-10  0.0
    10  List3      A  List3 2022-02-10  0.0

Join both DataFrames with a left join and the indicator=True parameter:

    df = call.merge(df1, how='left', indicator=True)
    print(df)
         List Status               Key    g  # drop     _merge
    0   List1      A  List1 2022-02-09  2.0     NaN  left_only
    1   List1      A  List1 2022-02-09  1.0     2.0       both
    2   List1     DO  List1 2022-02-09  NaN     NaN  left_only
    3   List1      A  List1 2022-02-09  0.0     2.0       both
    4   List2      A  List2 2022-02-09  0.0     1.0       both
    5   List3      A  List3 2022-02-09  0.0     1.0       both
    6   List3     DO  List3 2022-02-09  NaN     NaN  left_only
    7   List1      C  List1 2022-02-10  NaN     NaN  left_only
    8   List2      A  List2 2022-02-10  1.0     NaN  left_only
    9   List2      A  List2 2022-02-10  0.0     1.0       both
    10  List3      A  List3 2022-02-10  0.0     1.0       both

Finally, keep only the rows that are not marked both and remove the helper g column:

    df = call[df['_merge'].ne('both')].drop('g', axis=1)
    print(df)
        List Status               Key
    0  List1      A  List1 2022-02-09
    2  List1     DO  List1 2022-02-09
    6  List3     DO  List3 2022-02-09
    7  List1      C  List1 2022-02-10
    8  List2      A  List2 2022-02-10
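
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample data is rebuilt from the printed output in this answer. The DimsDrops row for "List1 2022-02-10" with 0 drops is an assumption, since index 3 never appears in the repeated output. The function name drop_answered_calls and its status parameter are hypothetical, introduced only for illustration. It also assumes that Key is unique in DimsDrops, so the left join keeps one row per call.

```python
import pandas as pd

# Sample data rebuilt from the printed output in the answer.
call = pd.DataFrame({
    "List": ["List1", "List1", "List1", "List1", "List2", "List3",
             "List3", "List1", "List2", "List2", "List3"],
    "Status": ["A", "A", "DO", "A", "A", "A", "DO", "C", "A", "A", "A"],
    "Key": ["List1 2022-02-09"] * 4 + ["List2 2022-02-09"]
           + ["List3 2022-02-09"] * 2 + ["List1 2022-02-10"]
           + ["List2 2022-02-10"] * 2 + ["List3 2022-02-10"],
})

DimsDrops = pd.DataFrame({
    "Key": ["List1 2022-02-09", "List2 2022-02-09", "List3 2022-02-09",
            "List1 2022-02-10",  # assumed row: 0 drops, not shown in the answer
            "List2 2022-02-10", "List3 2022-02-10"],
    "# drop": [2, 1, 1, 0, 1, 1],
})


def drop_answered_calls(call, dims_drops, status="A"):
    """Remove the last `# drop` rows with the given status for each Key.

    Hypothetical helper; assumes Key is unique in dims_drops.
    """
    # One row per call to remove, counted down to 0 within each Key.
    repeats = dims_drops["# drop"].to_numpy()
    to_drop = dims_drops.loc[dims_drops.index.repeat(repeats), ["Key"]].copy()
    # Cast to float so the dtype matches the NaN-padded counter built below.
    to_drop["g"] = (
        to_drop.groupby("Key").cumcount(ascending=False).astype("float64")
    )

    # Same counter on a copy of call, only for rows with the wanted status;
    # every other row gets NaN and so can never match.
    work = call.copy()
    work["g"] = (
        work[work["Status"].eq(status)].groupby("Key").cumcount(ascending=False)
    )

    # Left join keeps the row order of call; "both" marks rows to remove.
    merged = work.merge(to_drop, on=["Key", "g"], how="left", indicator=True)
    keep = merged["_merge"].ne("both").to_numpy()
    return call[keep]


result = drop_answered_calls(call, DimsDrops)
print(result)
```

Because the counter runs downward with ascending=False, the rows removed are the last matching calls of each Key, and the earliest ones are kept. Working on a copy leaves the original call DataFrame without the helper g column.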

Answer