What is the best way to filter rows of one dataframe based on column entries of another dataframe?

Question

I have two dataframes in Python. One, called DayList, has these columns: OrderNr, Powder, Variant, Quantity, DueDate. The other, called Planning, has these columns: Order, Start, End, Day, Powder, Variant, Task. Both dataframes will have multiple rows with specific combinations, and the entries in the Powder and Variant columns are integers. I want to filter the dataframe DayList into

Accepted Answer

You could write a function to determine the category for a given row of the dataframe, and then use df.apply(). To avoid having to pick out the right columns within the function, you could apply it only to the reduced dataframe, consisting of just the Powder and Variant columns:

    import pandas as pd

    # example dataframes with just the relevant columns, but
    # the code below also works for dataframes containing additional columns
    DayList = pd.DataFrame({'Powder': [1, 2, 3, 4, 5, 6],
                            'Variant': [1, 2, 1, 2, 1, 2]})
    Planning = pd.DataFrame({'Powder': [3, 4, 5, 6],
                             'Variant': [1, 2, 2, 1]})

    def determine_category(row):
        powder, variant = row.values
        if [powder, variant] in Planning[['Powder', 'Variant']].values.tolist():
            return 1
        if powder in Planning['Powder'].values:
            return 2
        return 3

    DayList['Category'] = DayList[['Powder', 'Variant']].apply(
                          determine_category, axis=1)
    DayList

       Powder  Variant  Category
    0       1        1         3
    1       2        2         3
    2       3        1         1
    3       4        2         1
    4       5        1         2
    5       6        2         2
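The row-wise function in the answer rebuilds the list of Planning pairs for every row of DayList, which may get slow on larger frames. As a rough sketch of a vectorized alternative (not the answer's own method), the same three categories could be computed with a left merge plus numpy.select. It assumes the categories mean what the function returns: 1 when the (Powder, Variant) pair appears in Planning, 2 when only the Powder appears, and 3 otherwise. The helper names pairs, merged, pair_match and powder_match are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer.
DayList = pd.DataFrame({'Powder': [1, 2, 3, 4, 5, 6],
                        'Variant': [1, 2, 1, 2, 1, 2]})
Planning = pd.DataFrame({'Powder': [3, 4, 5, 6],
                         'Variant': [1, 2, 2, 1]})

# Distinct (Powder, Variant) pairs from Planning, so the left merge
# cannot duplicate rows of DayList.
pairs = Planning[['Powder', 'Variant']].drop_duplicates()

# indicator=True adds a '_merge' column: 'both' where the pair exists
# in Planning, 'left_only' where it does not.
merged = DayList.merge(pairs, on=['Powder', 'Variant'],
                       how='left', indicator=True)

# Convert to plain arrays so the result does not depend on index alignment.
pair_match = (merged['_merge'] == 'both').to_numpy()
powder_match = DayList['Powder'].isin(Planning['Powder']).to_numpy()

# Conditions are checked in order, so a full pair match wins over a
# Powder-only match; everything else falls back to category 3.
DayList['Category'] = np.select([pair_match, powder_match], [1, 2], default=3)
```

Once the Category column exists, a subset can be picked with ordinary boolean indexing, for example DayList[DayList['Category'] == 1].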


Answer