np.select pandas dataframe based on column of prefix and values

Question

I have two dataframes.

main_df (about 800 rows):

    description  category
    ABCD         ONE
    XYZ          THREE
    ABC
    QWE

keyword_df (about 50 rows):

    keyword  category
    AB       FIVE

What I'm trying to achieve:

main_df

    description  category
    ABCD         ONE
    XYZ          THREE
    ABC          FIVE
    QWE          0

What I tried:

    conditions = [(main_df['description'].str.startswith('AB')) & (main_df['category'].isnull())]
    values = keyword_df['category'].tolist()
    main_df['category'] = np.select(conditions, values)

I was able to

Accepted Answer

Since you only have 50 rows in the keyword frame, you could just iterate over those and update the main frame accordingly:

    import numpy as np
    import pandas as pd

    main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                            'category': ['ONE', 'THREE', np.nan, np.nan]})
    keyword_df = pd.DataFrame({'keyword': ['AB'],
                               'category': ['FIVE']})

    for key in keyword_df.itertuples(index=False):
        mask = (main_df['description'].str.startswith(key[0])
                & main_df['category'].isnull())
        main_df.loc[mask, 'category'] = key[1]

    main_df

      description category
    0        ABCD      ONE
    1         XYZ    THREE
    2         ABC     FIVE
    3         QWE      NaN
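If you would rather stay with np.select, as in the question, one possible sketch is to build one condition per keyword row and pass the existing category column as the default, so rows that already have a category (or match no keyword) are left as they are. This is an alternative to the loop above, not a replacement for it; the `missing` helper name and the `na=False` argument are my own additions, and it assumes the same sample frames as in the answer.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
main_df = pd.DataFrame({'description': ['ABCD', 'XYZ', 'ABC', 'QWE'],
                        'category': ['ONE', 'THREE', np.nan, np.nan]})
keyword_df = pd.DataFrame({'keyword': ['AB'],
                           'category': ['FIVE']})

# Only rows without a category are candidates for an update.
missing = main_df['category'].isnull()

# One boolean condition per keyword, in the same order as the choices.
# na=False treats a missing description as "no match".
conditions = [
    (main_df['description'].str.startswith(kw, na=False) & missing).to_numpy()
    for kw in keyword_df['keyword']
]
choices = keyword_df['category'].tolist()

# Where no condition holds, fall back to the current value of the column.
main_df['category'] = np.select(conditions, choices,
                                default=main_df['category'].to_numpy())

print(main_df)
```

np.select takes the first condition that is true for each row, so if several keywords match the same description, the one listed first in keyword_df wins. That is the same outcome as the loop, where the first matching keyword fills the value and later ones no longer see it as null. Rows that match no keyword stay NaN, as in the output above.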

Answer