Python – Filtering dataframe based on 3 columns potentially containing a sought after value

Question

I'm trying to take a query of recent customer transactions and match potential primary phone, cellphone and work phone matches against a particular list of customers I have. Essentially, I am taking one dataframe column (the list of customers I am trying to see if they had transactions recently) against the overall universe of all recent transactions (dataframe being transaction_data)

Accepted Answer

The thing here is that applying the .isin() method of a Series to another Series will return a boolean Series. In your example transaction_data['phone'] is a Series, and also df['phone']. The return of this method will be a boolean Series containing the value True in a row which the value in transaction_data['phone'] appears in df['phone'] and False otherwise. This is similar for all applications of isin() method in your example.And, good news! This boolean Series is exactly what is needed for slicing the dataframe. Therefor your code just need a small tweak. Just delete the .any() at the end of the line.transaction_data[(transaction_data['phone'].isin(df['phone'])) | (transaction_data['cell'].isin(df['phone'])) | (transaction_data['workphone'].isin(df['phone']))]

Advertisement

Answer