Looping through a filtered dataframe to see if a value is in a list column

Apologies for the vague title; I'm not sure how to word it better. I have a DataFrame like this:

Which is created with this:

Each row is a customer record: a customer can only ever save one product at a time (which is why savedProduct holds a single product code) but can purchase multiple products, which is why purchasedProduct contains a list. What I want to do is:

  • By customerID, get the unique product IDs in savedProduct
  • For each unique product ID in that column, check whether it appears in purchasedProduct
  • If it does, pull the date from the row where it appears in purchasedProduct, so I can calculate the number of days between saving and purchasing

For example, the product saved in row 1 is purchased in row 3, so ideally the first row's date (2021-01-01) and the third row's date (2021-01-03) would end up in the same row, letting us calculate the difference between them.

I thought a nested loop would do the job, but I can't get it to work (and there must be a more efficient way):

But the output is like this:

Is there a better way to do this? Also, why is customerID producing NaNs? When I print the values inside the loop, they look fine.

Thanks for any help!

EDIT – may have just figured it out using lists instead but if someone has a more efficient way, would still be appreciated!

which has the following output:

Answer

Try:

This will first explode the rows with lists in purchasedProduct, so it creates a separate row for each item in the list. Then it adds a purchase date column, so you can determine at row level if and when the product was bought.
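
As a rough illustration of the explode-then-match idea, here is a minimal sketch. It is not the original answer's code: the sample values, the use of a left merge, and the names `purchases`, `saves`, `result`, and `days_to_purchase` are all assumptions made for this example. Only the column names customerID, savedProduct, purchasedProduct, date, and purchase_date come from the post.

```python
import pandas as pd

# Hypothetical sample data; the values in the original post are not known.
df = pd.DataFrame({
    "customerID": [1, 1, 1, 2, 2],
    "savedProduct": ["A", None, None, "B", None],
    "purchasedProduct": [[], ["C"], ["A", "D"], [], ["E"]],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-04"]
    ),
})

# One row per purchased item; empty lists become NaN and are dropped.
purchases = (
    df.explode("purchasedProduct")
      .dropna(subset=["purchasedProduct"])
      [["customerID", "purchasedProduct", "date"]]
      .rename(columns={"purchasedProduct": "savedProduct", "date": "purchase_date"})
)

# Rows where the customer saved a product.
saves = df.dropna(subset=["savedProduct"])[["customerID", "savedProduct", "date"]]

# Attach the purchase date (if any) of the same product by the same customer.
result = saves.merge(purchases, on=["customerID", "savedProduct"], how="left")

# Days between saving and purchasing; NaN when the product was never bought.
result["days_to_purchase"] = (result["purchase_date"] - result["date"]).dt.days

print(result)
```

With this made-up data, customer 1's saved product gets a purchase date two days after the save date, while customer 2's saved product keeps a missing purchase date because it was never bought. If a customer buys the same saved product more than once, the merge yields one row per purchase, so you may want to keep only the earliest.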

Of course you can filter the df to only have rows with saved products:

df.loc[df.saved==1]

Or with only certain columns:

df.loc[df.saved==1, ['customerID', 'savedProduct', 'date', 'purchase_date']]
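
To show what that selection does, here is a small self-contained sketch. The frame below is made up for illustration and only assumes the `saved` flag and `purchase_date` column that the answer refers to. The `days` column is an extra added here to cover the date difference asked about in the question.

```python
import pandas as pd

# Hypothetical frame shaped like the answer's result; the values are made up.
df = pd.DataFrame({
    "customerID": [1, 1, 2],
    "savedProduct": ["A", None, "B"],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01"]),
    "purchase_date": pd.to_datetime(["2021-01-03", None, None]),
    "saved": [1, 0, 1],
})

# Boolean mask picks the rows, the list picks the columns.
saved_rows = df.loc[
    df["saved"] == 1,
    ["customerID", "savedProduct", "date", "purchase_date"],
]

# Days between saving and purchasing; NaN where there is no purchase date.
saved_rows = saved_rows.assign(
    days=(saved_rows["purchase_date"] - saved_rows["date"]).dt.days
)

print(saved_rows)
```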

User contributions licensed under: CC BY-SA