How can I create a column in one DataFrame containing the count of matches in another DataFrame’s column?

I have two pandas DataFrames. The first, df1, contains a column of file paths and a column of lists naming the users who have read access to each path. The second, df2, contains a list of all possible users. I’ve created an example below:
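
For illustration, here is a minimal sketch of two frames with this shape. The paths and user names are hypothetical stand-ins, not the original example data; only the column names `path`, `read`, and `user` come from the description.

```python
import pandas as pd

# Hypothetical data: each row of df1 is a file path plus the list of
# users who can read it.
df1 = pd.DataFrame({
    "path": ["/data/a.txt", "/data/b.txt", "/data/c.txt"],
    "read": [["alice", "bob"], ["alice"], ["bob", "carol", "alice"]],
})

# Hypothetical data: every possible user, including one with no access.
df2 = pd.DataFrame({"user": ["alice", "bob", "carol", "dave"]})

print(df1)
print(df2)
```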

The end goal is to create a new column df2['read_count'], which should take each user string from df2['user'] and find the total number of matches in the column df1['read'].

The expected output would be exactly that – a count of matches of each user string in the column of lists in df1['read']. Here is what I am expecting based on the example:

I tried putting something together using another question and list comprehension, but no luck. Here is what I currently have:

What is wrong with the code I currently have? I’ve stepped through the loops and everything looks right, but the code doesn’t detect the matches I want.

Answer
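
On what might be going wrong: this is only a guess, since it depends on the exact code, but two things often bite here. `x in series` tests the Series index labels rather than its values, and comparing a column of lists to a string with `==` never matches an element inside a list. If you want to stay with a list comprehension, a sketch that avoids both could flatten the lists once and look each user up in a `Counter`. The data below is hypothetical.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Hypothetical data standing in for the question's frames.
df1 = pd.DataFrame({
    "path": ["/data/a.txt", "/data/b.txt", "/data/c.txt"],
    "read": [["alice", "bob"], ["alice"], ["bob", "carol", "alice"]],
})
df2 = pd.DataFrame({"user": ["alice", "bob", "carol", "dave"]})

# Flatten every list in df1["read"] and count each user name once.
counts = Counter(chain.from_iterable(df1["read"]))

# A Counter returns 0 for names it has never seen, so users with no
# read access get a count of 0 rather than raising KeyError.
df2["read_count"] = [counts[user] for user in df2["user"]]
```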

You can use:
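
One way to do this, as a sketch and not necessarily the exact code originally posted, is to explode the lists into one row per user, count the values, and map the counts onto `df2['user']`. The data is hypothetical, and the `fillna(0)` step is an added assumption that users with no matches should show 0.

```python
import pandas as pd

# Hypothetical data standing in for the question's frames.
df1 = pd.DataFrame({
    "path": ["/data/a.txt", "/data/b.txt", "/data/c.txt"],
    "read": [["alice", "bob"], ["alice"], ["bob", "carol", "alice"]],
})
df2 = pd.DataFrame({"user": ["alice", "bob", "carol", "dave"]})

# explode() turns each list element into its own row;
# value_counts() then counts how often each user name appears.
counts = df1["read"].explode().value_counts()

# map() looks each user up in the counts Series by index label.
# Users that never appear come back as NaN, so fill those with 0.
df2["read_count"] = df2["user"].map(counts).fillna(0).astype(int)
```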

Or, if you really need to use a different kind of aggregation that depends on “path”:
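
A sketch of that variant: explode the whole frame so each row is a (path, user) pair, group by user, aggregate the `path` column, and merge the result into `df2`. Using `nunique` as the aggregation is just an illustrative choice, and the data is hypothetical.

```python
import pandas as pd

# Hypothetical data standing in for the question's frames.
df1 = pd.DataFrame({
    "path": ["/data/a.txt", "/data/b.txt", "/data/c.txt"],
    "read": [["alice", "bob"], ["alice"], ["bob", "carol", "alice"]],
})
df2 = pd.DataFrame({"user": ["alice", "bob", "carol", "dave"]})

# One row per (path, user) pair, then aggregate "path" per user.
# nunique counts distinct paths; swap in another aggregation as needed.
per_user = (
    df1.explode("read")
       .groupby("read")["path"]
       .nunique()
       .rename("read_count")
)

# Left merge keeps every user in df2; users without access get NaN,
# which is filled with 0 here.
result = df2.merge(per_user, left_on="user", right_index=True, how="left")
result["read_count"] = result["read_count"].fillna(0).astype(int)
```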

Without df2:
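
If the list of users is not needed up front, the counts can be built from `df1` alone. This is a sketch on hypothetical data; the output column names `user` and `read_count` are chosen to match the question.

```python
import pandas as pd

# Hypothetical data standing in for the question's df1.
df1 = pd.DataFrame({
    "path": ["/data/a.txt", "/data/b.txt", "/data/c.txt"],
    "read": [["alice", "bob"], ["alice"], ["bob", "carol", "alice"]],
})

# Count every user name found in the lists, then turn the resulting
# Series into a two-column DataFrame.
out = (
    df1["read"]
    .explode()
    .value_counts()
    .rename_axis("user")
    .reset_index(name="read_count")
)
```

Note that users who appear in no list are simply absent from the result, since there is no `df2` to supply them.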

User contributions licensed under: CC BY-SA