Pandas filtering based on minimum data occurrences across multiple columns

I have a dataframe like this

country     data_fingerprint   organization     
US          111                Tesco         
UK          222                IBM          
US          111                Yahoo           
PY          333                Tesco
US          111                Boeing   
CN          333                TCS  
NE          458                Yahoo
UK          678                Tesco

I want those data_fingerprint for where the organisation and country with top 2 counts exists

So if see in organization top 2 occurrences are for Tesco,Yahoo and for country we have US,UK .

So based on that the output of data_fingerprint should be having

data_fingerprint
111
678

What i have tried for organization to exist in my complete dataframe is this

# First find top 2 occurances of organization
nd = df['organization'].value_counts().groupby(level=0, group_keys=False).head(2)
# Then checking if the organization exist in the complete dataframe and filtering those rows
new = df["organization"].isin(nd)

But i am not getting any data here.Once i get data for this I can do it along with country Can someone please help to get me the output.I have less data so using Pandas

Answer

here is one way to do it

df[
    df['organization'].isin(df['organization'].value_counts().head(2).index) &
    df['country'].isin(df['country'].value_counts().head(2).index)
]['data_fingerprint'].unique()

array([111, 678], dtype=int64)

Advertisement

Answer