I have a dataframe like this
country data_fingerprint organization US 111 Tesco UK 222 IBM US 111 Yahoo PY 333 Tesco US 111 Boeing CN 333 TCS NE 458 Yahoo UK 678 Tesco
I want those data_fingerprint for where the organisation and country with top 2 counts exists
So if see in organization top 2 occurrences are for Tesco,Yahoo and for country we have US,UK .
So based on that the output of data_fingerprint should be having
data_fingerprint 111 678
What i have tried for organization to exist in my complete dataframe is this
# First find top 2 occurances of organization nd = df['organization'].value_counts().groupby(level=0, group_keys=False).head(2) # Then checking if the organization exist in the complete dataframe and filtering those rows new = df["organization"].isin(nd)
But i am not getting any data here.Once i get data for this I can do it along with country Can someone please help to get me the output.I have less data so using Pandas
Advertisement
Answer
here is one way to do it
df[ df['organization'].isin(df['organization'].value_counts().head(2).index) & df['country'].isin(df['country'].value_counts().head(2).index) ]['data_fingerprint'].unique()
array([111, 678], dtype=int64)