I have a dataframe like this:
country  data_fingerprint  organization
US       111               Tesco
UK       222               IBM
US       111               Yahoo
PY       333               Tesco
US       111               Boeing
CN       333               TCS
NE       458               Yahoo
UK       678               Tesco
I want the data_fingerprint values for rows where both the organization and the country are among the top 2 by count.
In organization, the top 2 occurrences are Tesco and Yahoo, and in country they are US and UK.
Based on that, the output of data_fingerprint should be:
data_fingerprint
111
678
What I have tried so far, to filter the complete dataframe on organization, is this:
# First find the top 2 occurrences of organization
nd = df['organization'].value_counts().groupby(level=0, group_keys=False).head(2)
# Then check whether the organization exists in the complete dataframe and filter those rows
new = df["organization"].isin(nd)
But I am not getting any data here. Once I get data for this, I can do the same for country. Can someone please help me get the output? I have little data, so I am using pandas.
Answer
Here is one way to do it:
df[
    df['organization'].isin(df['organization'].value_counts().head(2).index) &
    df['country'].isin(df['country'].value_counts().head(2).index)
]['data_fingerprint'].unique()
array([111, 678], dtype=int64)
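For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the sample dataframe from the question and applies the same filter in named steps. The variable names (top_orgs, top_countries, mask, mask_from_counts) are only illustrative and are not from the original posts. It also shows the likely reason the attempt in the question returned nothing: value_counts() keeps the organization names in the index and the counts in the values, so passing the counts Series to isin() compares names against numbers. In addition, groupby(level=0) on that Series puts each organization in its own group, so head(2) does not limit the result to two organizations.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "country": ["US", "UK", "US", "PY", "US", "CN", "NE", "UK"],
    "data_fingerprint": [111, 222, 111, 333, 111, 333, 458, 678],
    "organization": ["Tesco", "IBM", "Yahoo", "Tesco",
                     "Boeing", "TCS", "Yahoo", "Tesco"],
})

# value_counts() sorts by count in descending order. The labels are in the
# index and the counts are the values, so take .index after head(2).
top_orgs = df["organization"].value_counts().head(2).index
top_countries = df["country"].value_counts().head(2).index

# What the question's attempt effectively did: isin() on a Series of counts
# matches organization names against the counts, so nothing matches.
counts = df["organization"].value_counts()
mask_from_counts = df["organization"].isin(counts)

# Keep rows whose organization AND country are both in their top-2 lists.
mask = df["organization"].isin(top_orgs) & df["country"].isin(top_countries)
result = df.loc[mask, "data_fingerprint"].unique()
print(result)
```

With the sample data this should give the two values 111 and 678, matching the output above. Note that head(2) takes exactly two labels, so if several organizations or countries are tied for second place, which one is kept depends on the order value_counts() returns them in.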