I have a df:

    id  val1  val2
    1   1.1   2.2
    1   1.1   2.2
    2   2.1   5.5
    3   8.8   6.2
    4   1.1   2.2
    5   8.8   6.2

I want to group by val1 and val2 and get a similar dataframe containing only the rows whose val1 and val2 combination occurs more than once.
Final df:

    id  val1  val2
    1   1.1   2.2
    4   1.1   2.2
    3   8.8   6.2
    5   8.8   6.2

Answer
You need duplicated with the subset parameter to specify which columns to check and keep=False to mark all duplicates. Then use the resulting mask to filter by boolean indexing:

    df = df[df.duplicated(subset=['val1','val2'], keep=False)]
    print(df)

       id  val1  val2
    0   1   1.1   2.2
    1   1   1.1   2.2
    3   3   8.8   6.2
    4   4   1.1   2.2
    5   5   8.8   6.2

Detail:

    print(df.duplicated(subset=['val1','val2'], keep=False))

    0     True
    1     True
    2    False
    3     True
    4     True
    5     True
    dtype: bool
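
Since the question asks about grouping, here is a minimal self-contained sketch that rebuilds the sample data and compares the duplicated mask with a groupby-based filter. The DataFrame construction and the names mask, by_duplicated, group_size and by_groupby are assumptions added for illustration; the groupby variant is an alternative and is not part of the answer above.

```python
import pandas as pd

# Sample data reconstructed from the question (including the repeated id 1 row).
df = pd.DataFrame({
    "id":   [1, 1, 2, 3, 4, 5],
    "val1": [1.1, 1.1, 2.1, 8.8, 1.1, 8.8],
    "val2": [2.2, 2.2, 5.5, 6.2, 2.2, 6.2],
})

# Approach from the answer: flag every row whose (val1, val2) pair repeats.
mask = df.duplicated(subset=["val1", "val2"], keep=False)
by_duplicated = df[mask]

# Alternative (assumption, not from the answer): compute the size of each
# (val1, val2) group per row and keep rows from groups larger than one.
group_size = df.groupby(["val1", "val2"])["id"].transform("size")
by_groupby = df[group_size > 1]

print(by_duplicated)
print(by_duplicated.equals(by_groupby))
```

Both filters keep the original row order and index. The duplicated version is usually the more direct choice when no per-group aggregation is needed.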