```
Group Code
1     2
1     2
1     4
1     1
2     4
2     1
2     2
2     3
2     1
2     1
2     3
```
Within each group, consecutive rows form pairs. In Group 1, for example, the pairs are (2,2), (2,4), (4,1).
I want to filter these pairs so that code numbers 2 AND 4 are present at BOTH ends (not just either).
In Group 1, for example, only (2,4) will be kept, while (2,2) and (4,1) will be filtered out. Expected output:
```
Group Code
1     2
1     4
```
Answer
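The snippets in this answer assume the question's data is already loaded in a DataFrame named `df`. A minimal sketch for rebuilding it, assuming a default integer index (the variable name `df` and the column order are taken from the question's table):

```python
import pandas as pd

# Sample data from the question: 4 rows in Group 1, 7 rows in Group 2
df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "Code":  [2, 2, 4, 1, 4, 1, 2, 3, 1, 1, 3],
})
print(df)
```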
You can approach this by building two boolean masks: one for the current row's code being 2 or 4, and one for the next row's code (within the same group) being 2 or 4. Combining them gives the condition that a code is present at BOTH ends (not either).
Since both 2 AND 4 must be present in the pair, add a third boolean mask asserting that the two consecutive codes are not equal:
```python
m_curr = df['Code'].isin([2,4])                                # current row code is 2 or 4
m_next = df.groupby("Group")['Code'].shift(-1).isin([2,4])     # next row code in same group is 2 or 4
m_diff = df['Code'].ne(df.groupby("Group")['Code'].shift(-1))  # current and next row codes in the same group differ

# current row AND next row code in 2 or 4 AND (2 and 4 both present, i.e. the 2 values in the pair are different)
mask = m_curr & m_next & m_diff

# keep the first row of each matching pair (mask) and the row right after it (shifted mask);
# fill_value=False keeps the shifted mask boolean instead of introducing NaN in the first row
df[mask | mask.shift(fill_value=False)]
```
Result:
```
   Group  Code
1      1     2
2      1     4
```
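To see why only rows 1 and 2 survive, it can help to lay the intermediate masks side by side. The following is only an illustrative sketch: the column names `next_code`, `m_curr`, `m_next`, `m_diff`, and `mask` in the `steps` frame are chosen here for display, and the sample data is rebuilt so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "Code":  [2, 2, 4, 1, 4, 1, 2, 3, 1, 1, 3],
})

# Next code within the same group; NaN on the last row of each group
nxt = df.groupby("Group")["Code"].shift(-1)

steps = df.assign(
    next_code=nxt,
    m_curr=df["Code"].isin([2, 4]),   # current code is 2 or 4
    m_next=nxt.isin([2, 4]),          # next code is 2 or 4
    m_diff=df["Code"].ne(nxt),        # the two codes differ
)
# True only on the first row of a (2, 4) or (4, 2) pair
steps["mask"] = steps["m_curr"] & steps["m_next"] & steps["m_diff"]
print(steps)
```

Only index 1 ends up with `mask` set: it holds code 2 and is followed by code 4 in the same group. Index 0 fails `m_diff` because its pair is (2,2), and index 2 fails `m_next` because its pair is (4,1).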
Another way to do it, which may be a little simpler for this special case:
```python
m1 = df['Code'].eq(2) & df.groupby("Group")['Code'].shift(-1).eq(4)  # current row is 2 and next row in same group is 4
m2 = df['Code'].eq(4) & df.groupby("Group")['Code'].shift(-1).eq(2)  # current row is 4 and next row in same group is 2

mask = m1 | m2  # either pair of (2, 4) or (4, 2)

# keep the first row of each matching pair and the row right after it
df[mask | mask.shift(fill_value=False)]
```
Same result:
```
   Group  Code
1      1     2
2      1     4
```
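If the same filter is needed for other code pairs, the second approach can be wrapped in a small function. This is a sketch, not part of pandas: `keep_pairs` and its parameters are hypothetical names. It assumes the rows of each group are contiguous and in the intended order, because the plain `shift` that marks the second row of a pair works by position, not by group.

```python
import pandas as pd

def keep_pairs(df, a, b, group="Group", code="Code"):
    """Keep consecutive in-group row pairs whose codes are (a, b) or (b, a).

    Hypothetical helper; assumes each group's rows are contiguous.
    """
    nxt = df.groupby(group)[code].shift(-1)          # next code within the same group
    first = (df[code].eq(a) & nxt.eq(b)) | (df[code].eq(b) & nxt.eq(a))
    second = first.shift(fill_value=False)           # row right after each pair's first row
    return df[first | second]

df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2],
    "Code":  [2, 2, 4, 1, 4, 1, 2, 3, 1, 1, 3],
})

print(keep_pairs(df, 2, 4))   # rows 1 and 2, as in the results above
```

Because `first` can never be true on the last row of a group (its next code is NaN), the shifted mask does not leak into the following group.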