Group Code 1 2 1 2 1 4 1 1 2 4 2 1 2 2 2 3 2 1 2 1 2 3
Within each group there are pairs. In Group 1 for example; the pairs are (2,2),(2,4),(4,1)
I want to filter these pairs based on code numbers 2 AND 4 being present at BOTH ends(not either)
In group 1 for example, only (2,4) will be kept while (2,2) and (4,1) will be filtered out. Excepted Output:
Group Code 1 2 1 4
Advertisement
Answer
You can approach by making 2 boolean masks for current row and next row code in 2 or 4. Then, form the required combination condition of present at BOTH ends(not either)
, as follows:
If you require both 2 AND 4 be present in the pair, then we can make another boolean mask for asserting that these 2 consecutive codes are not equal:
m_curr = df['Code'].isin([2,4]) # current row code is 2 or 4 m_next = df.groupby("Group")['Code'].shift(-1).isin([2,4]) # next row code in same group is 2 or 4 m_diff = df['Code'].ne(df.groupby("Group")['Code'].shift(-1)) # different row codes in current and next row in the same group # current row AND next row code in 2 or 4 AND (2 and 4 both present, i.e. the 2 values in pair are diffrent) mask = m_curr & m_next & m_diff df[mask | mask.shift()]
Result:
Group Code 1 1 2 2 1 4
Another way to do it, may be a little bit simpler for this special case:
m1 = df['Code'].eq(2) & df.groupby("Group")['Code'].shift(-1).eq(4) # current row is 2 and next row in same group is 4 m2 = df['Code'].eq(4) & df.groupby("Group")['Code'].shift(-1).eq(2) # current row is 4 and next row in same group is 2 mask = m1 | m2 # either pair of (2, 4) or (4, 2) df[mask | mask.shift()]
Result:
Same result:
Group Code 1 1 2 2 1 4