I have a data frame with the following columns:
import pandas as pd

d = {'lot_no': [1, 2, 3, 4],
     # part_no is stored as strings: integer literals with a leading zero are invalid in Python 3
     'part_no': ['01345678', '01234567', '01123456', '10123456'],
     'zip_code': [32835, 32835, 32808, 32835]}

df = pd.DataFrame(data=d)
First, I want to check that every row with 32835 in the “zip_code” column has a “part_no” matching the pattern 01xxxxxx, where each x is a digit. Then, I want to make sure every 01xxxxxx “part_no” corresponds to a 32835 “zip_code.” If any rows fail, I would like to return a list of the “lot_no” values that fail the check, or True if the whole dataframe passes.
In this example, the output should be [3, 4].
Answer
Use boolean masks:
m1 = df['zip_code'].eq(32835)             # zip_code holds integers, so compare against an int
m2 = df['part_no'].str.startswith('01')   # requires part_no to be stored as strings
lot_no = df.loc[~(m1 & m2), 'lot_no'].tolist()
print(lot_no)

# Output
[3, 4]
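Note that `~(m1 & m2)` also flags rows where neither condition holds (for example, a 32808 zip code with a part number that does not start with 01). If the check is meant to work in both directions, as the question describes, one possible variant is to flag only the rows where the two masks disagree. The sketch below assumes `part_no` is stored as strings and `zip_code` as integers; `check_lots` is a hypothetical helper name, not something from the original answer. It also uses a regular expression to require exactly six digits after the `01` prefix and returns True when nothing fails.

```python
import pandas as pd

df = pd.DataFrame({
    'lot_no': [1, 2, 3, 4],
    'part_no': ['01345678', '01234567', '01123456', '10123456'],
    'zip_code': [32835, 32835, 32808, 32835],
})


def check_lots(frame):
    """Hypothetical helper: return True if every row passes, else the failing lot_no values."""
    is_zip = frame['zip_code'].eq(32835)
    # fullmatch requires the whole string to be '01' followed by exactly six digits
    is_part = frame['part_no'].str.fullmatch(r'01\d{6}')
    # A row fails when exactly one of the two conditions holds (XOR)
    failed = frame.loc[is_zip ^ is_part, 'lot_no'].tolist()
    return failed if failed else True


result = check_lots(df)
print(result)  # [3, 4]
```

With this version, a row that has neither a 32835 zip code nor a 01xxxxxx part number is treated as passing, which may or may not be what you want for your data.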