I have a data frame with the following columns:
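import pandas as pd

# part_no is stored as strings: integer literals with leading zeros are a syntax error in Python 3
d = {'lot_no': [1, 2, 3, 4],
     'part_no': ['01345678', '01234567', '01123456', '10123456'],
     'zip_code': [32835, 32835, 32808, 32835]}
df = pd.DataFrame(data=d)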
First, I want to check that every row with 32835 in the "zip_code" column has a "part_no" matching the pattern 01xxxxxx, where each x is a digit. Then, I want to make sure every 01xxxxxx "part_no" corresponds to a 32835 "zip_code". If not, I would like to return a list of the "lot_no" values that fail the check, or True if the whole dataframe passes.
In this example, the output should be [3, 4].
Answer
Use boolean masks:
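# zip_code holds integers, so compare against the integer 32835, not the string '32835'
m1 = df['zip_code'].eq(32835)
# part_no must hold strings for the .str accessor to work
m2 = df['part_no'].str.startswith('01')
# a row fails when exactly one of the two conditions holds
lot_no = df.loc[m1 ^ m2, 'lot_no'].tolist()
print(lot_no)
# Output
[3, 4]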
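The question also asks for True when every row passes, and for the part number to be 01 followed by digits rather than just any string starting with 01. A minimal sketch of one way to cover both, assuming part_no is stored as strings and that "01xxxxxx" means exactly eight digits; the function name failing_lots and its zip_code and pattern parameters are hypothetical, introduced only for illustration:

```python
import pandas as pd


def failing_lots(df, zip_code=32835, pattern=r"01\d{6}"):
    """Return True if every row passes, else the list of failing lot_no values.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    # Rows whose zip_code equals the target value (integer comparison).
    zip_ok = df["zip_code"].eq(zip_code)
    # Rows whose part_no is exactly "01" followed by six digits.
    part_ok = df["part_no"].astype(str).str.fullmatch(pattern)
    # A row fails when exactly one of the two conditions holds.
    failed = df.loc[zip_ok ^ part_ok, "lot_no"].tolist()
    return failed if failed else True


df = pd.DataFrame({
    "lot_no": [1, 2, 3, 4],
    "part_no": ["01345678", "01234567", "01123456", "10123456"],
    "zip_code": [32835, 32835, 32808, 32835],
})
print(failing_lots(df))  # [3, 4]
```

Using XOR means a row with a different zip code and a non-01 part number passes, which matches the two-way check described in the question. Series.str.fullmatch requires pandas 1.1 or later.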