I have a DataFrame say – df
data = [['00898','22123'],['87','99878'],['22123','66454'],['00898','87'], ['2109','312'],['6412','3078'],['6412','3078'],['66454','03158'],['3268','87']] df = pd.DataFrame(data,columns = ['emp_ids','master_emp_ids']) df
emp_ids master_emp_ids 0 00898 22123 1 87 99878 2 22123 66454 3 00898 87 4 2109 312 5 6412 3078 6 6412 3078 7 66454 03158 8 3268 87
Now I want to check whether all the id’s of the column emp_ids
are present in the the column master_emp_ids
or not. So each and every id of emp_ids
should get compared with each and every id of master_emp_ids
. And there should be a column created named status
, where it gives yes if present and no if not. Basically, 87 in the first col is present in the second col so it’s a yes and also to color green that particular cell where it says ‘yes’. So my final DataFrame should be – df
data = [['no','00898','22123'],['yes','87','99878'],['yes','22123','66454'],['no','00898','87'], ['no','2109','312'],['no','6412','3078'],['no','6412','3078'],['yes','66454','03158'],['no','3268','87']] df = pd.DataFrame(data,columns = ['Status','emp_ids','master_emp_ids']) df
status emp_ids master_emp_ids 0 no 00898 22123 1 yes 87 99878 2 yes 22123 66454 3 no 00898 87 4 no 2109 312 5 no 6412 3078 6 no 6412 3078 7 yes 66454 03158 8 no 3268 87
please help me to get my desired OP.
Thanks!!!
Advertisement
Answer
Use numpy.where
with Series.isin
for add new last column:
df['status'] = np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no') print (df) Status emp_ids master_emp_ids status 0 no 00898 22123 no 1 yes 87 99878 yes 2 yes 22123 66454 yes 3 no 00898 87 no 4 no 2109 312 no 5 no 6412 3078 no 6 no 6412 3078 no 7 yes 66454 03158 yes 8 no 3268 87 no
If need first column use DataFrame.insert
:
df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no')) print (df) status Status emp_ids master_emp_ids 0 no no 00898 22123 1 yes yes 87 99878 2 yes yes 22123 66454 3 no no 00898 87 4 no no 2109 312 5 no no 6412 3078 6 no no 6412 3078 7 yes yes 66454 03158 8 no no 3268 87
EDIT: For coloring is possible use style:
def color(val): if val == 'yes': return 'background-color: green' elif val == 'no': return 'background-color: red' else: return '' df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no')) df.style.applymap(color).to_excel('styled.xlsx', engine='openpyxl')