I have a DataFrame say – df
JavaScript
x
5
1
data = [['00898','22123'],['87','99878'],['22123','66454'],['00898','87'],
2
['2109','312'],['6412','3078'],['6412','3078'],['66454','03158'],['3268','87']]
3
df = pd.DataFrame(data,columns = ['emp_ids','master_emp_ids'])
4
df
5
JavaScript
1
11
11
1
emp_ids master_emp_ids
2
0 00898 22123
3
1 87 99878
4
2 22123 66454
5
3 00898 87
6
4 2109 312
7
5 6412 3078
8
6 6412 3078
9
7 66454 03158
10
8 3268 87
11
Now I want to check whether all the id’s of the column emp_ids
are present in the the column master_emp_ids
or not. So each and every id of emp_ids
should get compared with each and every id of master_emp_ids
. And there should be a column created named status
, where it gives yes if present and no if not. Basically, 87 in the first col is present in the second col so it’s a yes and also to color green that particular cell where it says ‘yes’. So my final DataFrame should be – df
JavaScript
1
5
1
data = [['no','00898','22123'],['yes','87','99878'],['yes','22123','66454'],['no','00898','87'],
2
['no','2109','312'],['no','6412','3078'],['no','6412','3078'],['yes','66454','03158'],['no','3268','87']]
3
df = pd.DataFrame(data,columns = ['Status','emp_ids','master_emp_ids'])
4
df
5
JavaScript
1
11
11
1
status emp_ids master_emp_ids
2
0 no 00898 22123
3
1 yes 87 99878
4
2 yes 22123 66454
5
3 no 00898 87
6
4 no 2109 312
7
5 no 6412 3078
8
6 no 6412 3078
9
7 yes 66454 03158
10
8 no 3268 87
11
please help me to get my desired OP.
Thanks!!!
Advertisement
Answer
Use numpy.where
with Series.isin
for add new last column:
JavaScript
1
13
13
1
df['status'] = np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no')
2
print (df)
3
Status emp_ids master_emp_ids status
4
0 no 00898 22123 no
5
1 yes 87 99878 yes
6
2 yes 22123 66454 yes
7
3 no 00898 87 no
8
4 no 2109 312 no
9
5 no 6412 3078 no
10
6 no 6412 3078 no
11
7 yes 66454 03158 yes
12
8 no 3268 87 no
13
If need first column use DataFrame.insert
:
JavaScript
1
13
13
1
df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no'))
2
print (df)
3
status Status emp_ids master_emp_ids
4
0 no no 00898 22123
5
1 yes yes 87 99878
6
2 yes yes 22123 66454
7
3 no no 00898 87
8
4 no no 2109 312
9
5 no no 6412 3078
10
6 no no 6412 3078
11
7 yes yes 66454 03158
12
8 no no 3268 87
13
EDIT: For coloring is possible use style:
JavaScript
1
13
13
1
def color(val):
2
if val == 'yes':
3
return 'background-color: green'
4
elif val == 'no':
5
return 'background-color: red'
6
else:
7
return ''
8
9
10
df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no'))
11
12
df.style.applymap(color).to_excel('styled.xlsx', engine='openpyxl')
13