I am new to Pandas and learning. I am reading excel to DataFrame and comparing columns and highlight the column that’s not same. For example if Column A is not same as Column B then highlight the Column B. However I have some null values in Column A and Column B. When I execute the code, I don’t want to highlight the null values in Column B. How can I do that? Below is my code:
JavaScript
x
13
13
1
file = Path(path to excel)
2
df = pd.read_excel(file)
3
def color(x):
4
c1 = 'background-color: red'
5
m1 = x['AMOUNT A'] != x['AMOUNT B']
6
m2 = x['AMOUNT C'] != x['AMOUNT D']
7
df = pd.DataFrame('',index=x.index, columns=x.columns)
8
df['AMOUNT B'] = np.select([m1],[c1], default='')
9
df['AMOUNT D'] = np.select([m2],[c1], default='')
10
return df
11
writer = pd.ExcelWriter(path to excel)
12
df.style.apply(color,axis=None).to_excel(writer, 'data', index=False)
13
df before color function:
JavaScript
1
8
1
AMOUNT A AMOUNT B AMOUNT C AMOUNT D
2
0 100.0 200.0 NaN NaN
3
1 200.0 200.0 45.0 25.0
4
2 100.0 500.0 NaN NaN
5
3 NaN NaN NaN NaN
6
4 NaN NaN NaN NaN
7
5 200.0 1.0 NaN NaN
8
Output after running script:
Expected output:
Advertisement
Answer
The issue here is NaN == NaN
will return False
JavaScript
1
9
1
def color(x):
2
c1 = 'background-color: red'
3
m1 = x['AMOUNT A'].fillna('') != x['AMOUNT B'].fillna('')
4
m2 = x['AMOUNT C'].fillna('') != x['AMOUNT D'].fillna('')
5
df = pd.DataFrame('',index=x.index, columns=x.columns)
6
df['AMOUNT B'] = np.select([m1],[c1], default='')
7
df['AMOUNT D'] = np.select([m2],[c1], default='')
8
return df
9
More Info
JavaScript
1
3
1
np.nan == np.nan
2
Out[527]: False
3