I tried to word the question as simply as possible, but I’m new to Python and very bad at logic, so I’m having a bit of trouble. Basically, I want to know if there’s a cleaner way to count the entries of a confusion matrix for two 1D arrays of booleans.
Here’s an example:
arr1 = [0, 0, 1, 0, 1, 1]
arr2 = [1, 0, 0, 0, 1, 0]
tp = fp = tn = fn = 0
for i,p in enumerate(arr1):
    a = arr2[i]
    if p & a:
        tp += 1
    if p & ~a:
        fp += 1
    if ~p & ~a:
        tn += 1
    if ~p & a:
        fn += 1
# This was pointed out to be incorrect (see Mozway's answer below)
I tried this but it just adds more lines and looks arguably worse:
if p == a:
    if p:
        tp += 1
    else:
        tn += 1
else:
    if p:
        fp += 1
    else:
        fn += 1
I then tried adding nested conditional expressions (I believe these are Python’s version of ternary operators?) but this disgusting monstrosity doesn’t even compile:
(tp += 1 if a else fp += 1) if p else (tn += 1 if ~a else fn += 1)
Any help would be appreciated!
EDIT: Sorry I should have clarified, the result I want is this:
Adding print(tp, fp, tn, fn) would give 1, 2, 2, 1. Simply counting the combinations of each of the booleans in the arrays.
Answer
Use zip and collections.Counter:
from collections import Counter

c = Counter(zip(arr1, arr2))
tp = c[1,1]
fp = c[1,0]
tn = c[0,0]
fn = c[0,1]
print(tp, fp, tn, fn)
output: 1 2 2 1
counter:
print(c)
# Counter({(0, 0): 2, (1, 0): 2, (0, 1): 1, (1, 1): 1})
alternative way to index the counter:
ids = {'tp': (1,1), 'fp': (1,0), 'tn': (0,0), 'fn': (0,1)}
c[ids['tp']]  # 1
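If you need this more than once, the two snippets above can be combined into a small helper. This is only a sketch: the function name confusion_counts and its parameter names are made up for illustration, and it assumes the first array holds the predictions and the second the actual values, with inputs being 0/1 or True/False.

```python
from collections import Counter


def confusion_counts(predicted, actual):
    """Hypothetical helper: count tp/fp/tn/fn for two equal-length 0/1 sequences."""
    # Count each (predicted, actual) pair; True/False hash like 1/0,
    # so boolean inputs land on the same keys as integer inputs.
    pairs = Counter(zip(predicted, actual))
    keys = {'tp': (1, 1), 'fp': (1, 0), 'tn': (0, 0), 'fn': (0, 1)}
    # A Counter returns 0 for missing keys, so absent combinations are safe.
    return {name: pairs[key] for name, key in keys.items()}


arr1 = [0, 0, 1, 0, 1, 1]
arr2 = [1, 0, 0, 0, 1, 0]
counts = confusion_counts(arr1, arr2)
print(counts)  # {'tp': 1, 'fp': 2, 'tn': 2, 'fn': 1}
```

Returning a dict keeps the four names attached to their counts, which is a bit harder to mix up than a bare tuple.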
why your first approach failed
bool(~1) is True (~1 is -2), thus giving incorrect counts. The ~ operator acts as a logical not in a vectorized setting (e.g. with numpy boolean arrays), but not in pure Python, where it is a bitwise inversion on integers. You can use 1-x (not ~x) to invert an “integer as boolean” (1-0 -> 1; 1-1 -> 0).
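To make the point above concrete, here is a minimal sketch of the original loop rewritten with the keywords and / not instead of & / ~. It assumes the same arr1 and arr2 as in the question and is just one possible way to write it.

```python
arr1 = [0, 0, 1, 0, 1, 1]
arr2 = [1, 0, 0, 0, 1, 0]

# On plain ints, ~x is bitwise inversion (~x == -x - 1),
# so both ~0 and ~1 are non-zero and therefore truthy.
print(~0, ~1)              # -1 -2
print(bool(~0), bool(~1))  # True True

tp = fp = tn = fn = 0
for p, a in zip(arr1, arr2):
    if p and a:
        tp += 1
    elif p and not a:
        fp += 1
    elif not p and not a:
        tn += 1
    else:  # not p and a
        fn += 1

print(tp, fp, tn, fn)  # 1 2 2 1
```

Since exactly one of the four cases applies to each pair, an if/elif/else chain is enough and avoids re-testing every condition.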
For reference: in pure Python, ~ and & are bitwise operators (you should use the keywords not and and for boolean logic). Their meaning is different in numpy.
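For comparison, here is a rough sketch of how the same counts could be computed with numpy, where ~ and & do work element-wise as logical not / and. This assumes numpy is installed and that the arrays are explicitly converted to dtype=bool; on integer arrays ~ would still be a bitwise inversion.

```python
import numpy as np

# dtype=bool matters: on boolean arrays ~ flips True/False element-wise.
p = np.array([0, 0, 1, 0, 1, 1], dtype=bool)
a = np.array([1, 0, 0, 0, 1, 0], dtype=bool)

# Each expression is a boolean mask; summing it counts the True entries.
tp = int(np.sum(p & a))
fp = int(np.sum(p & ~a))
tn = int(np.sum(~p & ~a))
fn = int(np.sum(~p & a))

print(tp, fp, tn, fn)  # 1 2 2 1
```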