I needed to write a weighted version of random.choice (each element in the list has a different probability for being selected). This is what I came up with:
def weightedChoice(choices): """Like random.choice, but each element can have a different chance of being selected. choices can be any iterable containing iterables with two items each. Technically, they can have more than two items, the rest will just be ignored. The first item is the thing being chosen, the second item is its weight. The weights can be any numeric values, what matters is the relative differences between them. """ space = {} current = 0 for choice, weight in choices: if weight > 0: space[current] = choice current += weight rand = random.uniform(0, current) for key in sorted(space.keys() + [current]): if rand < key: return choice choice = space[key] return None
This function seems overly complex to me, and ugly. I’m hoping everyone here can offer some suggestions on improving it or alternate ways of doing this. Efficiency isn’t as important to me as code cleanliness and readability.
Advertisement
Answer
Since version 1.7.0, NumPy has a choice
function that supports probability distributions.
from numpy.random import choice draw = choice(list_of_candidates, number_of_items_to_pick, p=probability_distribution)
Note that probability_distribution
is a sequence in the same order of list_of_candidates
. You can also use the keyword replace=False
to change the behavior so that drawn items are not replaced.