I’m currently trying to run a MonteCarlo simulation without replacement, using np.random.choice, but there are certain items in the list which cannot appear together. For example, if I have a list of five items: RO, MI, VE,NA, SI, and each loop produces a group of four items, RO can appear with VE, MI, or NA, but it cannot appear with SI, on each iteration. So a loop like: [RO,MI,VE,NA] is correct But not: [RO,MI,SI,NA] as SI appears in the same group as RO. This is the code I’m currently using (which is producing the incorrect grouping):
np.random.seed(42) portfolio=[] for i in range(0,10000): port_gar = list(np.random.choice(cities, size=4, replace=False, p=cities_prob)) portfolio.append(port_gar) print(portfolio)
I’m at a loss as to what would do and any help would be appreciated. Thanks!
Advertisement
Answer
I present two solutions, both of which make use of the random
module from the standard library. I recommend this approach over using numpy since you can’t really take advantage of numpy here in any way, and especially because you’re not even ending up with a numpy array at the end (you’re only using numpy for random number generation).
Solution 1: every sample has exactly one city from the “exclusive cities” group
Something you could do is have your cities which don’t play nice in a separate group from the others:
import random random.seed(42) n = 10 # whatever total number of items you want per sample num_samples = 10000 # total number of samples free_cities = [...] # all cities in this group play nicely together exclusive_cities = [...] # cities in this group cannot be grouped in a sample, so we will only ever choose one city at a time from this group portfolio = [] for _ in range(num_samples): free_sample = random.sample(free_cities, k=(n-1)) exclusive_sample = random.sample(exclusive_cities, k=1) portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Here’s a contrived example:
n = 6 num_samples = 10 free_cities = ["AW", "JB", "EX", "HZ", "MZ", "XQ", "KA", "MW", "WZ", "UD"] exclusive_cities = ["RO", "VE", "SI", "NA"] portfolio = [] for _ in range(num_samples): free_sample = random.sample(free_cities, k=(n-1)) exclusive_sample = random.sample(exclusive_cities, k=1) portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Output:
>>> portfolio [['AW', 'EX', 'WZ', 'RO', 'MZ', 'JB'], ['RO', 'MZ', 'KA', 'WZ', 'XQ', 'EX'], ['AW', 'MW', 'VE', 'JB', 'WZ', 'XQ'], ['NA', 'WZ', 'JB', 'MW', 'EX', 'MZ'], ['SI', 'UD', 'AW', 'JB', 'WZ', 'MW'], ['MZ', 'JB', 'UD', 'VE', 'WZ', 'HZ'], ['NA', 'KA', 'UD', 'AW', 'MW', 'WZ'], ['JB', 'NA', 'UD', 'KA', 'WZ', 'MW'], ['MZ', 'RO', 'JB', 'HZ', 'AW', 'XQ'], ['WZ', 'HZ', 'UD', 'JB', 'VE', 'XQ']]
Now something to be aware of is that this will guarantee that every sample has exactly one city from the exclusive_cities
group:
for sample in portfolio: print(set(sample) & set(exclusive_cities))
Output:
# Every sample has one item from the exclusive_cities group {'RO'} {'RO'} {'VE'} {'NA'} {'SI'} {'VE'} {'NA'} {'NA'} {'RO'} {'VE'}
Solution 2: every sample has at most one city from the “exclusive cities” group
If that’s not your desired behavior, you can use an additional “coin flip” to determine if 1 or 0 exclusive cities show up in a given sample with this minor modification to the above logic:
n = 6 num_samples = 10 portfolio = [] for _ in range(num_samples): include = random.choice((0, 1)) free_sample = random.sample(free_cities, k=(n - include)) exclusive_sample = random.sample(exclusive_cities, k=include) portfolio.append(random.sample(free_sample + exclusive_sample, k=n))
Now your samples may or may not include an exclusive city, but still only ever one max:
# An empty set means an exclusive city isn't present in that sample set() set() {'SI'} {'NA'} set() {'RO'} {'SI'} {'VE'} set() {'RO'}