I have this issue where I would like to shuffle a list, but only do so slightly. Say, I want only a small number of elements to be moved. Is there a simple way to get this done?
Right now the best I can think of is building my own method by hand, but is there some way to use the random library to do this for me?
Answer
To show what some of these solutions are doing, I find it helps to run them many times, Monte Carlo style, and look at the resulting distribution.
First, a tidied-up version of @meta4's solution, as it was the most fleshed out:
from random import randrange

def partial_shuffle(l, factor=5):
    n = len(l)
    for _ in range(factor):
        a, b = randrange(n), randrange(n)
        l[b], l[a] = l[a], l[b]
which we can run many times by doing:
import numpy as np

n = 8
orig = list(range(n))
occur = np.zeros((n, n), int)

for _ in range(100000):
    x = orig[:]
    partial_shuffle(x, 5)
    occur[orig, x] += 1
If we print out the occurrence counts as percentages we get:
[[33.5  9.6  9.5  9.4  9.4  9.6  9.5  9.5]
 [ 9.6 33.2  9.7  9.5  9.6  9.6  9.4  9.4]
 [ 9.5  9.6 33.2  9.5  9.6  9.5  9.6  9.5]
 [ 9.5  9.3  9.6 33.4  9.5  9.5  9.5  9.6]
 [ 9.4  9.6  9.4  9.6 33.3  9.5  9.7  9.5]
 [ 9.6  9.5  9.6  9.6  9.4 33.3  9.5  9.6]
 [ 9.4  9.7  9.5  9.5  9.5  9.6 33.2  9.7]
 [ 9.5  9.5  9.6  9.5  9.7  9.5  9.6 33.2]]
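(The table above can be reproduced from the counts with something like the line below; each row of occur sums to the 100,000 trials.)

print((occur / occur.sum(axis=1, keepdims=True) * 100).round(1))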
Each row shows the probability of an item ending up in each column. In this case (when n=8) the algorithm tends to leave elements where they were ~33% of the time, and spreads the remaining probability uniformly over the other positions.
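As a rough check on that figure: a single swap leaves a given element in place with probability 1 - 2(n-1)/n², and an element that has already been displaced gets swapped back with probability 2/n², so iterating this two-state recurrence over the five swaps reproduces the ~33% diagonal:

n = 8
p = 1.0  # probability the element is still at its original position
for _ in range(5):  # five swaps
    p = p * (1 - 2 * (n - 1) / n**2) + (1 - p) * 2 / n**2
print(round(p, 3))  # 0.333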
I can then run a (tidied-up) version of pjs's code with:
from random import gauss

orderliness = 2
occur = np.zeros((n, n), int)

for _ in range(100000):
    x = sorted(orig, key=lambda i: gauss(i * orderliness, 1))
    occur[orig, x] += 1
which gives very different output:
[[91.9  7.9  0.1  0.   0.   0.   0.   0. ]
 [ 7.9 84.1  7.8  0.1  0.   0.   0.   0. ]
 [ 0.1  7.8 84.1  7.9  0.1  0.   0.   0. ]
 [ 0.   0.1  7.9 84.1  7.7  0.1  0.   0. ]
 [ 0.   0.   0.1  7.7 84.2  7.8  0.1  0. ]
 [ 0.   0.   0.   0.1  7.9 84.2  7.7  0.1]
 [ 0.   0.   0.   0.   0.1  7.7 84.2  7.9]
 [ 0.   0.   0.   0.   0.   0.1  7.9 91.9]]
i.e. items tend to remain close to where they started.
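If you want this behaviour as a reusable function for arbitrary lists, a sketch along these lines works (the wrapper and its name are mine, not from pjs's answer); larger orderliness means items stray less far:

from random import gauss

def gauss_partial_shuffle(l, orderliness=2):
    # sort the positions by a noisy copy of their index and read the
    # elements back out in that order; bigger orderliness = less movement
    noisy_order = sorted(range(len(l)), key=lambda i: gauss(i * orderliness, 1))
    return [l[i] for i in noisy_order]

# e.g. gauss_partial_shuffle(list("abcdefgh")) -> a new, lightly shuffled list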
This sort of table is great for detecting bias in the distribution, and there doesn't seem to be any evidence of bias above. But, for example, Artyom's solution (shuffle(x, lambda: random() / 5)) gives the following:
[[  0.  37.4   0.    0.    0.   16.7  23.8  22.1]
 [  0.    0.  100.    0.    0.    0.    0.    0. ]
 [  0.    0.    0.  100.    0.    0.    0.    0. ]
 [  0.    0.    0.    0.  100.    0.    0.    0. ]
 [  1.7   0.    0.    0.    0.   83.3  11.9   3. ]
 [  9.    7.4   0.    0.    0.    0.   64.2  19.4]
 [ 26.7  17.9   0.    0.    0.    0.    0.   55.5]
 [ 62.6  37.4   0.    0.    0.    0.    0.    0. ]]
which probably isn't what the OP wanted. The high probabilities off the diagonal represent the array being rotated by one element.
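For completeness, that table comes from the same harness with Artyom's call, something like the snippet below. Note that the optional random argument to random.shuffle was deprecated in Python 3.9 and removed in 3.11, so this only runs on older interpreters:

from random import random, shuffle

occur = np.zeros((n, n), int)
for _ in range(100000):
    x = orig[:]
    # random() / 5 always returns a value below 0.2, which biases the
    # index selection inside shuffle towards the low end of the list
    shuffle(x, lambda: random() / 5)
    occur[orig, x] += 1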