Skip to content
Advertisement

Combinatory with n picked elements and limited repetitions in Python

actually I’m looking for combinatory with a limited number of repetitions, I know in python in itertools we already have for no repetitions and with ANY repetition, but I can’t found anything for this.

Lets remember, a Combinatory we pick n elements, with a max repetitions, and in the elements, we don’t care the order, (A, B, C) is the same as (B, C, A).

Here an example:

A B, C picking 2, repeated 0:

A, B

A, C

B, C

picking 2, repeated 1:

A, A

A, B

A, C

B, B

B, C

C, C

The functions combinations and combinations_with_replacement doesn’t give this behavior, is like what I’m looking for is the mix of both.

Lets be clear, the number of repetitions is the max, so with ABCD, pick 3 and repeat 2, AAB would be valid.

Is there a lib or module with somthing like this?

Considerations:

I use big list to apply this, so even If I can filter the results from combinations_with_replacement is not very efficient one by one, I need a generator to not overload the ram.

I would like to avoid this method, or some other more efficient:

def Combs2(data, pick, rep):
    for i in itertools.combinations_with_replacement(data, pick):
        s = set(i)
        show = True
        for j in s:
            if i.count(j) > (rep+1):
                show = False
                break
        if show:
            yield i

I test this code, is soo slow that kills every multiprocessing that I’m using, instead using the cores ends using 1….

Edited:

To show the difference with combinations_with_replacement

We have ABCDE, lets pick 3 elements with 1 rep.

here example, AAB, BBC, BCA

AAA would be not valid.

We can’t get this with combinations_with_replacement or combinations.

Advertisement

Answer

There is a one-to-one correspondence between the combinations that you seek and k-tuples of bounded non-negative integers with a given target sum. For example,

AAB, when drawn from ABC consists of 2 A, 1 B and 0 C, so the correspondence is AAB <=> (2,1,0).

One strategy is to write a generator for such tuples and then decode it as it generates to get the output that you want:

#generates multisets of a given size but with bounded multiplicity

#The following generator generates all tuples of
#non-negative integers of length k
#bounded by d and summing to n
#shouldn't be called if n > k*d

def f(n,k,d,path = ()):
    if n == 0:
        yield path + (0,)*k
    elif k == 1:
        yield path + (n,)
    else:
        lower = max(0,n - (k-1)*d)
        upper = min(n,d)
        for i in range(lower,upper+1):
            yield from f(n-i,k-1,d,path + (i,))

def bounded_combos(items,count,bound):
    for t in f(count,len(items),bound):
        combo = []
        for item,count in zip(items,t):
            combo.extend([item]*count)
        yield combo

For example,

>>> for c in bounded_combos('ABC',3,2): print(c)

['B', 'C', 'C']
['B', 'B', 'C']
['A', 'C', 'C']
['A', 'B', 'C']
['A', 'B', 'B']
['A', 'A', 'C']
['A', 'A', 'B']

In terms of the number tuples:

>>> for t in f(3,3,2): print(t)

(0, 1, 2)
(0, 2, 1)
(1, 0, 2)
(1, 1, 1)
(1, 2, 0)
(2, 0, 1)
(2, 1, 0)

As far as how many goes, you can work out a recursive formula for the number of combinations with the key idea that if

a_1 + a_2 + ... + a_k = n

then

a_2 + a_3 + … + a_k = n – a_1

hence the count for a given k and n can be reduced to counts for k-1 and smaller n. A naive implementation of this recursion would involve repeatedly evalauting the same expression, but memoization makes it feasible for large k,n:

def g(n,k,d):
    memo = {} #memoization dictionary
    def h(n,k):
        if (n,k) in memo:
            return memo[(n,k)]
        else:
            if n == 0:
                count = 1
            elif k == 1:
                count = 1
            else:
                lower = max(0,n - (k-1)*d)
                upper = min(n,d)
                count = sum(h(n-i,k-1) for i in range(lower,upper+1))
            memo[(n,k)] = count
            return count
    return h(n,k)

Examples:

>>> g(3,3,2)
7
>>> g(10,5,5)
651
>>> g(100,20,10)
18832730699014127291

I don’t know of any closed-form formula for these. As an experiment, I evaluated

','.join(str(g(n,n,2)) for n in range(1,11))

and pasted it into the search bar of the On-Line Encyclopedia of Integer Sequences and got a very interesting hit: A002426, the central trinomial coefficients, which are discussed here. If you rerun this experiment with different choices of n,k,d, you might stumble upon a nice formula for the overall function.

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement