Skip to content
Advertisement

How to avoid memory error in Python while zipping list with large number of permutations from another list to get unique combinations?

I have the code below to generate all unique combinations between trucks and the permutations of tripares. The way I do it (below) gives me correct results. However it works only when len(tripares) < 10. When len(tripares) > 11 I get a memory error. Is there a way to avoid this memory error?

import itertools
from itertools import cycle

trucks=['A','B','C']
tripares = ['trip1', 'trip2', 'trip3', 'trip4', 'trip5', 'trip6', 'trip7', 'trip8', 'trip9', 'trip10', 'trip11', 'trip12'] 

# Get all permutations of the tripares list
perms = itertools.permutations(tripares,len(tripares))

# Zip the two lists and cycle the trucks list so all trips are matched with a truck
combinations = [list(zip(cycle(trucks), x)) for x in perms]

# Drop duplicates to get unique values
combs_unique = set([tuple(sorted(x)) for x in combinations ])

print(len(combs_unique))
# Should print 34650

Advertisement

Answer

If you just want the number of unique combinations you can simply compute it, without generating all those big sequences.

Let’s take your example input. The first truck will have 4 (no of trips / no of trucks) of the 12 trips: it amounts to 12! / 8! / 4! = 495. Then the second truck will have 4 of the remaining 8 trips: 8! / 4! / 4! = 70. The third truck takes the last 4 trips: 4! / 0! / 4! = 1. So the total count is 495 * 70 * 1 = 34650

You may use a code like this:

from math import factorial

def count_combos(n,k):
    return factorial(n)//factorial(k)//factorial(n-k)

def count_trips(ntrips, ntrucks):
    total = 1
    tt = ntrips//ntrucks
    for x in range(ntrips, tt, -tt):
        total *= count_combos(x, tt)
    return total

count_trips(12,3)
34650

EDIT: get the combinations

If you need the actual combinations, not just their number, you can generate them in your code “manually”, one at a time, via recursion: list the combos for all trips for the first truck, append to each of them the list of combos for the remaining trips for the second truck, and so on until a truck has only one possible combination left. This uses a reasonable amount of memory, and also is not so slow as one may expect:

from itertools import combinations

def list_combos(trips, trucks):
        #print(trips)
        combos = []
        ntrips = len(trips)
        ntrucks = len(trucks)
        tt = ntrips//ntrucks
        if ntrips == tt:
                return trips
        fs_trips = frozenset(trips)
        for c in combinations(fs_trips, tt):
                for subc in list_combos2(fs_trips.difference(c), tt):
                        #print(c, subc)
                        combos.append(c + tuple(subc))
        return combos
    
def list_combos2(fs_trips, tt):
        #print('combos2:', fs_trips)
        combos = []
        ntrips = len(fs_trips)
        if ntrips == tt:
                #print('returning')
                return (fs_trips,)
        for c in combinations(fs_trips, tt):
                for subc in list_combos2(fs_trips.difference(c), tt):
                        #print(c, subc)
                        combos.append(c + tuple(subc))
        return combos

x = list_combos(['t1','t2','t3','t4','t5','t6','t7','t8','t9','t10','t11','t12'], ['a','b','c'])
len(x)
34650

I tested it on 18 trips/3 trucks (17153136 combos) and also on 16/4 (63063000 combos). In the latter case it took around 275 seconds on my PC.

(Note: I’m simply returning a tuple of trips for each combo, without splitting it in ntrucks subtuples with the truck added. This saves some memory and speeds up things a bit; your fuel consumption calculation will simply need to consider that the first ntrips//ntrucks elements belong to the first truck, and so on).

OPTIMIZATION

Since you want to extract the best combo, why generate them all and then do the calculations? Starting from the code above it would be easy to compute the consumption of each combo once you have it, compare it with the current best and only keep track of the new best, without even maintaining the giant list.

This approach may even allow you to do some forward-check, discarding a combo before it is completely defined, and hence dramatically speeding up the execution.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement