Subset-AVG – Finding a subset of List Which Matches Known Rational Number

Question

I've asked this on math overflow and used comments to clarify/overstate my question. I hope it has the intended effect and doesn't come off as jarring. I'm attempting to find what subset of numbers reach a known average. I have a list of known values, negative and possible decimals. They look something like this {-.32,-.64,-.12,.08,-.54,-.43, ...} It's around 50 numbers

Accepted Answer

Here is a solution I adapted from a subSet sum algorithm I posted for another question (here). Since the algorithm loops through potential solution sizes, it was easy to adapt it to search for an average.The iSubSum() function takes 3 parameters: The target average, the list of values and an optional rounding precision parameter. It is a generator so it will produce all possible solutions when used in a loop. You can also get the first solution quickly using the next()function. This should produce results much faster than a brute force approach, especially for large lists.The function is based on a modified version of a subset-sum algorithm that returns solutions as lists of indices. This is intended to distinguish combinations that have duplicate values coming from different indices in the original list.from bisect import bisect_rightfrom itertools import accumulatedef iSubAverage(M,A,P=0): smallSize = 20 smallSums = set() def subSumForSize(S,A,size,failedSums=None): nextSum = A[size-2][2] if size>1 else 0 index = bisect_right([a for a,_,_ in A],S-nextSum) # max element for target A = A[:index] if len(A) < size: return # not enough elements for size if A[size-1][2] > S: return # minimum sum > target maxSum = A[-1][2] if len(A) > size: maxSum -= A[-size-1][2] if maxSum < S: return # maximum sum < target if len(A) <= smallSize and S not in smallSums: return if failedSums is None: failedSums = set() while index >= size: index -= 1 a,i,ca = A[index] if size == 1: if a == S: yield [i] continue c0 = A[index-size][2] if index>size else 0 if ca-c0 < S: break subS = S-a if subS in failedSums: continue # known unreachable sum failed = True for result in subSumForSize(subS,A[:index],size-1,failedSums): yield result+[i] failed = False if failed: failedSums.add(subS) if not A: return if M < 0: M,A = -M,[-a for a in A] # must have positive target offset = max(0,-min(A)) # circumvent negatives (requires loop on sizes) A = sorted([(round(a+offset,P),i) for i,a in enumerate(A)]) cumA = accumulate(a for a,i in A) A = [(a,i,ca) for (a,i),ca in zip(A,cumA)] for a,_,_ in A[:smallSize]: newSums = [a+s for s in smallSums] + [a] smallSums.update(newSums) for size in range(1,len(A)+1): subS = round(M*size,P) if subS != round(M*size,P*2): continue # fractional numerator subS += round(offset*size,P) for result in subSumForSize(subS,A,size): yield resultTo get the actual values, the iSubAvg() function maps indices to the corresponding values in the list:def iSubAvg(M,A,P=0): for iA in iSubAverage(M,A,P): yield sorted([A[i] for i in iA])L = [-.32,-.64,-.12,.08,-.54,-.43]targetL = -0.02for solution in iSubAvg(targetL,L,2): print(solution)# [-0.12, 0.08] (there isn't a solution for -0.03)K = [0.72, 0.69, 0.81, -0.28, 0.6, 0.59, 0.77, 0.46, 0.36, 0.66, 0.88, 0.88, 0.9, -0.24, 0.5, -0.5, 0.46, 0.96, -0.22, -0.8, -0.13, 0.87, 0.78, 0.2] targetK = -0.02for solution in iSubAvg(targetK,K,2): print(solution)# [-0.5, 0.46]# [-0.5, 0.46]# [-0.8, -0.22, 0.96]# [-0.5, -0.28, 0.72]# [-0.28, -0.24, 0.46]# [-0.28, -0.24, 0.46]# [-0.5, -0.24, 0.2, 0.46]# [-0.5, -0.24, 0.2, 0.46]# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]next(iSubAvg(0.165,K,2))# [-0.8, -0.28, -0.24, 0.66, 0.69, 0.96]note that the function returns all combinations including repetitions for duplicate values in the source list. You can filter out these duplicates if you don’t need them

Advertisement

Answer