Notebook

In [1]:

import numpy as np

Subset Sum¶

March 27, 2012

From: http://programmingpraxis.com/2012/03/27/subset-sum/

The subset-sum problem asks you to find the subset of a set of integers that sums to a given target; for instance, given the set $\{267, 439, 869, 961, 1000, 1153, 1246, 1598, 1766, 1922\}$ and target $5842$, the subset $\{869, 961, 1000, 1246, 1766\}$ sums to $5842$. This is a well-known NP problem, and the standard solution via dynamic programming takes time $O(n\ 2^n)$. The basic idea is straight forward: generate a list of subsets, compute the sum of each, and return the one that sums to the target.

The dynamic programming solution has the same $O(n\ 2^n)$ time complexity, but builds up the solution in stages. A matrix has $n$ rows and $2n$ columns; the rows are marked with the $n$ input values, the columns are marked with the various subset-sums, and each cell contains a list of items from the n values up to the current row that sums to the column total.

We’ll look at an example...

[omissis]

Your task is to write a function that solves the subset-sum problem. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

A possible solution¶

A Subset Sum Table implementation¶

In [2]:

class SSTable(object):
    """A subset sum table."""
    def __init__(self):
        self.table = None
        self.col = {}
        self.row = {}
    def add(self, i):
        """Add an integer to the subset sum."""
        if i in self.row:
            return
        if self.table is None:
            self.table = np.array([[{i}]], dtype=np.object)
            #self.table.reshape((1, 1))
            self.col.update({i:0})
            self.row.update({i:0})
        else:
            r = self.table[-1, :]
            if not i in list(self.col.keys()):
                r = np.concatenate((r, np.array([set((i,))])))
                self.col.update({i:len(self.col)})
            for sj in self.table[-1, :]:
                sj1 = sj.union({i})
                if not sum(sj1) in list(self.col.keys()):
                    r = np.concatenate((r, np.array([sj1])))
                    self.col.update({sum(sj1):len(self.col)})
            if len(r) - self.table.shape[1] > 0:
                nn = np.empty((self.table.shape[0], len(r) - self.table.shape[1]), dtype=np.object)
                nn.fill(None)
                self.table = np.hstack((self.table, nn))
            self.table = np.vstack((self.table, r))
            self.row.update({i:len(self.row)})
    def get(self, i):
        """Return the set that sums to i.
        
        Raises ValueError if no possible set of items added to the table
        sums to i.
        """
        if not i in self.col:
            raise ValueError("No possible solution was found!")
        else:
            return [x for x in self.table[:, self.col[i]] if x is not None]
    def __str__(self):
        return str(self.table)

Test The example¶

In [3]:

t = SSTable()
for i in (1, -3, 4, 2):
    t.add(i)
print t.get(0)

[set([1, 2, -3])]

Check its correctness with a random test¶

In [5]:

import numpy.random
N = 40 # n of elements in the set
NMAX = 100 # maximum value of the integers in the set
print "Choosing a set of %d random integers in the [-%d, %d) range... " % (N, NMAX, NMAX), 
# No repetitions, we're talking 'sets'
d = []
while len(d) < N:
    j = numpy.random.randint(-NMAX, NMAX)
    if not j in d:
        d.append(j)
d = np.array(d)
print "done."
n = numpy.random.randint(min(2, N), N+1)
print "Choosing at random among the integers selected above %d integers... " % n, 
i = numpy.random.randint(0, N, (n,))
print "done."

print "The selected set is:\n", str(d[i])
s = float(np.sum(d[i]))
print "Which sums to: %d" % s


print "\nBuilding the Sub-set Sum Table... ",
# Let's solve
t = SSTable()
for e in d:
    t.add(int(e))
print "done."
res = t.get(s)[0]
print "The algorithm selected the set %s, which sums to %d." % (str(list(res)), sum(list(res)))

print "Success: %r" % (sum(list(res)) == s)

if not res == set(d[i].tolist()):
    print "Note: the algorithm can only find one of the possibly multiple solutions to the subset sum problem."

Choosing a set of 40 random integers in the [-100, 100) range...  done.
Choosing at random among the integers selected above 12 integers...  done.
The selected set is:
[ 43  62  49 -65  32  40  90 -74  80   9  47 -18]
Which sums to: 295

Building the Sub-set Sum Table...  done.
The algorithm selected the set [32, 49, 68, 53, -26, -11, 61, 69], which sums to 295.
Success: True
Note: the algorithm can only find one of the possibly multiple solutions to the subset sum problem.

In [4]: