Commit f4d08a9

author
Dan Shea
committed
Initial commit
0 parents  commit f4d08a9

5 files changed

+864
-0
lines changed

README.md

+68
@@ -0,0 +1,68 @@
1+
# Solving the Subset Sum Problem with Heap-Ordered Subset Trees
2+
3+
A subset tree is a heap-ordered tree structure consisting of the n-length subsets of a set. The provided code constructs a subset tree (i.e. a collection of subset heaps) of order n from a set S.
4+
5+
## Motivation
6+
7+
Solutions for the subset sum problem exist in which min-heap trees are used to order all subsets by their sum and are then traversed appropriately. By applying a binary search to the list of all sum-ordered subsets, a solution may be found relatively quickly. However, this approach only works for sets in which all values are positive. The provided code extends this approach to sets containing both positive and negative values by employing a new data structure termed the subset tree.
8+
9+
## Subset Tree Generation Algorithm
10+
11+
1. Sort the set.
12+
2. Set the root node to the n smallest elements of S.
13+
3. The first child is the subset of the parent node with the greatest element replaced by the next greatest element in S. If no such greater element exists, then you have reached the end of S and this child node should not be generated.
14+
4. Let the variable i be equal to the second greatest index in the subset of the parent node. The next child is the subset
15+
of the parent node with the element at index i replaced by the next greatest element in S. If that element already exists
16+
in the subset, then replace the conflicting element with its next greatest element in S. Repeat for any further conflicts until none remain or there is no greater element in S to advance to. In the latter case, you have reached the end of S and this child node should not be generated.
17+
5. Decrement i by 1 and repeat Step 4 for all subsequent children until i is less than the smallest index at which the parent incremented its subset value.
18+
6. The algorithm terminates when no additional incrementations can be made (i.e. when the greatest element in the subset is equal to the greatest element in S).
19+
20+
You will now have a min-heap of all subsets of length n from set S. For a subset heap of order n, the total number of nodes
21+
is bounded by (N choose n), where N is the total number of elements in the input set. Each node has at most N children and the height of the tree is at most N.
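
Steps 2 to 6 can be sketched with index arrays as follows. This is a minimal illustration, not the committed implementation: the class `SubsetTreeChildren`, its `Node` type, and the `enumerate` helper are hypothetical names, and the sketch assumes 1 <= n <= N and works on positions into the sorted set rather than on the values themselves.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not one of the committed classes.
class SubsetTreeChildren {

    // A node stores positions into the sorted set S, plus the smallest
    // position its parent incremented to create it (0 for the root).
    static final class Node {
        final int[] indices;
        final int limit;

        Node(int[] indices, int limit) {
            this.indices = indices;
            this.limit = limit;
        }
    }

    // Steps 3 to 5: one child per position i, from the last position down to limit.
    static List<Node> children(Node parent, int setSize) {
        List<Node> result = new ArrayList<>();
        int n = parent.indices.length;
        // Step 6: the greatest element of S is already in the subset.
        if (parent.indices[n - 1] == setSize - 1) {
            return result;
        }
        for (int i = n - 1; i >= parent.limit; --i) {
            int[] next = parent.indices.clone();
            next[i]++;
            // Resolve conflicts by advancing each colliding neighbour by one.
            for (int j = i + 1; j < n && next[j] == next[j - 1]; ++j) {
                next[j]++;
            }
            result.add(new Node(next, i));
        }
        return result;
    }

    // Walks the whole tree breadth-first and returns every node's index set.
    static List<int[]> enumerate(int setSize, int n) {
        int[] root = new int[n];
        for (int i = 0; i < n; ++i) {
            root[i] = i; // Step 2: the n smallest elements
        }
        List<int[]> all = new ArrayList<>();
        Deque<Node> queue = new ArrayDeque<>();
        queue.add(new Node(root, 0));
        while (!queue.isEmpty()) {
            Node node = queue.poll();
            all.add(node.indices);
            queue.addAll(children(node, setSize));
        }
        return all;
    }

    public static void main(String[] args) {
        // Prints six index sets, one per 2-element subset of a 4-element set.
        for (int[] indices : enumerate(4, 2)) {
            System.out.println(Arrays.toString(indices));
        }
    }
}
```

Because positions only ever move to greater elements of the sorted set, a child's sum is never smaller than its parent's, which is what makes the structure a min-heap by sum.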
22+
23+
## Subset Sum Solution Algorithm
24+
25+
Given an input set I of length N and a target sum S, the algorithm, from start to finish, may be defined as follows:
26+
27+
1. Sort I.
28+
2. Offset each element in I by the absolute value of the least element in I plus one. Store this offset value.
29+
3. Set a variable O to 1. This is the order of the virtual subset heap being searched.
30+
4. While O is less than or equal to N and a subset that sums to S + (offset * O) has not been found:
31+
   1. Perform a binary search for S + (offset * O) on the virtual subset heap of order O.
32+
   2. If a subset is found, then terminate the loop.
33+
   3. If a subset is not found, then increment O by 1 and repeat from Step 4.1. Do this until a subset is found or O is greater than N.
34+
5. If a subset has been found, then subtract the offset value from each element in the located subset. This scaled subset is the returned value. If no subset has been found, then indicate so to the user.
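
The scaling in Steps 2, 4 and 5 rests on one identity: shifting every element by the offset raises the sum of an O-element subset by exactly offset * O. A minimal sketch of those three steps follows. The class `OffsetScaling` and its method names are hypothetical, and returning an offset of 0 when the least element is already positive is an assumption that mirrors the `SubsetSum` constructor rather than the wording of Step 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating Steps 2, 4 and 5; not part of the repository.
class OffsetScaling {

    // Step 2: |least element| + 1, so that every scaled element is at least 1.
    // Assumption: no shift is applied when the least element is already positive.
    static int offsetFor(List<Integer> sorted) {
        int least = sorted.get(0);
        return least <= 0 ? Math.abs(least) + 1 : 0;
    }

    // Adds the offset to every element.
    static List<Integer> scale(List<Integer> values, int offset) {
        List<Integer> scaled = new ArrayList<>();
        for (int value : values) {
            scaled.add(value + offset);
        }
        return scaled;
    }

    // Step 4: a subset of `order` elements gains offset * order when shifted.
    static int scaledTarget(int target, int offset, int order) {
        return target + offset * order;
    }

    // Step 5: undo the shift on a located subset.
    static List<Integer> unscale(List<Integer> subset, int offset) {
        return scale(subset, -offset);
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>(Arrays.asList(8, -3, 5, -7, -2));
        Collections.sort(input); // Step 1
        int offset = offsetFor(input);
        System.out.println("offset = " + offset);
        System.out.println("scaled set = " + scale(input, offset));
        for (int order = 1; order <= input.size(); ++order) {
            System.out.println("order " + order + " searches for " + scaledTarget(0, offset, order));
        }
    }
}
```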
35+
36+
## Algorithmic Complexity
37+
38+
As with the heap-ordered binary tree, the binary search takes a number of lookups logarithmic in the size of the power set, i.e. O(log 2^N) = O(N) lookups, and each lookup is done in a time dependent on the k value. Since each node deleted from the subset tree can have up to N children, the heapification step requires additional computation, with the overhead raising it from O(log k) to O(N log k); a single lookup therefore takes O(N k log k) time, and the O(N) lookups of one binary search take O(N^2 k log k). Since this O(N^2 k log k) operation must be run N times, once per order, in the worst case where no subset exists, the overall complexity is O(N^3 k log k).
39+
40+
## A Simple Example
41+
42+
Take the example set of {−7, −3, −2, 5, 8} with a target sum of 0. The algorithm is executed as follows:
43+
44+
1. {−7, −3, −2, 5, 8} is sorted. (In this case, it was already sorted.)
45+
2. An offset of 8 is applied to {−7, −3, −2, 5, 8}, producing a scaled set of {1, 5, 6, 13, 16}.
46+
3. O is set to 1.
47+
4. A binary search for S + (offset * O), or 8, is performed on the virtual subset heap of order O, or 1.
48+
5. Since no subset is found, O is incremented by 1. As O is not greater than the set size of 5, we may continue our search.
49+
6. A binary search for S + (offset * O), or 16, is performed on the virtual subset heap of order O, or 2.
50+
7. Since no subset is found, O is incremented by 1. As O is not greater than the set size of 5, we may continue our search.
51+
8. A binary search for S + (offset * O), or 24, is performed on the virtual subset heap of order O, or 3.
52+
9. The subset {5, 6, 13} is returned. Subtract the offset from each element to produce the original subset {−3, −2, 5}.
53+
10. Return {−3, −2, 5}.
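
The three searches in this walkthrough can be reproduced with a deliberately naive stand-in for the virtual subset heap: list every subset of the given order, sort the list by sum, and binary search the sums. This is only a sketch for checking the numbers above. The class `ExampleWalkthrough` and its methods are hypothetical, and materialising all subsets is exactly the cost the virtual subset heap is meant to avoid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the virtual subset heap; not part of the repository.
class ExampleWalkthrough {

    // Collects every subset of the requested size by simple recursion.
    static void subsets(List<Integer> set, int size, int start,
                        List<Integer> current, List<List<Integer>> out) {
        if (current.size() == size) {
            out.add(new ArrayList<>(current));
            return;
        }
        for (int i = start; i < set.size(); ++i) {
            current.add(set.get(i));
            subsets(set, size, i + 1, current, out);
            current.remove(current.size() - 1);
        }
    }

    static int sum(List<Integer> subset) {
        int total = 0;
        for (int value : subset) {
            total += value;
        }
        return total;
    }

    // Sorts all subsets of one order by sum, then binary searches the sums.
    // Returns null when no subset of that order hits the scaled target.
    static List<Integer> search(List<Integer> scaledSet, int order, int scaledTarget) {
        List<List<Integer>> all = new ArrayList<>();
        subsets(scaledSet, order, 0, new ArrayList<Integer>(), all);
        Comparator<List<Integer>> bySum = Comparator.comparingInt(ExampleWalkthrough::sum);
        all.sort(bySum);
        int lo = 0;
        int hi = all.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midSum = sum(all.get(mid));
            if (midSum == scaledTarget) {
                return all.get(mid);
            }
            if (midSum < scaledTarget) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Integer> scaled = Arrays.asList(1, 5, 6, 13, 16);
        int offset = 8;
        int target = 0;
        for (int order = 1; order <= scaled.size(); ++order) {
            List<Integer> hit = search(scaled, order, target + offset * order);
            System.out.println("order " + order + ": " + hit);
            if (hit != null) {
                break; // order 3 prints [5, 6, 13]
            }
        }
    }
}
```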
54+
55+
## Java Classes
56+
57+
The provided code will generate a subset heap of a specified order from hardcoded input.
58+
59+
* The TreeNode class has been taken from the GitHub repository [GenericTree](https://github.com/vivin/GenericTree) and maintains a generic node in the tree.
60+
* The SubsetTree class is the executable class that holds the order and input set variables. It specifies a hardcoded k value for finding the kth subset and generates the whole tree.
61+
* The VirtualSubsetTree class generates child nodes only as needed. It builds the tree up to the kth subset in place, which avoids generating the complete subset tree.
62+
* The SubsetSum class incorporates VirtualSubsetTree with the complete algorithm in order to solve the subset sum problem. An example set and target sum have been provided in the class and may be modified as desired.
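
The on-demand generation described for VirtualSubsetTree can be sketched as a best-first walk: keep the frontier in a priority queue keyed by subset sum, and build a node's children only when that node is removed from the queue. The sketch below is an illustration under stated assumptions, not the committed class. `KthSmallestSubset`, `Node` and `kthMin` are hypothetical names, the child rule is the one from the generation algorithm above, and 1 <= n <= N is assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; the class and method names are not from the repository.
class KthSmallestSubset {

    static final class Node {
        final int[] indices; // positions into the sorted set
        final int limit;     // smallest position the parent incremented
        final int sum;       // cached subset sum, used as the queue key

        Node(int[] indices, int limit, int[] sorted) {
            this.indices = indices;
            this.limit = limit;
            int total = 0;
            for (int index : indices) {
                total += sorted[index];
            }
            this.sum = total;
        }
    }

    // Returns the k'th smallest (1-based, by sum) n-element subset of the
    // sorted set, or null when fewer than k subsets exist.
    static List<Integer> kthMin(int[] sorted, int n, int k) {
        Comparator<Node> bySum = (a, b) -> Integer.compare(a.sum, b.sum);
        PriorityQueue<Node> toVisit = new PriorityQueue<>(bySum);
        int[] root = new int[n];
        for (int i = 0; i < n; ++i) {
            root[i] = i;
        }
        toVisit.add(new Node(root, 0, sorted));
        int popped = 0;
        while (!toVisit.isEmpty()) {
            Node node = toVisit.poll();
            if (++popped == k) {
                List<Integer> subset = new ArrayList<>();
                for (int index : node.indices) {
                    subset.add(sorted[index]);
                }
                return subset;
            }
            // Children are generated only for nodes that are actually visited.
            if (node.indices[n - 1] == sorted.length - 1) {
                continue;
            }
            for (int i = n - 1; i >= node.limit; --i) {
                int[] next = node.indices.clone();
                next[i]++;
                for (int j = i + 1; j < n && next[j] == next[j - 1]; ++j) {
                    next[j]++;
                }
                toVisit.add(new Node(next, i, sorted));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] scaled = {1, 5, 6, 13, 16};
        System.out.println(kthMin(scaled, 3, 6)); // prints [5, 6, 13]
    }
}
```

Only the nodes visited before the kth removal, plus their children, are ever created, so small values of k touch a small part of the tree.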
63+
64+
## Further Information
65+
66+
A complete explanation is available on the arXiv at: http://arxiv.org/abs/1512.01727
67+
68+
This paper does a more detailed job of explaining the motivation for the algorithm, the theory behind it, and its complete implementation.

SubsetSum.java

+289
@@ -0,0 +1,289 @@
1+
/**
2+
* Building a Subset Tree (a collection of Subset Heaps) of size n from set S:
3+
* 1. Sort the set
4+
* 2. Set the root node to the n smallest elements of S
5+
* 3. The first child is the subset of the parent node with the greatest element replaced by
6+
* the next greatest element in S. If no such greater element exists, then you have reached
7+
* the end of S and this child node should not be generated.
8+
* 4. Let the variable i be equal to the second greatest index in the subset of the parent
9+
* node. The next child is the subset of the parent node with the element at index i
10+
* replaced by the next greatest element in S. If that element already exists in the subset,
11+
* then increment the conflicting element with its next greatest element in S. Repeat with
12+
* any additional conflicts until none exist or there is no greater element in S with which
13+
* to increment. If the latter is the case, then you have reached the end of S and this
14+
* child node should not be generated.
15+
* 5. Decrement i by 1 and repeat Step 4 for all subsequent children until i is less than the
16+
* smallest index at which the parent incremented its subset value.
17+
* 6. The algorithm terminates when no additional incrementations can be made (i.e. when the
18+
* greatest element in the subset is equal to the greatest element in S).
19+
* You will now have a min-heap of all subsets of length n from set S.
20+
*
21+
* This class, SubsetSum, does not build the full Subset Tree, instead constructing it on demand
22+
* when searching for the k'th smallest element in the tree. All elements are scaled to be positive,
23+
* with the necessary offset that was applied to each element being stored in a variable. Each
24+
* n-sized Subset Tree is then searched for the target sum. The target sum is appropriately
25+
* incremented for each n-sized Subset Tree.
26+
*
27+
* @author Dan Shea
28+
*/
29+
30+
import java.util.ArrayList;
31+
import java.util.Collections;
32+
import java.util.Comparator;
33+
import java.util.List;
34+
import java.util.PriorityQueue;
35+
36+
public class SubsetSum {
37+
38+
public ArrayList<Integer> input = new ArrayList<Integer>();
39+
public ArrayList<Integer> scaledInput = new ArrayList<Integer>();
40+
public TreeNode<List<Integer>> tree;
41+
42+
public SubsetSum() {
43+
input.add( -7 );
44+
input.add( -3 );
45+
input.add( -2 );
46+
input.add( 0 );
47+
input.add( 1 );
48+
input.add( 5 );
49+
input.add( 8 );
50+
input.add( 17 );
51+
input.add( 21 );
52+
53+
// The sum for which we want to find an appropriate subset
54+
int target = 50;
55+
56+
// 1. Sort the set
57+
Collections.sort( input );
58+
59+
// Scale the set so that there are no negative elements, with the offset
60+
// being stored in a variable
61+
int offset = 0;
62+
if ( input.get( 0 ) <= 0 ) {
63+
offset = Math.abs( input.get( 0 ) ) + 1;
64+
} // offset stays 0 when every element is already positive
65+
for ( int i = 0; i < input.size( ); ++i ) {
66+
scaledInput.add( input.get( i ) + offset );
67+
}
68+
69+
// Search all n-sized min-heaps from set S for the target sum
70+
int scaledTarget;
71+
boolean targetFound = false;
72+
List<Integer> result = new ArrayList<Integer>( );
73+
int resultSum = 0;
74+
for ( int n = 1; n <= input.size( ); ++n ) {
75+
// Scale the target to be searchable in the n-sized Subset Tree
76+
scaledTarget = target + ( offset * n );
77+
78+
// Build the root of the n-sized Subset Tree, then binary search its k'th smallest subsets for the scaled target
79+
tree = buildTree( scaledInput, n );
80+
result = binarySearch( tree, 0, choose( input.size( ), n ), scaledTarget );
81+
if ( result == null ) {
82+
continue;
83+
}
84+
if ( sum( result ) == scaledTarget ) {
85+
targetFound = true;
86+
break;
87+
}
88+
}
89+
90+
// Output the result
91+
if ( targetFound ) {
92+
for ( int i = 0; i < result.size( ); ++i ) {
93+
result.set( i, result.get( i ) - offset );
94+
}
95+
System.out.println( "Subset that sums to the target sum has been found!" );
96+
printList( result );
97+
}
98+
else {
99+
System.out.println( "No subset sums to the target sum " + target );
100+
}
101+
}
102+
103+
public List<Integer> binarySearch( TreeNode<List<Integer>> tree, int lowerBound,
104+
int upperBound, int target ) {
105+
if ( lowerBound > upperBound ) {
106+
return null;
107+
}
108+
int position = Math.round( ( lowerBound + upperBound ) / 2.0f );
109+
if ( position == 0 ) {
110+
return null;
111+
}
112+
List<Integer> kthMin = findKthMin( tree, position );
113+
int sum = sum( kthMin );
114+
if ( lowerBound == upperBound ) {
115+
if ( sum == target ) {
116+
return kthMin;
117+
}
118+
else {
119+
return null;
120+
}
121+
}
122+
else {
123+
if ( sum == target ) {
124+
return kthMin;
125+
}
126+
else if ( sum < target ) {
127+
if ( lowerBound == position ) {
128+
position += 1;
129+
}
130+
return binarySearch( tree, position, upperBound, target );
131+
}
132+
else {
133+
if ( upperBound == position ) {
134+
position -= 1;
135+
}
136+
return binarySearch( tree, lowerBound, position, target );
137+
}
138+
}
139+
}
140+
141+
public TreeNode<List<Integer>> buildTree( ArrayList<Integer> list, int n ) {
142+
TreeNode<List<Integer>> tree = new TreeNode<List<Integer>>();
143+
144+
ArrayList<Integer> idcs = new ArrayList<Integer>();
145+
for ( int i = 0; i < n; ++i ) {
146+
idcs.add( i );
147+
}
148+
149+
// 2. Set the root node to the n smallest elements of S
150+
tree.setData( list.subList( 0, n ) );
151+
tree.setIndices( idcs );
152+
tree.setLimit( 0 );
153+
154+
return tree;
155+
}
156+
157+
/*
158+
* In this case, the algorithm terminates after the first layer of children is created;
159+
* the recursive case is not run
160+
*/
161+
public List<TreeNode<List<Integer>>> buildChildren( List<Integer> list, List<Integer> indices, int limit ) {
162+
ArrayList<TreeNode<List<Integer>>> children = new ArrayList<TreeNode<List<Integer>>>();
163+
164+
// 3. The first child is the subset of the parent node with the greatest element replaced by
165+
// the next greatest element in S. If no such greater element exists, then you have reached
166+
// the end of S and this child node should not be generated.
167+
// 4. Let the variable i be equal to the second greatest index in the subset of the parent
168+
// node. The next child is the subset of the parent node with the element at index i
169+
// replaced by the next greatest element in S. If that element already exists in the subset,
170+
// then increment the conflicting element with its next greatest element in S. Repeat with
171+
// any additional conflicts until none exist or there is no greater element in S with which
172+
// to increment. If the latter is the case, then you have reached the end of S and this
173+
// child node should not be generated.
174+
// 5. Decrement i by 1 and repeat Step 4 for all subsequent children until i is less than the
175+
// smallest index at which the parent incremented its subset value.
176+
for ( int i = list.size( ) - 1; i >= limit; --i ) {
177+
// 6. The algorithm terminates when no additional incrementations can be made (i.e. when the
178+
// greatest element in the subset is equal to the greatest element in S).
179+
if ( indices.get( indices.size( ) - 1 ) == input.size( ) - 1 ) {
180+
continue;
181+
}
182+
TreeNode<List<Integer>> child = new TreeNode<List<Integer>>();
183+
ArrayList<ArrayList<Integer>> tmpArr = incrementList( list, indices, i );
184+
child.setData( tmpArr.get( 0 ) );
185+
child.setIndices( tmpArr.get( 1 ) );
186+
child.setLimit( i );
187+
children.add( child );
188+
}
189+
190+
return children;
191+
}
192+
193+
public ArrayList<ArrayList<Integer>> incrementList( List<Integer> list, List<Integer> indices, int idx ) {
194+
if ( list.size( ) < 1 ) {
195+
return null;
196+
}
197+
ArrayList<Integer> newList = deepCopy( list );
198+
ArrayList<Integer> newIndices = deepCopy( indices );
199+
if ( list.size( ) == 1 ) {
200+
if ( idx < list.size( ) ) {
201+
newList.set( idx, scaledInput.get( indices.get( idx ) + 1 ) );
202+
newIndices.set( idx, indices.get( idx ) + 1 );
203+
}
204+
else {
205+
return null;
206+
}
207+
}
208+
else {
209+
do {
210+
newList.set( idx, scaledInput.get( indices.get( idx ) + 1 ) );
211+
newIndices.set( idx, indices.get( idx ) + 1 );
212+
++idx;
213+
} while ( idx < list.size( ) && newIndices.get( idx - 1 ).equals( indices.get( idx ) ) ); // equals( ): == on Integer compares references
214+
if ( newIndices.get( newIndices.size( ) - 1 ).equals( newIndices.get( newIndices.size( ) - 2 ) ) ) {
215+
return null;
216+
}
217+
}
218+
ArrayList<ArrayList<Integer>> retVal = new ArrayList<ArrayList<Integer>>();
219+
retVal.add( newList );
220+
retVal.add( newIndices );
221+
return retVal;
222+
}
223+
224+
public ArrayList<Integer> deepCopy( List<Integer> list ) {
225+
ArrayList<Integer> newList = new ArrayList<Integer>();
226+
for ( int i = 0; i < list.size(); ++i ) {
227+
newList.add( list.get( i ) );
228+
}
229+
return newList;
230+
}
231+
232+
@SuppressWarnings("unchecked")
233+
public List<Integer> findKthMin( TreeNode<List<Integer>> tree, int k ) {
234+
Comparator<TreeNode<List<Integer>>> comparator = new NodeComparator();
235+
PriorityQueue<TreeNode<List<Integer>>> toVisit = new PriorityQueue<TreeNode<List<Integer>>>( 11, comparator );
236+
TreeNode<List<Integer>> root = new TreeNode<List<Integer>>( tree.getData( ) );
237+
root.setIndices( tree.getIndices( ) );
238+
root.setLimit( tree.getLimit( ) );
239+
root.setChildren( tree.getChildren( ) );
240+
toVisit.add( root );
241+
ArrayList<TreeNode<List<Integer>>> smallestNodes = new ArrayList<TreeNode<List<Integer>>>( );
242+
while ( smallestNodes.size( ) < k ) {
243+
TreeNode<List<Integer>> node = toVisit.poll( );
244+
List<TreeNode<List<Integer>>> children = buildChildren( node.getData( ), node.getIndices( ),
245+
node.getLimit( ) );
246+
for ( TreeNode<List<Integer>> child : children ) {
247+
TreeNode<List<Integer>> newChild = new TreeNode<List<Integer>>( child.getData( ) );
248+
newChild.setIndices( child.getIndices( ) );
249+
newChild.setLimit( child.getLimit( ) );
250+
newChild.setChildren( child.getChildren( ) );
251+
toVisit.add( newChild );
252+
}
253+
smallestNodes.add( node );
254+
}
255+
return smallestNodes.get( k - 1 ).getData( );
256+
}
257+
258+
public int sum( List<Integer> node ) {
259+
int sum = 0;
260+
for ( Integer n : node ) {
261+
sum += n;
262+
}
263+
return sum;
264+
}
265+
266+
public void printList( List<Integer> list ) {
267+
for ( int i = 0; i < list.size( ); ++i ) {
268+
System.out.print( list.get( i ) + "\t" );
269+
}
270+
System.out.println( );
271+
}
272+
273+
public static int factorial( int a ) {
274+
int answer = 1;
275+
for( int i = 1; i <= a; ++i) {
276+
answer *= i;
277+
}
278+
return answer;
279+
}
280+
281+
public static int choose( int n, int k ) {
282+
return ( factorial( n ) / ( factorial( k ) * factorial( n - k ) ) ); // caution: factorial( ) overflows int for arguments >= 13
283+
}
284+
285+
public static void main(String[] args) {
286+
new SubsetSum( );
287+
}
288+
289+
}
