# Collecting all solutions in a recursive branching algorithm

Moved this from the New to Julia to a more fitting section

I’m trying to translate a divide and conquer algorithm for a modified subset sum problem from Python. A simplified version might look like this: You are given an array of structs with a mass attribute, and a float target. Your task is to return all of the subsets of the input array whose mass adds up to the target, save for some small allowed error.

The model solution is to recursively traverse the entire array, branching off on each element, either including it in the solution and lowering the target by its mass, or skipping it. If you want to find every solution, the naive algorithm just assembles the solutions from both branches:

``````function subset_sum(objects, target, chosen)
if very_close_to_zero(target)
return [chosen]
elseif target < 0 || length(objects) == 0
return []
else
first, rest = objects[1], objects[2:end]
return [
subset_sum(rest target, chosen) ;
subset_sum(rest, target - first.mass, push(chosen, first))
]
end
end
``````

But I assume this will be very allocation heavy. Is there a better Julian-esque solution? In Python I ended up doing basically this, only passing down the accumulator as a tuple because of the immutability, and appending `chosen` to a global result array instead of returning it.

Note that although dynamic programming would help in this model problem, my specific task doesn’t mesh well with it because I have to keep track of a lot of additional stuff, like limited number of skipped objects, each object actually having several possible masses (and I can choose whichever one I need), etc. If you’re interested, I’d be glad to be proven wrong in this regard, but it might be better suited for another discussion.

`objects[2:end]` allocates an array.
`push(chosen, first)` allocates an array.
This is the question of copying vs trailing in recursive search. I would go the trailing way to avoid allocating :
Don’t touch `objects` but keep a variable for the object you are considering. Never allocate a new `chosen` vector, but instead, pass `chosen`, do the first recursive call, add the new object to `chosen`, do the second recursive call, remove the object from chosen and return the solutions. Use in place operators like `push!` and `pop!` to avoid allocating arrays, and at the beginning, use `hintsize!` to give the maximal size of `chosen`.

As your function will mutate chosen, you should call it `subset_sum!` to observe the julia naming convention .

I’m also suspecting that this allocates every time, you should pass a solution accumulator along you recursive calls, but maybe there is another way…

3 Likes

I’me modified my code according to your suggestions, thanks! Now it’s something like the following:

``````function subset_sum(objects, target_mass)
function go!(index, remaining_mass, chosen, solutions)
if remaining_mass ≈ 0
# This copies |chosen| objects
# That's ok, assuming there won't be too many solutions
push!(solutions, objects[chosen])
elseif remaining_mass < 0 || index == length(objects) + 1
solutions
else
go!(index + 1, remaining_mass, chosen, solutions)
go!(
index + 1,
remaining_mass - objects[index].mass,
push!(chosen, index),
solutions,
)
pop!(chosen)
solutions
end
end

chosen = []
sizehint!(chosen, length(objects))
go!(1, target_mass, chosen, [])
end
``````

Looking at the output of `@time`, there still might be some unneeded allocations…

``````> objects40 = map(i -> (mass = i, ), rand(1.0:0.1:5000.0, 40))
> @time Precursor.subset_sum(objects40, 15_000)

24.818297 seconds (504.47 M allocations: 7.517 GiB, 5.86% gc time)
277-element Vector{Any}: [...]
``````

It might also just be all of those pushes and pops. I will investigate this, but I though I might as well copy it here so that you all have an opportunity to think about this, too.

EDIT: It definitely aren’t the pushes and pops (not that I thought that likely in the first place). When I stop modifying chosen — which of course gives me incorrect solutions — the runtime is very similar, and the allocations are basically the same:

`````` 21.103157 seconds (504.53 M allocations: 7.521 GiB, 6.95% gc time, 0.27% compilation time)
``````

EDIT2: When I strip the `chosen` and `solutions` (so that I’m only dry-running the algorithm until I reach the final mass), there’s still a bunch of allocations, although the runtime is a bit quicker.

``````function subset_sum(objects, target_mass)
function go!(index, remaining_mass)
if remaining_mass ≈ 0
return
elseif remaining_mass < 0 || index == length(objects) + 1
return
else
go!(index + 1, remaining_mass)
go!(index + 1, remaining_mass - objects[index].mass)
return
end
end

go!(1, target_mass)
end
``````

with `@time` giving

``````18.510571 seconds (504.52 M allocations: 7.520 GiB, 7.50% gc time, 0.18% compilation time)
``````

Seems like the recursive calls themselves are pretty costly. Converting to a tail-call unfortunately isn’t possible in my case.

EDIT3: Interestignly enough, pulling out the definition of `go!` into a top-level function cuts the execution time by half.

Note that `[]` is an empty `Vector{Any}` which isn’t efficient. You want to use, say,`Float64[]` instead if your array will / should hold floating point values.

Thanks for the tip. The array `chosen` holds integers, and `solutions` holds arrays of some the objects from the input array `objects`, whatever they might be. Does it make sense to specialise the type of `solutions` as well? (And how should I do it properly?)

By the way, note my latest edit on the previous post. Over 7GB of allocations even though I’m only subtracting some numbers.

EDIT: The performance seems to be roughly the same even with the types of both arrays specified. For the `solutions` array, I did it like this:

``````function subset_sum(objects::Objects, target_mass, segments) where Objects
solutions = Objects[]
chosen = UInt16[]
[...]
end
``````

Using the Combinatorics.jl package, you can presumably do something like:

``````using Combinatorics
function subsets_sum(objects, target, tol)
results = Vector{eltype(objects)}[]
for c in combinations(objects)
if abs(sum(o -> o.mass, c) - target) ≤ tol
push!(results, c)
end
end
return results
end
``````

(Though this has suboptimal efficiency because it doesn’t do any dynamic programming or pruning of the combinations, but at least the code is short and clear.)

This is equivalent to `remaining_mass == 0` because you didn’t specify an absolute tolerance.

See the `isapprox` documentation comments on “Note that `x ≈ 0` …”.

Thanks. By the way, any idea why the multi-threaded version is only slighly (13s vs 15s) faster than the single-threaded one?

``````
function subset_sum(objects :: Objects, target_mass, segments) where Objects
# solutions are only written to, can be shared among threads
solutions = Objects[]

# The number of threads is 8, number of objects 40
# chosen is written to and read, has to be unique at least for each thread
chosen = UInt16[]
sizehint!(chosen, length(objects))
# true means that go! has to pick in beginning-th object to chosen
go!(objects, beginning, target_mass, true, chosen, solutions)
end
solutions
end

``````
``````function subset_sum(objects, target_mass)
function go!(objects, index, remaining_mass)
if remaining_mass ≈ 0
return
elseif remaining_mass < 0 || index == length(objects) + 1
return
else
go!(objects, index + 1, remaining_mass)
go!(objects, index + 1, remaining_mass - objects[index])
return
end
end
go!(objects, 1, target_mass)
end
function go!(objects, index, remaining_mass)
if remaining_mass ≈ 0
return
elseif remaining_mass < 0 || index == length(objects) + 1
return
else
go!(objects, index + 1, remaining_mass)
go!(objects, index + 1, remaining_mass - objects[index])
return
end
end

subset_sum2(objects, target_mass) = go!(objects, 1, target_mass)
objects = [rand() for i in 1:27]
target = 17

@time subset_sum(objects, target)
@time subset_sum2(objects, target)

julia> @time subset_sum(objects, target)
16.628330 seconds (268.44 M allocations: 4.000 GiB, 3.06% gc time)

julia> @time subset_sum2(objects, target)
2.857016 seconds
``````

There is indeed a big difference. This is maybe due to this issue, but I’m not sure.

1 Like

Here is what I came up with: first I tried to type the program to better understand it and get rid of `Vector{Any}`. Then I mostly follows @etienne_dg’s advice. The program

``````using BenchmarkTools

function push(v::Vector{T}, e::T) where T
vp = copy(v)
push!(vp, e)
end

function subset_sum1(objects, target, chosen)
if target ≈ 0
return [chosen]
elseif target < 0 || length(objects) == 0
return []
else
first, rest = objects[1], objects[2:end]
return [
subset_sum1(rest, target, chosen) ;
subset_sum1(rest, target - first.mass, push(chosen, first))
]
end
end

function subset_sum2(objects::Vector{T}, target, chosen::Vector{T}) where T
if target ≈ 0
return [chosen]
elseif target < 0 || length(objects) == 0
return Vector{T}[]
else
first, rest = objects[1], objects[2:end]
return Vector{T}[
subset_sum1(rest, target, chosen) ;
subset_sum1(rest, target - first.mass, push(chosen, first))
]
end
end

function subset_sum3(objects::Vector{T}, target, chosen::Vector{T}, solutions::Vector{Vector{T}}) where T
if target ≈ 0
push!(solutions, copy(chosen))
elseif target > 0 && length(objects) > 0
first = pop!(objects)
subset_sum3(objects, target, chosen, solutions)
push!(chosen, first)
subset_sum3(objects, target - first.mass, chosen, solutions)
pop!(chosen)
push!(objects, first)
end
end
function subset_sum3(objects::Vector{T}, target, chosen::Vector{T}) where T
solutions = Vector{T}[]
subset_sum3(reverse(objects), target, chosen, solutions)
solutions
end

objects = map(i -> (mass = i, ), rand(1.0:0.1:5000.0, 30))
ans1 = subset_sum1(objects, 15_000, [])
println(length(ans1))
ans2 = subset_sum2(objects, 15_000, eltype(objects)[])
println(length(ans2))
@assert ans2 == ans1
ans3 = subset_sum3(objects, 15_000, eltype(objects)[])
println(length(ans3))
@assert ans3 == ans2

@btime subset_sum1(objects, 15_000, [])
@btime subset_sum2(objects, 15_000, eltype(objects)[])
@btime subset_sum3(objects, 15_000, eltype(objects)[])

objects = map(i -> (mass = i, ), rand(1.0:0.1:5000.0, 40))
@time ans3 = subset_sum3(objects, 15_000, eltype(objects)[])
println(length(ans3))
``````

shows in the console

``````11
11
11
1.203 s (35966168 allocations: 3.09 GiB)
1.187 s (29971807 allocations: 3.00 GiB)
162.768 ms (20 allocations: 2.56 KiB)
3.679973 seconds (332 allocations: 58.641 KiB)
317
``````

I don’t have a way to test this atm, but does it give back correct solutions? Let’s say on objects being 5 things with masses 5,10,15,20,25, the target being 30? If I’m not mistaken, the `chosen` only get extended and the changes propagate into all future calls of subset_sum

I checked it with `@assertion` for small test cases. Please see the posted code.

I finally got to my Mac. Sure, the impelemntations are consistent with each other, but neither gives the correct results. Consider:

``````> objects_simple = map(i -> (mass = i, ), 5:5:25)
> subset_sum3(objects_simple, 30, eltype(objects_simple)[])
3-element Vector{Vector{NamedTuple{(:mass,), Tuple{Int64}}}}:
[(mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)  …  (mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)]
[(mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)  …  (mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)]
[(mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)  …  (mass = 25,), (mass = 20,), (mass = 25,), (mass = 15,), (mass = 25,), (mass = 20,), (mass = 10,), (mass = 25,), (mass = 20,), (mass = 15,)]
``````

That’s because everything ends up being in `chosen`, without getting removed.

I didn’t see `push` in your MWE and assumed `push!` was meant? I imagine it is something along the lines of

``````function push(v::Vector{T}, e::T) where T
vp = copy(v)
push!(vp, e)
end
``````

then? Will edit the post with a corrected version. Done.