# Minimizing allocations for custom powerset iterator

Just to preface: I realize both Combinatorics.jl and IterTools.jl implement an iterator for powersets. This question is mostly for improving my own Julia skills.

I’m trying to iterate through a powerset as efficiently as possible. I’ve come up with the following scheme:

``````julia
import Base: show, iterate, eltype, length

struct powersetIter{T<:AbstractVector}
    set::T
    powersetSize::Int
    mask::BitVector
end

powersetIter(set) = powersetIter(set, 2^length(set), falses(length(set)))

function iterate(S::powersetIter, state=0)
    if state == S.powersetSize
        return nothing
    end
    for i in 1:length(S.set)
        if (state & 1 << (i-1)) > 0
            S.mask[i] = true
        else
            S.mask[i] = false
        end
    end
    return (S.set[S.mask], state + 1)
end

length(S::powersetIter) = S.powersetSize
eltype(::Type{powersetIter{T}}) where T = Vector{eltype(T)}
``````

Basically it uses the binary representation of the `state` variable to update what the next subset should be.
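To make that concrete, here is a small standalone illustration of the bit-to-element mapping (the variable names are just for demonstration):

``````julia
# state = 5 is 0b101, so bits 1 and 3 are set and the subset
# contains the 1st and 3rd elements of the set.
set = [10, 20, 30]
state = 5
mask = [(state & 1 << (i - 1)) > 0 for i in 1:length(set)]  # [true, false, true]
subset = set[mask]                                          # [10, 30]
``````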

However, it uses a lot more memory than I expected, because a new vector is allocated every time the subset is materialized with `S.set[S.mask]`.

``````julia
using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    for v in powersetIter(1:n)
        counter += 1
    end
    return counter
end

julia> @btime f(20)
136.901 ms (2097154 allocations: 172.00 MiB)
1048576 #2^20
``````

Is there a way to tell Julia to reuse a pre-allocated space (of total size `length(S.set)`) for all of the iterated subsets?

I compared your iterator with the others you mentioned

``````julia
using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    for _ in powersetIter(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)

using IterTools

function f(n)
    counter = 0
    for _ in subsets(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)

using Combinatorics

function f(n)
    counter = 0
    for _ in powerset(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)
``````

yielding

``````
  134.698 ms (2097154 allocations: 172.00 MiB) # OP
  182.991 ms (4194305 allocations: 332.00 MiB) # IterTools.subsets
   87.930 ms (2097176 allocations: 172.00 MiB) # Combinatorics.powerset
``````

So it looks to me like you are near optimal with regard to the standard iterator protocol. OTOH, maybe you can use a custom "in-place" iteration protocol to reduce allocations?

Edit: I tried it and indeed one can reduce allocations that way, but performance is slightly worse.

I’ve also played around with an “in-place” version, but the reallocation/resizing of vectors always costs one allocation per iteration. I think the remaining speed can be gained by not actually creating a slice or a view of a vector, but instead returning another custom iterator just for returning the “set” of any given iteration.
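To sketch that last idea (all names here are hypothetical, not from any package): instead of materializing `S.set[S.mask]`, `iterate` could return a lightweight wrapper around the set and the current state, which lazily yields the elements whose bits are set:

``````julia
# Hypothetical sketch: a lazy "subset view" that walks the state's bit
# pattern and yields the selected elements directly, with no allocation.
struct SubsetView{T<:AbstractVector}
    set::T
    bits::Int   # the iteration state whose set bits select the elements
end

# the length is not known cheaply up front, so declare the size unknown
Base.IteratorSize(::Type{<:SubsetView}) = Base.SizeUnknown()

function Base.iterate(v::SubsetView, i=1)
    while i <= length(v.set)
        if (v.bits & 1 << (i - 1)) > 0
            return (v.set[i], i + 1)
        end
        i += 1
    end
    return nothing
end
``````

The outer `iterate` would then return `(SubsetView(S.set, state), state + 1)` instead of slicing; for example, `collect(SubsetView(1:4, 5))` gives `[1, 3]`.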


For reference, here is what I have

``````julia
import Base: show, iterate, eltype, length

struct powersetIter{T}
    set::T
    powersetSize::Int
    mask::BitVector
end

powersetIter(set) = powersetIter(set, 2^length(set), falses(length(set)))

function iterate(S::powersetIter, state=0)
    if state == S.powersetSize
        return nothing
    end
    for i in 1:length(S.set)
        if (state & 1 << (i-1)) > 0
            S.mask[i] = true
        else
            S.mask[i] = false
        end
    end
    return (S.set[S.mask], state + 1)
end

function inplaceiterate(S::powersetIter, item::Vector{T}, state=0) where {T}
    if state == S.powersetSize
        return nothing
    end

    empty!(item)
    for i in 1:length(S.set)
        (state & 1 << (i-1)) > 0 && push!(item, S.set[i])
    end

    return state + 1
end

length(S::powersetIter) = S.powersetSize
eltype(::Type{powersetIter{T}}) where T = Vector{eltype(T)}

using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    iter = powersetIter(1:n)
    next = iterate(iter)
    while next !== nothing
        (item, state) = next
        counter += 1
        next = iterate(iter, state)
    end
    return counter
end

@btime f(20)

function f(n)
    counter = 0
    iter = powersetIter(1:n)
    item = Vector{Int}()
    state = inplaceiterate(iter, item)
    while state !== nothing
        counter += 1
        state = inplaceiterate(iter, item, state)
    end
    return counter
end

@btime f(20)
``````

yielding

``````
  127.257 ms (2097154 allocations: 172.00 MiB) # iterate
  138.315 ms (5 allocations: 576 bytes)        # inplaceiterate
``````

Interesting idea.

Edit: Interesting, if we increase memory pressure things start to change. For `f(26)` I see

``````
  11.236 s (134217730 allocations: 12.25 GiB) # iterate
   9.976 s (5 allocations: 576 bytes)         # inplaceiterate
``````

This is some really great input. Based on @goerch's benchmarks, it's possible that an in-place iterator for a powerset is only beneficial for larger sets. I'm guessing that's because you eventually hit garbage collection.

It looks like Combinatorics.jl uses a different approach to iterate through the powerset: it exploits the fact that chaining the combinations of every size gives the powerset:

``````julia
# from Combinatorics.powerset
itrs = [combinations(a, k) for k = min:max]
``````
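That chaining works because the fixed-size subsets for `k = 0:n` partition the powerset, so the counts add up to `2^n` (a quick Base-only sanity check):

``````julia
# The k-element subsets of an n-element set partition the powerset,
# hence the binomial coefficients sum to 2^n.
n = 5
@assert sum(binomial(n, k) for k in 0:n) == 2^n
``````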

Also worth noting that IterTools.jl has a method for statically sized subsets:

``````julia
using IterTools, Combinatorics
using BenchmarkTools

function f_iter(set)
    counter = 0
    for _ in IterTools.subsets(set, Val{2}())
        counter += 1
    end
    return counter
end

function f_combin(set)
    counter = 0
    for _ in Combinatorics.combinations(set, 2)
        counter += 1
    end
    return counter
end

@btime f_iter(1:20) #219.285 ns (0 allocations: 0 bytes)
@btime f_combin(1:20) #8.667 μs (381 allocations: 20.86 KiB)
``````

Ideally I feel like you could leverage these two ideas, but in practice it doesn’t seem to help that much.

``````julia
function f(n)
    counter = 0
    for iters in (subsets(1:n, Val{i}()) for i in 0:n)
        for sets in iters
            counter += 1
        end
    end
    return counter
end

@btime f(26) # 7.265 s (134217780 allocations: 21.26 GiB)
``````

Mostly because the type of `sets` changes with each value of `i`, which makes the loop type-unstable.
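A standard workaround for that kind of type instability is a function barrier: move the inner loop into its own function, so Julia compiles a specialized, type-stable method for each concrete iterator type. A generic sketch (not tied to IterTools; `count_iter` and `count_all` are names made up here):

``````julia
# The call into count_iter is dynamically dispatched once per iterator,
# but the hot loop inside it is type-stable for each concrete type.
function count_iter(itr)
    c = 0
    for _ in itr
        c += 1
    end
    return c
end

function count_all(iters)
    total = 0
    for itr in iters        # itr's type changes from element to element
        total += count_iter(itr)
    end
    return total
end
``````

For example, `count_all((1:3, "ab", (4, 5)))` returns `7`; applied to the generator of `subsets(1:n, Val{i}())` iterators, only the outer dispatch pays the dynamic cost.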


One remark (and/or question): I’m not sure if we hit GC with this problem size yet. Wouldn’t `@btime` show something like `gc time` then or is it different from `@time` regarding GC?

Edit: it doesn’t look like `@btime` is reporting `gc time` (just tested with a GC forcing example). Ouch.


Hm, I'm not sure. It looks like by default `@btime` does not display GC time:

``````
@btime f(26)
  10.134 s (134217730 allocations: 12.25 GiB)

@time f(26)
10.296567 seconds (134.22 M allocations: 12.250 GiB, 7.16% gc time)
``````

GC was actually one of my original concerns for the problem this question came from because I was running things in parallel.


Hm, sounds like it is time for me to switch to `@benchmark` then…

Thanks!
