# Minimizing allocations for custom powerset iterator

Just to preface: I realize both Combinatorics.jl and IterTools.jl implement an iterator for powersets. This question is mostly for improving my own Julia skills.

I’m trying to iterate through a powerset as efficiently as possible. I’ve come up with the following scheme:

``````julia
import Base: show, iterate, eltype, length

struct powersetIter{T<:AbstractVector}
    set::T
    powersetSize::Int
    mask::BitVector
end

powersetIter(set) = powersetIter(set, 2^length(set), falses(length(set)))

function iterate(S::powersetIter, state=0)
    if state == S.powersetSize
        return nothing
    end
    for i in 1:length(S.set)
        if (state & 1 << (i-1)) > 0
            S.mask[i] = true
        else
            S.mask[i] = false
        end
    end
    return (S.set[S.mask], state + 1)
end

length(S::powersetIter) = S.powersetSize
eltype(::Type{powersetIter{T}}) where T = Vector{eltype(T)}
``````

Basically it uses the binary representation of the `state` variable to update what the next subset should be.
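To make that concrete, here is a small standalone illustration of the bit-to-element mapping (the variable names are just for demonstration):

``````julia
# state = 5 is 0b101, so bits 1 and 3 are set and the subset
# contains the 1st and 3rd elements of the set.
set = [10, 20, 30]
state = 5
mask = [(state & 1 << (i - 1)) > 0 for i in 1:length(set)]  # [true, false, true]
subset = set[mask]                                          # [10, 30]
``````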

However, it uses a lot more memory than I expected, because a new vector is allocated every time the subset is materialized with `S.set[S.mask]`.

``````julia
using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    for v in powersetIter(1:n)
        counter += 1
    end
    return counter
end

julia> @btime f(20)
136.901 ms (2097154 allocations: 172.00 MiB)
1048576 #2^20
``````

Is there a way to tell Julia to reuse a pre-allocated space (of total size `length(S.set)`) for all of the iterated subsets?

I compared your iterator with the others you mentioned

``````julia
using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    for _ in powersetIter(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)

using IterTools

function f(n)
    counter = 0
    for _ in subsets(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)

using Combinatorics

function f(n)
    counter = 0
    for _ in powerset(1:n)
        counter += 1
    end
    return counter
end

@btime f(20)
``````

yielding

``````
  134.698 ms (2097154 allocations: 172.00 MiB) # OP
  182.991 ms (4194305 allocations: 332.00 MiB) # IterTools.subsets
   87.930 ms (2097176 allocations: 172.00 MiB) # Combinatorics.powerset
``````

So it looks to me like you are near optimal with regard to the standard iterator protocol. OTOH, maybe you can use a custom "in-place" iteration protocol to reduce allocations?

Edit: I tried it and indeed one can reduce allocations that way, but performance is slightly worse.

I’ve also played around with an “in-place” version, but the reallocation/resizing of vectors always costs one allocation per iteration. I think the remaining speed can be gained by not actually creating a slice or a view of a vector, but instead returning another custom iterator just for returning the “set” of any given iteration.
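To sketch that last idea (all names here are hypothetical, not from any package): instead of materializing `S.set[S.mask]`, `iterate` could return a lightweight wrapper around the set and the current state, which lazily yields the elements whose bits are set:

``````julia
# Hypothetical sketch: a lazy "subset view" that walks the state's bit
# pattern and yields the selected elements directly, with no allocation.
struct SubsetView{T<:AbstractVector}
    set::T
    bits::Int   # the iteration state whose set bits select the elements
end

# the length is not known cheaply up front, so declare the size unknown
Base.IteratorSize(::Type{<:SubsetView}) = Base.SizeUnknown()

function Base.iterate(v::SubsetView, i=1)
    while i <= length(v.set)
        if (v.bits & 1 << (i - 1)) > 0
            return (v.set[i], i + 1)
        end
        i += 1
    end
    return nothing
end
``````

The outer `iterate` would then return `(SubsetView(S.set, state), state + 1)` instead of slicing; for example, `collect(SubsetView(1:4, 5))` gives `[1, 3]`.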


For reference, here is what I have

``````julia
import Base: show, iterate, eltype, length

struct powersetIter{T}
    set::T
    powersetSize::Int
    mask::BitVector
end

powersetIter(set) = powersetIter(set, 2^length(set), falses(length(set)))

function iterate(S::powersetIter, state=0)
    if state == S.powersetSize
        return nothing
    end
    for i in 1:length(S.set)
        if (state & 1 << (i-1)) > 0
            S.mask[i] = true
        else
            S.mask[i] = false
        end
    end
    return (S.set[S.mask], state + 1)
end

function inplaceiterate(S::powersetIter, item::Vector{T}, state=0) where {T}
    if state == S.powersetSize
        return nothing
    end

    empty!(item)
    for i in 1:length(S.set)
        (state & 1 << (i-1)) > 0 && push!(item, S.set[i])
    end

    return state + 1
end

length(S::powersetIter) = S.powersetSize
eltype(::Type{powersetIter{T}}) where T = Vector{eltype(T)}

using BenchmarkTools

# just a simple function to loop through the powerset
function f(n)
    counter = 0
    iter = powersetIter(1:n)
    next = iterate(iter)
    while next !== nothing
        (item, state) = next
        counter += 1
        next = iterate(iter, state)
    end
    return counter
end

@btime f(20)

function f(n)
    counter = 0
    iter = powersetIter(1:n)
    item = Vector{Int}()
    state = inplaceiterate(iter, item)
    while state !== nothing
        counter += 1
        state = inplaceiterate(iter, item, state)
    end
    return counter
end

@btime f(20)
``````

yielding

``````
  127.257 ms (2097154 allocations: 172.00 MiB) # iterate
  138.315 ms (5 allocations: 576 bytes)        # inplaceiterate
``````

Interesting idea.

Edit: Interesting, if we increase memory pressure things start to change. For `f(26)` I see

``````
  11.236 s (134217730 allocations: 12.25 GiB) # iterate
   9.976 s (5 allocations: 576 bytes)         # inplaceiterate
``````

This is some really great input. Based on @goerch's benchmarks, it's possible that an in-place iterator for a powerset is only beneficial for larger sets. I'm guessing that's because you eventually hit garbage collection.

It looks like Combinatorics.jl uses a different approach to iterate through the powerset: it exploits the fact that chaining the combinations of every size gives the powerset:

``````julia
# from Combinatorics.powerset
itrs = [combinations(a, k) for k = min:max]
``````
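That chaining works because the fixed-size subsets for `k = 0:n` partition the powerset, so the counts add up to `2^n` (a quick Base-only sanity check):

``````julia
# The k-element subsets of an n-element set partition the powerset,
# hence the binomial coefficients sum to 2^n.
n = 5
@assert sum(binomial(n, k) for k in 0:n) == 2^n
``````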

Also worth noting that IterTools.jl has a method for statically sized subsets:

``````julia
using IterTools, Combinatorics
using BenchmarkTools

function f_iter(set)
    counter = 0
    for _ in IterTools.subsets(set, Val{2}())
        counter += 1
    end
    return counter
end

function f_combin(set)
    counter = 0
    for _ in Combinatorics.combinations(set, 2)
        counter += 1
    end
    return counter
end

@btime f_iter(1:20) #219.285 ns (0 allocations: 0 bytes)
@btime f_combin(1:20) #8.667 μs (381 allocations: 20.86 KiB)
``````

Ideally I feel like you could leverage these two ideas, but in practice it doesn’t seem to help that much.

``````julia
function f(n)
    counter = 0
    for iters in (subsets(1:n, Val{i}()) for i in 0:n)
        for sets in iters
            counter += 1
        end
    end
    return counter
end

@btime f(26) # 7.265 s (134217780 allocations: 21.26 GiB)
``````

Mostly because the type of `sets` changes with each value of `i`, which makes the loop type-unstable.
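A standard workaround for that kind of type instability is a function barrier: move the inner loop into its own function, so Julia compiles a specialized, type-stable method for each concrete iterator type. A generic sketch (not tied to IterTools; `count_iter` and `count_all` are names made up here):

``````julia
# The call into count_iter is dynamically dispatched once per iterator,
# but the hot loop inside it is type-stable for each concrete type.
function count_iter(itr)
    c = 0
    for _ in itr
        c += 1
    end
    return c
end

function count_all(iters)
    total = 0
    for itr in iters        # itr's type changes from element to element
        total += count_iter(itr)
    end
    return total
end
``````

For example, `count_all((1:3, "ab", (4, 5)))` returns `7`; applied to the generator of `subsets(1:n, Val{i}())` iterators, only the outer dispatch pays the dynamic cost.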


One remark (and/or question): I’m not sure if we hit GC with this problem size yet. Wouldn’t `@btime` show something like `gc time` then or is it different from `@time` regarding GC?

Edit: it doesn’t look like `@btime` is reporting `gc time` (just tested with a GC forcing example). Ouch.


Hm, I'm not sure. It looks like by default `@btime` does not display GC time:

``````
@btime f(26)
  10.134 s (134217730 allocations: 12.25 GiB)

@time f(26)
10.296567 seconds (134.22 M allocations: 12.250 GiB, 7.16% gc time)
``````

GC was actually one of my original concerns for the problem this question came from because I was running things in parallel.


Hm, sounds like it is time for me to switch to `@benchmark` then…

Thanks!
