I have a BitVector, and I would like to be able to iterate over the indexes in the vector that contain a 1/true, without any allocations. Here’s my attempt - I’m still getting one allocation.
bits = BitVector([0,1,1,0,1])
dest = zeros(Int64, 5)
function indexes(a::BitVector, b::Vector{Int64})
for idx in (1:length(a))[a]
# do something so it doesn't get optimized away
# what I am actually doing is more complicated than this
b[idx] += 1
end
return b
end
import BenchmarkTools
BenchmarkTools.@btime indexes($bits, $dest);
# 48.938 ns (1 allocation: 80 bytes)
@view(s) doesn’t seem to help either of the two ways I guessed to try it:
function indexes_views1(a::BitVector, b::Vector{Int64})
@views for idx in (1:length(a))[a]
# do something so it doesn't get optimized away
# what I am actually doing is more complicated than this
b[idx] += 1
end
return b
end
function indexes_views2(a::BitVector, b::Vector{Int64})
b[@views (1:length(a))[a]] .+= 1
return b
end
BenchmarkTools.@btime indexes_views1($bits, $dest);
# 47.576 ns (1 allocation: 80 bytes)
BenchmarkTools.@btime indexes_views2($bits, $dest);
# 92.894 ns (2 allocations: 160 bytes)
Even directly indexing with a into b incurs allocations:
function indexes_direct(a::BitVector, b::Vector{Int64})
@views b[a] .+= 1
return b
end
BenchmarkTools.@btime indexes_direct($bits, $dest);
# 110.278 ns (2 allocations: 160 bytes)
On my machine the daisy-chaining solution seems to be the fastest, and zero allocations!
function indexes_chained(a::BitVector, b::Vector{Int64})
for idx in Iterators.map(first,Iterators.filter(last,enumerate(a)))
b[idx] += 1
end
return b
end
BenchmarkTools.@btime indexes_chained($bits, $dest);
# 8.000 ns (0 allocations: 0 bytes)
Thanks a lot for this explanation, especially for the walk through on how to construct a custom iterator. Very insightful, and hopefully useful to other newish users.