Iterator for indexes of true values in BitVector without allocations

I have a BitVector, and I would like to be able to iterate over the indexes in the vector that contain a 1/true, without any allocations. Here’s my attempt - I’m still getting one allocation.

bits = BitVector([0,1,1,0,1])
dest = zeros(Int64, 5)

function indexes(a::BitVector, b::Vector{Int64})
	for idx in (1:length(a))[a]
		# do something so it doesn't get optimized away
		# what I am actually doing is more complicated than this
		b[idx] += 1
	end
	return b
end

import BenchmarkTools
BenchmarkTools.@btime indexes($bits, $dest);
# 48.938 ns (1 allocation: 80 bytes)

(1:length(a))[a] is an allocation. Does @views help?

@Oscar_Smith, do you mean like: b[@views (1:length(a))[a]]?

I can do this daisy-chaining a few Iterators constructs:

julia> b = rand(100) .< 0.05; findall(b)
4-element Vector{Int64}:
 12
 63
 70
 85

julia> Iterators.map(first,Iterators.filter(last,enumerate(b))) |> collect
4-element Vector{Int64}:
 12
 63
 70
 85

But note that (depending on your setting) this may not be faster than a loop over the (allocated) result of findall:

julia> using BenchmarkTools

julia> @btime Iterators.map(first,Iterators.filter(last,enumerate($b))) |> sum
  100.530 ns (0 allocations: 0 bytes)
230

julia> @btime findall($b) |> sum
  38.004 ns (1 allocation: 96 bytes)
230

I think your most performant option would be to make something using findnext. Like

i = findfirst(b)
while !isnothing(i)
	@show i
	i = findnext(b,i+oneunit(i))
end

You could package this into an iterator if you really wanted.

EDIT: here it is:

struct Findall{I}
	itr::I
end
Base.IteratorSize(::Findall) = Base.SizeUnknown()
Base.eltype(x::Findall) = typeof(firstindex(x.itr))
function Base.iterate(x::Findall, current=firstindex(x.itr))
	next = findnext(x.itr, current)
	isnothing(next) && return nothing
	return (next, next + oneunit(next))
end
julia> Findall(b) |> collect
4-element Vector{Int64}:
 12
 63
 70
 85

julia> @btime Findall($b) |> sum
  25.904 ns (0 allocations: 0 bytes)
230

You could easily add a predicate function, too.

3 Likes

@view(s) doesn’t seem to help either of the two ways I guessed to try it:


function indexes_views1(a::BitVector, b::Vector{Int64})
	@views for idx in (1:length(a))[a]
		# do something so it doesn't get optimized away
		# what I am actually doing is more complicated than this
		b[idx] += 1
	end
	return b
end

function indexes_views2(a::BitVector, b::Vector{Int64})
	b[@views (1:length(a))[a]] .+= 1
	return b
end


BenchmarkTools.@btime indexes_views1($bits, $dest);
#  47.576 ns (1 allocation: 80 bytes)
BenchmarkTools.@btime indexes_views2($bits, $dest);
#  92.894 ns (2 allocations: 160 bytes)

Even directly indexing with a into b incurs allocations:

function indexes_direct(a::BitVector, b::Vector{Int64})
	@views b[a] .+= 1
	return b
end
BenchmarkTools.@btime indexes_direct($bits, $dest);
# 110.278 ns (2 allocations: 160 bytes)

On my machine the daisy-chaining solution seems to be the fastest, and zero allocations!

function indexes_chained(a::BitVector, b::Vector{Int64})
	for idx in Iterators.map(first,Iterators.filter(last,enumerate(a)))
		b[idx] += 1
	end
	return b
end

BenchmarkTools.@btime indexes_chained($bits, $dest);
# 8.000 ns (0 allocations: 0 bytes)

Thanks a lot for this explanation, especially for the walk through on how to construct a custom iterator. Very insightful, and hopefully useful to other newish users.