Getting Varargs typestable

I find myself in the situation of having to evaluate some boolean circuits on a lot of inputs. Well, this is what bitvector is supposed to be good at, but it currently accepts a very limited number of hand-coded circuits in bit_map!. So, to evaluate a very shallow circuit, one would try to do:

const _msk64 = ~UInt64(0)
@inline _div64(l) = l >>> 6
@inline _mod64(l) = l & 63
@inline _msk_end(l::Integer) = _msk64 >>> _mod64(-l)
@inline _msk_end(B::BitArray) = _msk_end(length(B))

#wants:@inline @propagate_inbounds 
@inline function bad_bit_map!(f::F, dest::BitVector, sources::Vararg{BitVector, N}) where {F,N}

    all( Tuple{Bool}(size(dest)==size(source) for source in sources) ) ||
        throw(DimensionMismatch("sizes of dest, source in sources must all match"))
    (isempty(sources) || size(dest)==0) && return dest

      for i in 1:(length(destc)-1)
        dest.chunks[i] = f( Tuple(source.chunks[i]::UInt64 for source::BitVector in sources)...  )::UInt64
    end
    dest.chunks[end] = f( Tuple(source.chunks[end] for source in sources)...  ) & _msk_end(dest)
    dest::BitVector
    
end

Unfortunately, I failed to get the bad_bitmap typestable. Just for convenience, the paraphrased type-stable code from bitarray.jl:

@inline function std_bit_map!(f::F, dest::BitArray, A::BitArray, B::BitArray) where F
    size(A) == size(B) == size(dest) || throw(DimensionMismatch("sizes of dest, A, and B must all match"))
    isempty(dest) && return dest
    for i = 1:(length(dest.chunks)-1)
        dest.chunks[i] = f(A.chunks[i], B.chunks[i])
    end
    dest.chunks[end] = f(A.chunks[end], B.chunks[end]) & _msk_end(dest)
    dest
end

So, what is the right way?

As far as I know there is currently no way to type-assert f::(Varargs{Int64, N}-> Int64), which might be needed for type inference?

Or is the problem in the tuple unpacking and repacking? Is there a reasonable allocation-free type-stable way of unpacking Vararg-tuples, doing something with them, repackaging them and calling something? In principle, the compiler should know that the tuple unpacking and repacking can be done on the stack.

I haven’t parsed your whole example, but can you rewrite what you’re doing in terms of broadcasting? Julia already knows how to broadcast over heterogeneous tuples without allocating:

julia> using BenchmarkTools

julia> x = (1, 2.0, 3, 4.0)
(1, 2.0, 3, 4.0)

julia> @btime 2 .* $x
  1.970 ns (0 allocations: 0 bytes)
(2, 4.0, 6, 8.0)

julia> @btime sum(2 .* $x)
  2.339 ns (0 allocations: 0 bytes)
20.0

julia> @btime sum(2 .* Base.tail($x))
  1.694 ns (0 allocations: 0 bytes)
18.0

julia> using Base.Test

julia> @inferred sum(2 .* Base.tail(x))
18.0

Ok, you’re right, my application broadcasts correctly over int64, so I’ll just drop the BitArrays, which are a convenience wrapper over Vector{Int} anyway.

After reading broadcast.jl, I see that Varargs apparently need to be handled by generated functions. I still don’t understand why type inference failed for my code, but at least my immediate problem is solved, and I now know where to look for examples how this is done right.

Thank you and sorry for the naive question!

1 Like