I currently have the following line of code in a simulation:
function exchangeBits(n::BitVector, i::Int, j::Int)
# Returns a copy of n with the ith and jth bits exchanged
if i < j
m = n[vcat(1:(i - 1), j, i + 1:(j - 1), i, j + 1:end)]
else
m = n[vcat(1:(j - 1), i, j + 1:(i - 1), j, i + 1:end)]
end
return m
end
Running the full simulation through @profview
gives a Flamegraph with a substantial amount of space taken up by this function, and it looks like the following:
The way I understand this is that the function, in particular the usage of vcat
, is somehow not type-stable and that I should find an alternative method. Is this correct? If so, what exactly is the problem with vcat
in this context?
If I inspect the @code_warntype
it claims to be type stable. But from the @btime
below, I’d believe it’s actually unstable like the profiler suggests. EDIT: The internals of vcat
appear to be unstable in this situation. It looks like a loop in LinearAlgebra._vcat
is to blame.
In any case, I’ll recommend this much faster version:
function exchangeBits2(n::BitVector, i::Int, j::Int)
m = copy(n)
m[j],m[i] = m[i],m[j] # swap entries i and j
return m
end
julia> using BenchmarkTools
julia> z = rand(500) .< 0.5;
julia> @btime exchangeBits($z,$10,$20);
6.440 μs (108 allocations: 7.64 KiB)
julia> @btime exchangeBits2($z,$10,$20);
39.697 ns (2 allocations: 160 bytes)
Dispatch aside, the big issue with the vcat
-indexing version is that vcat
builds a new array. You then use this new array to index n
to make yet-another array. This is terribly wasteful. It’s also expensive to index a BitVector
this way because of how it’s represented in memory (as individual bits, which take some effort to extract from the bytes that store them). Copying the whole thing and making the two little changes you want is much easier.
Opened as issue #48850. That issue currently can be avoided via this definition that replaces i
and j
in the vcat
calls with i:i
and j:j
, so that there aren’t mixtures of Number
and UnitRange
arguments:
function exchangeBits1(n::BitVector, i::Int, j::Int)
# Returns a copy of n with the ith and jth bits exchanged
if i < j
m = n[vcat(1:(i - 1), j:j, i + 1:(j - 1), i:i, j + 1:end)]
else
m = n[vcat(1:(j - 1), i:i, j + 1:(i - 1), j:j, i + 1:end)]
end
return m
end
julia> @btime exchangeBits1($z,$10,$20);
1.240 μs (8 allocations: 4.38 KiB)
Note that this is still much slower than the suggested exchangeBits2
given above.
1 Like