The builtin function iszero
for BitVector seems unreasonably slow:
julia> a = BitVector(fill(0,10000));
julia> @benchmark iszero($a)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
Range (min … max): 3.750 μs … 8.588 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.775 μs ┊ GC (median): 0.00%
Time (mean ± σ): 3.806 μs ± 95.423 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
▂▁█▁█▁▁▅▁█▁▁▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▆▁▂▁▁▁▁▁▁▁▂▁▁▁▁▁▁▂▁▂▁▁▂▁▂▁▂ ▂
3.75 μs Histogram: frequency by time 4.05 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
I implemented my own version of this
is_zero(a::BitVector) = !any(a)
which is about 100x faster:
julia> using FastDifferentiation #has my implementation of is_zero for BitVector
Precompiling FastDifferentiation...
1 dependency successfully precompiled in 2 seconds. 47 already precompiled.
julia> @benchmark FastDifferentiation.is_zero($a)
BenchmarkTools.Trial: 10000 samples with 993 evaluations.
Range (min … max): 35.549 ns … 74.119 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 35.750 ns ┊ GC (median): 0.00%
Time (mean ± σ): 36.031 ns ± 0.875 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▇█▁▄▃▁▁▂▁▂▁▁▁▁▁▁▁▁▁▁▅▁▃▁▁▁▁▁▁▁▂▂▁▂▂▁▂▂▂▁▂▁▂▁▂▁▁▁▂▁▁▁▂▁▁▁▂▂ ▂
35.5 ns Histogram: frequency by time 39.9 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
Am I making a benchmarking error? Or is iszero(a::BitVector)
slower than it should be? I’d rather use the builtin iszero
than my own, makes life simpler and less error prone.
Here’s my versioninfo:
Julia Version 1.11.0-rc2
Commit 34c3a63147 (2024-07-29 06:24 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 32 × AMD Ryzen 9 7950X 16-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
Threads: 1 default, 0 interactive, 1 GC (on 32 virtual cores)
Environment:
JULIA_NUM_THREADS = 1
JULIA_EDITOR = code.cmd
Results are the same for 1.10 so this doesn’t seem to be a performance regression:
julia> a = BitVector(fill(0,10000));
julia> @benchmark iszero($a)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
Range (min … max): 3.750 μs … 6.800 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.788 μs ┊ GC (median): 0.00%
Time (mean ± σ): 3.810 μs ± 103.563 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ █
▂██▁██▁▃▃▂▁▂▂▁▂▂▄▁▃▂▁▂▂▂▁▂▂▁▂▂▂▁▂▂▁▂▂▂▁▂▂▁▂▂▂▁▂▂▁▂▂▂▁▂▂▁▂▂▂ ▂
3.75 μs Histogram: frequency by time 4.28 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd4843 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 32 × AMD Ryzen 9 7950X 16-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 16 default, 0 interactive, 8 GC (on 32 virtual cores)
Environment:
JULIA_EDITOR = code.cmd
JULIA_NUM_THREADS = 16