Check if vector lies between two vectors element-wise, allocation free

I found some time to profile my code with the intention of reducing memory use and number of allocations. A quick check revealed that the function check_inequalities was the main culprit for the allocations: in my bigger piece of code, I first had a Memory estimate: 1.81 GiB, allocs estimate: 17908475. , which then reduced to a Memory estimate: 1.36 GiB, allocs estimate: 2907475. when I replaced the function by check_inequalities_efficient. Still a lot to do of course, but this is a factor 6 reduction in allocations for a simple one-line change!

I was just wondering if that function is the idiomatic Julia way of doing things, or if there is a more efficient method/idiomatic way of getting an allocation free version of check_inequalities?

using BenchmarkTools, Random
Random.seed!(42)

struct RectangleSpace{T<:Real, U<:Real}
    lower_bounds::Vector{T}
    upper_bounds::Vector{U}
end

check_inequalities(x::Vector{Float64}, rectangle::RectangleSpace) = all(rectangle.lower_bounds .<= x .<= rectangle.upper_bounds)
check_inequalities_efficient(x::Vector{Float64}, rectangle::RectangleSpace) = all(y -> rectangle.lower_bounds[y[1]] <= y[2] <= rectangle.upper_bounds[y[1]], enumerate(x))

function RunBench()
    X = rand(5)
    rectangle = RectangleSpace(fill(0.0, 5), fill(0.5, 5))
    bench = @benchmark check_inequalities($X, $rectangle)
    display(bench)
    bench2 = @benchmark check_inequalities_efficient($X, $rectangle)
    display(bench2)
end

RunBench()

gives

BenchmarkTools.Trial: 10000 samples with 991 evaluations.
 Range (min … max):  41.120 ns …  2.808 ΞΌs  β”Š GC (min … max): 0.00% … 97.79%
 Time  (median):     41.835 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   46.299 ns Β± 73.889 ns  β”Š GC (mean Β± Οƒ):  7.60% Β±  4.78%

  β–„β–‡β–ˆβ–†β–„β–ƒβ–‚β–‚β–   ▁▁▂▂▁                                           ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–†β–†β–‡β–†β–†β–†β–…β–†β–†β–†β–†β–…β–…β–…β–…β–…β–…β–†β–†β–…β–†β–†β–„β–…β–„β–„β–…β–„β–ƒβ–„β–‚β–… β–ˆ
  41.1 ns      Histogram: log(frequency) by time      56.9 ns <

 Memory estimate: 96 bytes, allocs estimate: 3.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.834 ns … 23.917 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.958 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.978 ns Β±  0.278 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

   ▁ β–‡ β–ˆβ–‡β–ƒ β–ƒ                                                 β–‚
  β–ƒβ–ˆβ–β–ˆβ–β–ˆβ–ˆβ–ˆβ–β–ˆβ–β–‡β–‡β–ˆβ–β–‡β–β–„β–…β–†β–β–†β–β–…β–„β–„β–β–„β–β–ƒβ–β–„β–β–…β–β–ƒβ–ƒβ–„β–β–β–β–„β–ƒβ–„β–β–…β–β–„β–„β–…β–β–‡β–β–β–„β–†β–β–ˆ β–ˆ
  2.83 ns      Histogram: log(frequency) by time     4.04 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Your check_inequalities_efficient looks fine to me. Also note that it will return false from the moment it encounters one inequality which is not satisfied, whereas check_inequalities will first create a BitVector potentially full of falses, and then return false when encountering the first false within. I.e. the second version is more efficient not only because we get rid of the allocations, but also because we perform fewer comparisons.

I’m personally not too fond of the indexing y[2], so here’s an alternative using an iterator

check_inequalities_efficient_zip(x::Vector{Float64}, rectangle::RectangleSpace) = all(lb <= v <= ub for (lb, v, ub) in zip(rectangle.lower_bounds, x, rectangle.upper_bounds))
julia> RunBench()
BenchmarkTools.Trial: 10000 samples with 985 evaluations.
 Range (min … max):  56.142 ns …  3.489 ΞΌs  β”Š GC (min … max): 0.00% … 96.77%
 Time  (median):     57.056 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   65.460 ns Β± 86.602 ns  β”Š GC (mean Β± Οƒ):  8.44% Β±  6.47%

  β–ˆβ–…β–                    ▃▁                                   ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–‡β–†β–‡β–‡β–‡β–„β–†β–„β–„β–…β–…β–ƒβ–„β–ƒβ–„β–„β–ƒβ–†β–ˆβ–ˆβ–ˆβ–†β–†β–…β–…β–„β–…β–…β–β–„β–„β–ƒβ–ƒβ–„β–„β–ƒβ–ƒβ–„β–„β–ƒβ–„β–ƒβ–β–…β–ƒβ–„β–„β–ƒβ–ƒβ–β–„β–β–ƒβ–…β–„ β–ˆ
  56.1 ns      Histogram: log(frequency) by time       146 ns <

 Memory estimate: 96 bytes, allocs estimate: 3.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.300 ns … 65.200 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     4.300 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.426 ns Β±  1.129 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ˆ
  β–ˆβ–ˆβ–‚β–β–‚β–β–β–β–β–‚β–‚β–‚β–β–‚β–‚β–‚β–‚β–‚β–‚β–β–‚β–β–‚β–β–β–β–‚β–β–β–β–β–‚β–β–β–β–β–β–β–β–β–β–β–‚β–‚β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–‚ β–‚
  4.3 ns         Histogram: frequency by time         9.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  2.600 ns … 13.800 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     2.700 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   2.800 ns Β±  0.665 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ƒβ–ˆβ–ƒ                                                        ▁
  β–ˆβ–ˆβ–ˆβ–β–„β–ƒβ–β–β–„β–„β–β–„β–„β–β–†β–„β–β–β–ƒβ–„β–β–ƒβ–ƒβ–β–ƒβ–β–β–β–ƒβ–β–β–ƒβ–β–β–β–β–„β–β–β–„β–β–…β–‡β–…β–β–„β–β–β–„β–ƒβ–β–β–ƒβ–β–β–β–ƒβ– β–ˆ
  2.6 ns       Histogram: log(frequency) by time      6.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

(check_inequalities, check_inequalities_efficient, check_inequalities_efficient_zip)

3 Likes

Thank you, that’s even quicker indeed, and certainly more elegant!

1 Like