Hard to beat Numba / JAX loop for generating Mandelbrot

julia> @benchmark run4(2000,3000)
BenchmarkTools.Trial: 187 samples with 1 evaluation.
 Range (min … max):  25.048 ms … 29.405 ms  ┊ GC (min … max): 0.00% … 2.95%
 Time  (median):     27.816 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   26.838 ms ±  1.716 ms  ┊ GC (mean ± σ):  2.25% ± 3.49%

  ▁█                                    ▅    ▄              ▅
  ██▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▄█▆▅▁▁▄▁▁▁▁▁▁▁▁▄█ ▄
  25 ms        Histogram: log(frequency) by time      29.3 ms <

 Memory estimate: 22.89 MiB, allocs estimate: 2.

julia> @benchmark run_julia(2000,3000)
BenchmarkTools.Trial: 19 samples with 1 evaluation.
 Range (min … max):  273.904 ms … 285.103 ms  ┊ GC (min … max): 1.36% … 0.26%
 Time  (median):     276.711 ms               ┊ GC (median):    1.35%
 Time  (mean ± σ):   277.268 ms ±   2.543 ms  ┊ GC (mean ± σ):  1.30% ± 0.25%

           ▃ ▃█     █ ▃
  ▇▁▁▁▁▁▁▁▁█▇██▁▁▇▁▇█▇█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  274 ms           Histogram: frequency by time          285 ms <

 Memory estimate: 68.66 MiB, allocs estimate: 4.

That is roughly a 10x speedup for run4 (median 27.8 ms vs. 276.7 ms), and this is single-threaded. Code:

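julia> macro vp(expr)
            # Loop metadata asking LLVM's vectorizer to use predicated (masked) vectorization
            nodes = (Symbol("llvm.loop.vectorize.predicate.enable"), 1)
            if expr.head != :for
                error("Syntax error: loopinfo needs a for loop")
            end
            # Append the metadata to the end of the loop body
            push!(expr.args[2].args, Expr(:loopinfo, nodes))
            return esc(expr)
       end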

julia> function run4(height, width)
           y = range(-1.0f0, 0.0f0; length = height)
           x = range(-1.5f0, 0.0f0; length = width)
           # Points that never escape keep the maximum count of 20
           fractal = fill(Int32(20), height, width)
           @inbounds @fastmath for w in 1:width
               @vp for h in 1:height
                   z_re = _c_re = x[w]
                   z_im = _c_im = y[h]
                   m = true  # true while the point has not escaped yet
                   # Unroll all 20 iterations; no early exit, so the body stays branch-free
                   Base.Cartesian.@nexprs 20 i -> begin
                       z_re,z_im = _c_re + z_re*z_re - z_im*z_im, _c_im + 2*z_re*z_im
                       az4 = (z_re*z_re + z_im*z_im) > 4f0
                       # Store i only on the first iteration where |z|^2 exceeds 4
                       fractal[h, w] = ifelse(m & az4,i%Int32,fractal[h,w])
                       m &= (!az4)
                   end
               end
           end
           return fractal
       end

This also works quite well for sizes that do not divide evenly by the vector length:

julia> @benchmark run_julia(10,10)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.316 μs … 189.292 μs  ┊ GC (min … max): 0.00% … 95.85%
 Time  (median):     3.414 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.580 μs ±   2.479 μs  ┊ GC (mean ± σ):  0.95% ±  1.36%

   ▅ ▆█                ▁
  ▅█▆██▇▄▃▂▂▂▂▂▂▄▆▃▃▂▃▇█▆▄▃▃▃▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  3.32 μs         Histogram: frequency by time        4.46 μs <

 Memory estimate: 1.36 KiB, allocs estimate: 2.

julia> @benchmark run4(10,10)
BenchmarkTools.Trial: 10000 samples with 57 evaluations.
 Range (min … max):  896.930 ns …  23.460 μs  ┊ GC (min … max): 0.00% … 92.93%
 Time  (median):     905.561 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   936.896 ns ± 423.935 ns  ┊ GC (mean ± σ):  0.97% ±  2.06%

  ▅██▅▄▃   ▁▃▃▁                ▃▃▁ ▂▄▂▁▁▁                       ▂
  ████████▇█████▆▅▅▃▃▃▃▁▃▄▁▆█▆▄█████████████▇▇▅▆▄▅▅▅▃▄▃▄▅▅▄▆▆▇▇ █
  897 ns        Histogram: log(frequency) by time       1.12 μs <

 Memory estimate: 496 bytes, allocs estimate: 1.

julia> @benchmark run_julia(16,16)
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):  8.377 μs … 960.083 μs  ┊ GC (min … max): 0.00% … 93.93%
 Time  (median):     8.663 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.933 μs ±   9.612 μs  ┊ GC (mean ± σ):  1.01% ±  0.94%

     █▃   ▁▁
  ▁▂████▅▇██▆▄▃▂▂▂▂▂▄▆▇▆▆▅▄▄▃▃▂▂▂▂▂▂▂▂▂▁▁▁▁▂▁▁▁▁▁▁▁▂▂▂▂▂▂▂▁▁▁ ▂
  8.38 μs         Histogram: frequency by time        9.96 μs <

 Memory estimate: 3.19 KiB, allocs estimate: 2.

julia> @benchmark run4(16,16)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.236 μs … 147.251 μs  ┊ GC (min … max): 0.00% … 97.03%
 Time  (median):     1.306 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.438 μs ±   1.935 μs  ┊ GC (mean ± σ):  1.86% ±  1.38%

     ▅█▆▃▂▂▁      ▅▄▁    ▅▆▃▂   ▃▃▂▁                          ▂
  ▄▁▁███████▇▅▄▁▁▁███▆▇▅▇███████████████▇▇▆▆▆▆▆▅▆▅▆▆▆▆▅▅▄▅▃▄▅ █
  1.24 μs      Histogram: log(frequency) by time         2 μs <

 Memory estimate: 1.06 KiB, allocs estimate: 1.
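
To check that the unrolled, branch-free loop in run4 still produces the expected iteration counts, a plain scalar version can serve as a reference. This is only a sketch: `mandel_reference` and its `maxiter` keyword are hypothetical names that do not appear above, and the logic is my reading of what run4 computes (first iteration at which |z|^2 exceeds 4, otherwise 20).

```julia
# Hypothetical scalar reference for run4 (not from the original post).
# Same grid, same starting point z = c, same escape test, but with an early exit.
function mandel_reference(height, width; maxiter = 20)
    y = range(-1.0f0, 0.0f0; length = height)
    x = range(-1.5f0, 0.0f0; length = width)
    # Points that never escape keep the maximum count
    fractal = fill(Int32(maxiter), height, width)
    for w in 1:width, h in 1:height
        c = complex(x[w], y[h])
        z = c
        for i in 1:maxiter
            z = z * z + c
            if abs2(z) > 4f0
                fractal[h, w] = i   # first iteration at which the point escapes
                break
            end
        end
    end
    return fractal
end
```

With run4 defined as above, `count(run4(200, 300) .!= mandel_reference(200, 300))` gives the number of differing pixels. Since run4 uses `@fastmath`, a few pixels right at the escape boundary may differ, so a small nonzero count is not necessarily a bug.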