I'm writing some text explaining the importance of having type-stable code in Julia, and I made a simple example using non-constant global variables:
y = 2.4
function bar(x)
    res = zero(x)
    for i in 1:1000
        res += y * x
    end
    return res
end
and the benchmark returned
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 18.333 μs … 22.394 ms ┊ GC (min … max): 0.00% … 99.79%
Time (median): 28.462 μs ┊ GC (median): 0.00%
Time (mean ± σ): 28.535 μs ± 224.046 μs ┊ GC (mean ± σ): 8.26% ± 2.05%
▂█▇▅▃ ▄▅▅▅▂
▂▂▃▃▂▃▄██████▄▃▂▂▂▂▂▂▂▂▂▂▂▁▂▂▁▁▁▁▁▁▁▂▂▂▃▄▆██████▅▃▃▂▂▂▂▂▂▂▂▂ ▃
18.3 μs Histogram: frequency by time 34 μs <
Memory estimate: 46.88 KiB, allocs estimate: 3000.
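For reference, the instability shows up directly in @code_warntype (the argument 1.0 is just an arbitrary value picked for illustration):
using InteractiveUtils   # @code_warntype lives here when used outside the REPL

@code_warntype bar(1.0)
# the non-const global y is inferred as Any, so res and the return value
# are flagged as Any as well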
While for the type-stable case (using const):
const z = 2.4
function bar(x)
    res = zero(x)
    for i in 1:1000
        res += z * x
    end
    return res
end
And the benchmark returned
BenchmarkTools.Trial: 10000 samples with 235 evaluations.
Range (min … max): 317.043 ns … 408.153 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 318.221 ns ┊ GC (median): 0.00%
Time (mean ± σ): 319.706 ns ± 5.002 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▂▁█▅ ▃▂▁ ▂▁ ▂ ▁
█████▅▁▃▃▄▃▃▆▇█████▇▇▆▆▆▆▅▅▄▁▅▃▄▄▄▃▄▁▄▄▄▁▃▃▅▄▄▄▅▃▃▁▄▄▄▅▅▅██▇█ █
317 ns Histogram: log(frequency) by time 337 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
This is a good result.
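For comparison, running the same check on the const version reports concrete types everywhere (again with an arbitrary argument):
@code_warntype bar(1.0)
# z is a constant Float64, so z * x, res, and the return value are all
# inferred as Float64 and the loop compiles to allocation-free code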
Now I want to extend the demonstration to a more complicated function involving structs and LinearAlgebra operations on them. However, I noticed that the performance difference shrinks so much that the two approaches become essentially equivalent. Here is a very simple example where I just invert a global matrix.
using BenchmarkTools, LinearAlgebra

test_bad = rand(100, 100);
function test_inv_bad()
    return inv(test_bad)
end
@benchmark test_inv_bad()
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 172.951 μs … 12.548 ms ┊ GC (min … max): 0.00% … 98.00%
Time (median): 244.284 μs ┊ GC (median): 0.00%
Time (mean ± σ): 259.085 μs ± 147.885 μs ┊ GC (mean ± σ): 1.13% ± 2.99%
▆▂ ▁█▃
▃▄▂▂▂▁▁▁▁▁▁▁▁▂▂▃██▅▃▂▂▂▂▄▇▅▇▄▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃███▄▂▂▂▂▂▃▂▂▂▂▂▂ ▃
173 μs Histogram: frequency by time 330 μs <
Memory estimate: 129.17 KiB, allocs estimate: 5.
and
const test = rand(100, 100);
function test_inv()
    return inv(test)
end
@benchmark test_inv()
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 198.517 μs … 12.616 ms ┊ GC (min … max): 0.00% … 97.69%
Time (median): 213.271 μs ┊ GC (median): 0.00%
Time (mean ± σ): 228.217 μs ± 142.781 μs ┊ GC (mean ± σ): 1.12% ± 2.82%
▅█▃
▁▁▁▂▂▂▁▄███▇▄▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▂▃▃▂▂▃▃▂▂▁▁▁▁▁▁▁ ▂
199 μs Histogram: frequency by time 286 μs <
Memory estimate: 129.17 KiB, allocs estimate: 5.
I guess that there is a single runtime dispatch before calling inv, after which the compiler knows the concrete type of the matrix to invert and can run optimized code from that point onwards.
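To make that guess concrete, here is a small diagnostic sketch (it assumes test_bad and test_inv_bad are defined exactly as above and adds no new timing data):
@code_warntype test_inv_bad()
# test_bad is a non-const global, so it is inferred as Any and inv is
# reached through a single dynamic dispatch; that one-time cost is tiny
# compared to the O(n^3) work inv does on a 100x100 matrix, which is why
# the two benchmarks look almost identical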
But then, how can I show the loss of performance caused by type-unstable functions when the heavy work is done by LinearAlgebra routines?