Consider the following two loops. The first calls a helper function that takes another function as an argument; the second writes the same computation out explicitly. The first version allocates a significant amount of memory and is slower by a factor of about 5:
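    using Random, BenchmarkTools

    # Boltzmann weight exp(-V(x)) for a potential V
    function Boltzmann(x, V)
        w = exp(-V(x))
        return w
    end

    # Version 1: the weight is computed through the helper Boltzmann
    function test_loop1(n, V)
        avg = 0.0
        for j in 1:n
            x = randn()
            L = Boltzmann(x, V)
            b = max(1, L)
            avg += b / n
        end
        return avg
    end

    # Version 2: the weight is computed inline
    function test_loop2(n, V)
        avg = 0.0
        for j in 1:n
            x = randn()
            L = exp(-V(x))
            b = max(1, L)
            avg += b / n
        end
        return avg
    end

    # Quadratic potential used in the benchmarks
    function U(x)
        return 0.5 * x * x
    end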
    Random.seed!(100)
    @btime test_loop1(10^4,U)
      1.525 ms (60000 allocations: 937.50 KiB)

    Random.seed!(100)
    @btime test_loop2(10^4,U)
      278.115 μs (0 allocations: 0 bytes)
This is a fairly trivial example, and it would not be terrible to write out exp(-V(x)) by hand. There are, however, other problems where the intermediate computation is messier, and it would be nice to encapsulate it in a function without taking such a hit. How should I understand this behavior? How can I improve upon it?
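
One possible reading, offered as a sketch rather than a confirmed diagnosis: the Julia manual's performance tips note that the compiler may skip specializing a method on a `Function` argument when that argument is only passed through to another call and never called in the method body itself. That is the situation of `V` in `test_loop1`, which hands `V` to `Boltzmann` without calling it. Assuming this is the cause here, adding a type parameter to the signature should force specialization. The name `test_loop1_spec` is hypothetical and introduced only for this illustration.

```julia
using Random

# Same helper and potential as in the question, written in short form
Boltzmann(x, V) = exp(-V(x))
U(x) = 0.5 * x * x

# Hypothetical variant of test_loop1: the `where {F}` annotation asks Julia
# to compile a specialized method for each concrete function type passed as V,
# even though V is only forwarded to Boltzmann and not called here.
function test_loop1_spec(n, V::F) where {F}
    avg = 0.0
    for j in 1:n
        x = randn()
        L = Boltzmann(x, V)
        b = max(1, L)
        avg += b / n
    end
    return avg
end

Random.seed!(100)
test_loop1_spec(10^4, U)
```

Timing this with `@btime test_loop1_spec(10^4, $U)` would show whether the allocations disappear. No timings are claimed here, since the outcome depends on whether the pass-through heuristic is in fact what causes the slowdown.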