Consider the following two loops. The first computes its weight through a helper function that itself takes a function as an argument; the second writes the same computation out explicitly. The first allocates a significant amount of memory and is slower by a factor of about 5:
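    using Random, BenchmarkTools

    # Boltzmann weight exp(-V(x)) for a potential V
    function Boltzmann(x, V)
        w = exp(-V(x))
        return w
    end

    # Version 1: the weight is computed through the helper Boltzmann
    function test_loop1(n, V)
        avg = 0.0
        for j in 1:n
            x = randn()
            L = Boltzmann(x, V)
            b = max(1, L)
            avg += b / n
        end
        return avg
    end

    # Version 2: the same computation written out inline
    function test_loop2(n, V)
        avg = 0.0
        for j in 1:n
            x = randn()
            L = exp(-V(x))
            b = max(1, L)
            avg += b / n
        end
        return avg
    end

    # Quadratic potential used for the benchmark
    function U(x)
        return 0.5 * x * x
    end

    Random.seed!(100)
    @btime test_loop1(10^4, U)
      1.525 ms (60000 allocations: 937.50 KiB)
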
while

    Random.seed!(100)
    @btime test_loop2(10^4, U)
      278.115 μs (0 allocations: 0 bytes)

This is a fairly trivial example, and writing out exp(-V(x)) by hand here is no great burden. In other problems, however, the intermediate computation is messier, and it would be nice to encapsulate it in a function without taking such a hit. How should I understand this behavior, and how can I improve on it?
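
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the Julia performance tips note that, as a heuristic, Julia does not automatically specialize a method on a Function argument when that argument is only passed through to another function rather than called. In test_loop1, V is only forwarded to Boltzmann, whereas test_loop2 calls V directly. Assuming that is the cause here, annotating the argument with a type parameter should force specialization. The name test_loop1_specialized below is hypothetical, and the check uses Base's @allocated instead of @btime so that it needs no extra packages.

```julia
# Same helper and potential as in the question, written compactly.
Boltzmann(x, V) = exp(-V(x))
U(x) = 0.5 * x * x

# Hypothetical variant of test_loop1. The type parameter F asks Julia to
# compile a specialized method for each concrete function type passed as V,
# even though V is only forwarded to Boltzmann and never called here.
function test_loop1_specialized(n, V::F) where {F}
    avg = 0.0
    for j in 1:n
        x = randn()
        L = Boltzmann(x, V)
        b = max(1, L)
        avg += b / n
    end
    return avg
end

# Run once so compilation is not counted, then measure allocated bytes.
test_loop1_specialized(10, U)
bytes = @allocated test_loop1_specialized(10^4, U)
println("bytes allocated: ", bytes)
```

If the specialization heuristic is indeed responsible, the reported allocation count should drop to roughly that of test_loop2, and the helper can stay in its own function.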