Functions of Functions Performance in a Loop

The two loops below compute the same thing. In the first, the function argument V is passed on to a helper function (a function of a function); in the second, the computation is written out explicitly. The first allocates a significant amount of memory and is slower by a factor of about 5:

function Boltzmann(x,V)
    w = exp(-V(x));
    return w;
end

function test_loop1(n,V)
    avg = 0.
    for j in 1:n
        x = randn();
        L = Boltzmann(x,V);
        b = max(1, L)
        avg +=b/n;
    end
    
    return avg;
end

function test_loop2(n,V)
    avg = 0.
    for j in 1:n
        x = randn();
        L = exp(-V(x));
        b = max(1, L)
        avg +=b/n;
    end
    
    return avg;
end


function U(x)
   return 0.5 * x * x; 
end
using BenchmarkTools, Random  # needed for @btime and Random.seed!
Random.seed!(100)
@btime test_loop1(10^4,U)
  1.525 ms (60000 allocations: 937.50 KiB)

while

Random.seed!(100)
@btime test_loop2(10^4,U)
  278.115 μs (0 allocations: 0 bytes)

This is a fairly trivial example, and having to write out exp(-V(x)) by hand is not a big deal here. There are, however, other problems where the intermediate computation is messier, and it would be nice to encapsulate it in a function without taking such a hit. How should I understand this behavior, and how can I improve on it?


I would try

function test_loop3(n,V::F) where {F}
    avg = 0.
    for j in 1:n
        x = randn();
        L = Boltzmann(x,V);
        b = max(1, L)
        avg +=b/n;
    end
    
    return avg;
end

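The compiler does not always specialize a method on a function-typed argument. AFAICT the heuristic is whether the argument is called in the method body: test_loop2 calls V(x) directly, so it gets specialized on U, whereas test_loop1 only passes V through to Boltzmann, so V is treated as a generic Function and each Boltzmann call is dispatched at runtime, which is most likely where the allocations come from. With the above trick (the type parameter F) you force specialization anyway.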
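To see the effect in isolation, here is a minimal sketch that compares a pass-through method with and without the forced type parameter. The names weight, passthrough_plain and passthrough_forced are made up for this illustration and are not from the original post. The allocation numbers depend on the Julia version, so none are quoted here.

```julia
using Random

# Hypothetical helper, standing in for Boltzmann from the question.
weight(x, V) = exp(-V(x))

# V is only forwarded to `weight`, never called in this body,
# so Julia may decide not to specialize this method on typeof(V).
function passthrough_plain(n, V)
    acc = 0.0
    for _ in 1:n
        acc += weight(randn(), V)
    end
    return acc / n
end

# Same body; the type parameter F forces specialization on typeof(V).
function passthrough_forced(n, V::F) where {F}
    acc = 0.0
    for _ in 1:n
        acc += weight(randn(), V)
    end
    return acc / n
end

U(x) = 0.5 * x * x

# Warm up first so that compilation is not counted as allocation.
passthrough_plain(10, U)
passthrough_forced(10, U)

# Same seed, so both versions see the same random numbers.
Random.seed!(100); r_plain = passthrough_plain(10^4, U)
Random.seed!(100); r_forced = passthrough_forced(10^4, U)

bytes_plain = @allocated passthrough_plain(10^4, U)
bytes_forced = @allocated passthrough_forced(10^4, U)

println("results agree: ", r_plain ≈ r_forced)
println("bytes allocated (plain):  ", bytes_plain)
println("bytes allocated (forced): ", bytes_forced)
```

If the heuristic applies as described, the plain version should report allocations that grow with n, while the forced version should report few or none. The same where {F} annotation can be put on any wrapper that only forwards a function argument.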


See the section "Be aware of when Julia avoids specializing" under Performance Tips in the Julia manual for the doc entry about this.
