Boltzmann Factor in a loop

cstjean · April 1, 2018, 7:56pm

… and types.

cortner · April 1, 2018, 10:49pm

Why - I thought that is not normally needed? My understanding here was that functions are a bit special because of this compiler optimisation? What am I missing?

cstjean · April 2, 2018, 12:14am

cortner · April 2, 2018, 12:17am

I see - thank you!

kristoffer.carlsson · April 2, 2018, 7:52am

I agree that it is a confusing right now. We typically say that function argument types is only used for dispatch and has no effect on performance. This is an exception to that.

cortner · April 2, 2018, 11:29am

It would be good to have clear documentation about this. I guess he issue linked above is about that?

gideonsimpson · April 3, 2018, 5:27pm

I wanted to revisit this topic with a similar piece of code. The following is a minimal example associated with a stochastic sampling method that uses the gradient of the potential, V(x), in its computation. I tried using the where command, and obtained rather disappointing results:

function run0(x0, gradV::gradVf, niterations) where {gradVf}

    X0 = similar(x0);
    gradV0 = similar(x0);
    @. X0 = x0; # copy over initial condition

    for j = 1:niterations
        gradV0 .= gradV(X0);
    end
end

function gradV(x)
    gradV = 4 * (dot(x,x) -1 ) * x;
    return gradV
end

niterations  = 10^6;

X0 = .5 * ones(2);

@time run0(X0, gradV, niterations);

After the second run, of this, I get:

julia> @time run0(X0, gradV, niterations);
  0.070732 seconds (1.00 M allocations: 91.553 MiB, 9.93% gc time)

And the memory usage linearly scales with the number of iterations. The only difference between this and the other example is that gradV is a \mathbb{R}^d\to \mathbb{R}^d function while V was a \mathbb{R}^d\to \mathbb{R} function.

kristoffer.carlsson · April 3, 2018, 5:42pm

But this is just because you are allocating a new vector inside the function in every iteration?

gideonsimpson · April 3, 2018, 5:49pm

Shouldn’t preallocating the gradV0 vector (gradV0 = similar(x0);) and then using the .= assignment operation avoid creating a new vector each time? If not, how would I code this to avoid the creation of a new vector at each iteration?

kristoffer.carlsson · April 3, 2018, 6:05pm

gradV(X0) allocates a vector which is copied to gradV0. You can send in gradV0 to the function and assign to it there.

gideonsimpson · April 3, 2018, 6:10pm

So the C-style solution of passing a pointer to the array that is being populated with new values is the recommended approach for this scenario?

kristoffer.carlsson · April 3, 2018, 6:26pm

Well, if you want to assign in place to a vector then you need to have a handle to that vector.

gideonsimpson · April 3, 2018, 6:40pm

Consider this version of the problem:

using ForwardDiff
function run1(x0, V::Vf, niterations) where {Vf}

    X0 = similar(x0);
    gradV0 = similar(x0);
    @. X0 = x0; 

    for j = 1:niterations
        ForwardDiff.gradient!(gradV0, V, X0);
    end
end

function V(x)
    @fastmath V = (dot(x,x)-1)^2;
    return V
end

niterations  = 10^4;

X0 = .5 * ones(2);

with

julia> @time run1(X0, V, niterations);
  0.059611 seconds (60.01 k allocations: 2.900 MiB, 8.54% gc time)

and when I increase the number of iterations:

julia> niterations  = 10^6;

julia> @time run1(X0, V, niterations);
  5.537944 seconds (6.00 M allocations: 289.917 MiB, 0.43% gc time)

Though perhaps this is a problem with ForwardDiff.

cortner · April 3, 2018, 6:42pm

Maybe try to write an in-place gradient gradV! that doesn’t allocate?

cortner · April 3, 2018, 6:42pm

Or use StaticArrays

kristoffer.carlsson · April 3, 2018, 6:49pm

http://www.juliadiff.org/ForwardDiff.jl/stable/user/api.html#Preallocating/Configuring-Work-Buffers-1

cortner · April 3, 2018, 6:51pm

was about to send the same link

gideonsimpson · April 3, 2018, 6:58pm

That doesn’t really seem to improve things. With

function run1(x0, V::Vf, niterations) where {Vf}

    X0 = similar(x0);
    gradV0 = similar(x0);
    @. X0 = x0;

    gradcfg = ForwardDiff.GradientConfig(V, X0);

    for j = 1:niterations
        ForwardDiff.gradient!(gradV0, V, X0,gradcfg);
    end
end

function V(x)
    @fastmath V = (dot(x,x)-1)^2;
    return V
end

niterations  = 10^6;

X0 = .5 * ones(2);

I get:

julia> @time run1(X0, V, niterations);
  0.477692 seconds (4.00 M allocations: 122.071 MiB, 1.46% gc time)

kristoffer.carlsson · April 3, 2018, 9:30pm

Removing the @slowmath macro makes the allocations go down to 1 per iteration.

To improve the run time slightly we can either create the GradientConfig with an explicit chunk size like:

gradcfg = ForwardDiff.GradientConfig(V, X0, ForwardDiff.Chunk{2}())

or put the creation of the config behind a function-barrier.

julia> @time run1(X0, V, niterations);
  0.300106 seconds (1.00 M allocations: 30.518 MiB, 0.93% gc time)

I’m not sure where the last allocation in each iteration comes from.

Using a StaticArray which doesn’t need any preallocation or configs:

julia> @time run1(rand(SVector{2}), V, niterations);
  0.008152 seconds (17 allocations: 928 bytes)