I have implemented a particle filter for an agent-based model using JuliaDynamics, but I seem to have ended up using deepcopy a lot. Also, when I run the simple particle filter over a very simple model, I don't get very good performance:
julia> @time runPf(inits, init_log_weights);
56.329763 seconds (28.13 M allocations: 3.170 GiB, 1.81% gc time, 0.07% compilation time)
The first thing you should do is remove all global variables. For instance, here l, P, actuals, predicted, Q and R are all globals:
function runPf(inits, init_log_weights)
    for i in 2:l
        (end_states2, logWeights2, inits2, a2) = pf(deepcopy(inits), init_log_weights, P, map(x -> convert(Float64, x), actuals[i]), Q, R);
        predicted[i] = mean(end_states2);
        inits = inits2;
        init_log_weights = logWeights2;
    end
end
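One way to fix that (a sketch only — I don't have your model, so the `pf` below is a trivial stand-in and the argument names are guesses from your snippet) is to pass every formerly-global binding into the function as an argument, so the compiler can infer concrete types for all of them:

```julia
using Statistics

# Placeholder particle-filter step, NOT your real pf: it just shifts the
# particles by the observation and passes the weights through unchanged.
pf(inits, log_w, P, actual, Q, R) = (inits .+ actual, log_w, copy(inits), nothing)

# Everything the loop needs (l, P, actuals, Q, R) comes in as an argument,
# and `predicted` is allocated locally and returned instead of being a global.
function runPf(inits, init_log_weights, l, P, actuals, Q, R)
    predicted = Vector{Float64}(undef, l)
    predicted[1] = mean(inits)
    for i in 2:l
        (end_states2, logWeights2, inits2, _) =
            pf(copy(inits), init_log_weights, P, Float64(actuals[i]), Q, R)
        predicted[i] = mean(end_states2)
        inits = inits2
        init_log_weights = logWeights2
    end
    return predicted
end
```

Alternatively, declaring the globals `const` (or annotating their types) also removes the instability, but passing them explicitly is usually cleaner and makes the function testable on its own.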
You should do some basic profiling first. Help us help you by letting us know which lines take the most time. (Also, try to make your code as small as possible; that increases the likelihood that someone else will read it.)
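For what it's worth, the built-in Profile stdlib is enough for a first pass. A minimal sketch (the `work` function here is just a stand-in — profile your own runPf call instead):

```julia
using Profile

# Hypothetical hot function standing in for runPf; replace with your own call.
work() = sum(sum(exp.(rand(10_000))) for _ in 1:1_000)

work()            # run once first so compilation time is not in the profile
Profile.clear()
@profile work()
Profile.print(format = :flat, sortedby = :count)  # most-sampled lines on top
```

The lines with the most samples are where the time goes; post those and it becomes much easier for others to suggest targeted fixes.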
From my side, what prevents me from running your code is that, as it is, I would need to install many packages which I do not want to. You don't seem to be using all of them in that code. The simpler the example gets, the better the chances of someone actually working on it to show where it can be improved.
In general, after type instabilities are removed, you should chase allocations. Any copy of something mutable of course allocates. But there are other places where you seem to be allocating new arrays. For example, here:
that is allocating 3 arrays: one in log_w .- maximum(log_w), one in the output of the map, and a third in the normalization:
julia> function f(log_w)
           wn = map(x -> exp(x), log_w .- maximum(log_w));
           wn = wn / sum(wn);
           return wn
       end
f (generic function with 1 method)
julia> x = rand(10);
julia> @btime f($x);
147.233 ns (3 allocations: 480 bytes)
If you rewrite that with some care:
julia> function f(log_w)
           wn = exp.(log_w .- maximum(log_w));
           swn = sum(wn)
           wn .= wn ./ swn;
           return wn
       end
f (generic function with 1 method)
julia> @btime f($x);
97.897 ns (1 allocation: 160 bytes)
you can reduce that to one allocation and accelerate it quite a bit. This is just one example, though; profiling is important to know where to spend time optimizing, but only after you have removed the type instabilities. Otherwise the time spent in each function may not be representative of anything related to the algorithms used.
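To hunt down those instabilities, @code_warntype is the standard first tool: it prints the inferred types and flags non-concrete ones. A small illustration of the non-const-global problem discussed above (names are made up for the example):

```julia
Q_global = 0.1              # non-const global: its type could change at any time
unstable(x) = x * Q_global  # so Q_global is inferred as Any inside this function

const R_const = 0.1         # const global: the compiler can rely on its type
stable(x) = x * R_const     # fully inferred, no instability

# In the REPL, compare:
#   @code_warntype unstable(1.0)   # flags Q_global::Any and Body::Any
#   @code_warntype stable(1.0)     # everything inferred as Float64
```

Both functions return the same value; the difference only shows up in inference (and therefore in speed and allocations once such a variable is used in a hot loop).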