Improving runtime

Moslem_Uddin · July 9, 2023, 4:58pm

I’m a new user of Julia. The following code is a line-by-line translation from MATLAB, written to compare runtimes. However, the performance doesn’t seem to improve. I’d be happy to have feedback on why that is the case.

https://github.com/muddin21/RegularizedStokeslet/blob/main/Regularized%20Stokeslet%20in%20JULIA.ipynb

gdalle · July 9, 2023, 5:09pm

At first glance, it seems your code performs a lot of unneeded allocations. For Julia to be fast, you want to avoid allocating new memory when you can reuse it instead.
See the performance tips for more details, especially the sections about

In many cases it will be as easy as replacing

for i in 1:n
    x = y - z
end

with

x = similar(y)  # init: allocate once, before the loop
for i in 1:n
    x .= y .- z
end

whenever you deal with arrays.
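
A minimal sketch of the pattern above, assuming `y` and `z` are same-sized `Float64` vectors. The function names `step_alloc` and `step_inplace!` are made up for illustration and do not come from the notebook. It uses `@allocated` from Base to compare how much memory each version requests.

```julia
# Allocating version: `y - z` creates a new array on every iteration.
function step_alloc(y, z, n)
    x = y - z
    for i in 1:n
        x = y - z
    end
    return x
end

# In-place version: `x` is allocated once by the caller, then overwritten
# element by element through the fused broadcast `.=` / `.-`.
function step_inplace!(x, y, z, n)
    for i in 1:n
        x .= y .- z
    end
    return x
end

y = rand(1000)
z = rand(1000)
x = similar(y)

# Call each function once first so compilation is not counted.
step_alloc(y, z, 1)
step_inplace!(x, y, z, 1)

bytes_alloc = @allocated step_alloc(y, z, 100)
bytes_inplace = @allocated step_inplace!(x, y, z, 100)
println("allocating: ", bytes_alloc, " bytes; in-place: ", bytes_inplace, " bytes")
```

The exact byte counts depend on the Julia version, but the allocating version should grow with `n` while the in-place one should not.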

gdalle · July 9, 2023, 5:14pm

You can diagnose the performance of your code with the @btime macro from BenchmarkTools.jl. Ideally, the number of allocations should not scale with the number of loop iterations. In practice, that’s a lot to ask, so you should focus only on the most critical parts.
How do I recognize these critical parts, you ask? By profiling your code, e.g. with the @profview macro from the VSCode Julia extension.
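
As a rough sketch of that benchmarking step, assuming BenchmarkTools.jl is installed: `sum_diff` and `sum_diff_lazy` are hypothetical stand-ins, not functions from the notebook. Interpolating the arguments with `$` keeps the cost of accessing global variables out of the measurement.

```julia
using BenchmarkTools

# Hypothetical stand-in for a hot function: `y - z` allocates a temporary array.
sum_diff(y, z) = sum(y - z)

# Same value up to floating-point rounding, without the temporary:
# the generator is consumed lazily by `sum`.
sum_diff_lazy(y, z) = sum(a - b for (a, b) in zip(y, z))

y = rand(10_000)
z = rand(10_000)

# `@btime` prints the minimum time plus the allocation count and size.
@btime sum_diff($y, $z)
@btime sum_diff_lazy($y, $z)
```

Comparing the allocation figures of the two lines is usually the quickest way to see whether a rewrite removed the temporaries.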

ufechner7 · July 9, 2023, 5:18pm

Which function do you want to optimize?

fnin · July 11, 2023, 1:59pm

As I understand it, the main work is done in the loop at the end of the notebook.

The loop uses non-constant global variables. If you want Julia to be able to compile efficient code for that last part, you should consider putting it in a function.
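
A small sketch of what that change could look like. The names `data`, `total` and `sum_squares` are invented for illustration and the arithmetic is a placeholder for the notebook's loop body.

```julia
# Script-style loop: `total` and `data` are non-constant globals, so the
# compiler cannot assume their types and falls back to slow, generic code.
data = rand(100_000)
total = 0.0
for v in data
    global total += v^2
end

# Same work inside a function: the argument types are known when the
# function is compiled, so the loop can be specialized for Float64.
function sum_squares(data)
    total = 0.0
    for v in data
        total += v^2
    end
    return total
end

println(total ≈ sum_squares(data))
```

Declaring the globals `const` is another option, but wrapping the loop in a function is usually the simpler fix.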

Moslem_Uddin · July 11, 2023, 2:52pm

I would like to improve the overall performance of the code.

ufechner7 · July 11, 2023, 7:38pm

Well, this is nothing we can help you with. We can help you with improving the performance of a function, if you point out which function should be faster.

gdalle · July 11, 2023, 10:14pm

The advice I gave here should be a good starting point for you! Feel free to come back if there are things you don’t understand.

algunion · July 12, 2023, 12:48am

One important thing I noticed is that you allocate an enormous number of 1-length vectors.

Now, this might make sense in Matlab (where everything is an array), but in Julia, x=1 is very different from x=[1] in terms of memory allocation.

A small change from X_s_distance2 = (X_s[1] .^ 2 .+ X_s[2] .^ 2) to X_s_distance2 = (X_s[1] ^ 2 + X_s[2] ^ 2) alone drops the @btime result from 18 seconds to almost 16 seconds on my machine (yes, this X_s_distance2 is one of your one-element vectors).

After doing this for a few more one-element vectors, I managed to reach 12 seconds - and I could continue doing this - but I think you got the point.
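
To illustrate the difference, here is a hedged sketch with made-up helper names (`dist2_vector`, `dist2_scalar`, and the two `accumulate_*` functions). Only `X_s` borrows its name from the notebook, and its value here is invented.

```julia
# Hypothetical point; in the notebook the corresponding variable is `X_s`.
X_s = [3.0, 4.0]

# MATLAB-style: the brackets create a 1-element Vector (normally a heap allocation).
dist2_vector(p) = [p[1]^2 + p[2]^2]

# Julia-style: a plain Float64, which needs no heap allocation.
dist2_scalar(p) = p[1]^2 + p[2]^2

function accumulate_vector(p, n)
    acc = 0.0
    for _ in 1:n
        acc += dist2_vector(p)[1]
    end
    return acc
end

function accumulate_scalar(p, n)
    acc = 0.0
    for _ in 1:n
        acc += dist2_scalar(p)
    end
    return acc
end

accumulate_vector(X_s, 1); accumulate_scalar(X_s, 1)  # warm up (compile)
println("vector: ", @allocated(accumulate_vector(X_s, 10_000)), " bytes")
println("scalar: ", @allocated(accumulate_scalar(X_s, 10_000)), " bytes")
```

Both functions compute the same number; only the container around it differs.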

So I don’t think your code is really a line-by-line translation to Julia, especially because you are forcing this “everything is an array” philosophy onto Julia.

You might also be interested in reading Noteworthy Differences from other Languages · The Julia Language.

Equally important, pay attention to the performance advice that others have already given. For example, in your velocityRS function, you allocate u = zeros(2, length(s)) and int_u = zeros(size(u)) each time you call the function (and you call it from a for loop). On my machine, preallocating u and int_u cuts the execution time by almost another 2 seconds. And there is still room to fix many things (e.g., an enormous number of allocations remain).

Have fun.

Moslem_Uddin · July 14, 2023, 1:05am

Thank you for your comment. Even without pre-allocation, the runtime seems to be improved. Could you please show an example of the pre-allocation of one variable from the velocityRS function? I’m a little confused about this.

algunion · July 14, 2023, 10:55am

Please take a look here - this was already posted by @gdalle before.

This is one of the specific performance tips that you can apply to your velocityRS function: you allocate your output u each time you call the function inside your loop. Imagine the alternative: preallocate u before your outer loop, then pass u to velocityRS, which fills/mutates the array as needed.
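
A minimal sketch of that idea. `velocity_alloc` and `velocity!` are hypothetical stand-ins for `velocityRS`; their signatures are assumptions, and the `cos`/`sin` arithmetic is a placeholder, not the regularized Stokeslet formula from the notebook.

```julia
# Hypothetical stand-in for the current `velocityRS`: allocates its output.
function velocity_alloc(s)
    u = zeros(2, length(s))      # new array on every call
    for j in eachindex(s)
        u[1, j] = cos(s[j])
        u[2, j] = sin(s[j])
    end
    return u
end

# Mutating variant: the caller owns `u`. The `!` suffix is the Julia
# convention for functions that modify their arguments.
function velocity!(u, s)
    for j in eachindex(s)
        u[1, j] = cos(s[j])
        u[2, j] = sin(s[j])
    end
    return u
end

s = range(0, 2pi; length=200)
u = zeros(2, length(s))          # allocated once, before the outer loop
for step in 1:1000
    velocity!(u, s)              # overwrites `u` instead of allocating
end
```

The same treatment would apply to `int_u`: pass it in as a second preallocated buffer rather than calling `zeros(size(u))` inside the function.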

I am not saying this particular step will significantly improve your performance (because I think the bulk of the allocations happen all over the place because of those unneeded 1-length vectors).

Another important thing: if you know that you will only need a very small container whose size is known at compile time, you could use tuples instead of vectors ((1,2) vs. [1,2]), especially in your nested loops. That makes a big difference in memory allocation (and garbage collection) and consequently improves execution time.
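
A hedged sketch of the tuple suggestion, using invented functions (`sum_norms_vector`, `sum_norms_tuple`) rather than code from the notebook:

```julia
# Vector version: the array literal builds a heap-allocated Vector in every iteration.
function sum_norms_vector(n)
    total = 0.0
    for i in 1:n, j in 1:n
        d = [float(i), float(j)]
        total += sqrt(d[1]^2 + d[2]^2)
    end
    return total
end

# Tuple version: the length is part of the type, so the compiler can keep
# the two values in registers or on the stack.
function sum_norms_tuple(n)
    total = 0.0
    for i in 1:n, j in 1:n
        d = (float(i), float(j))
        total += sqrt(d[1]^2 + d[2]^2)
    end
    return total
end

sum_norms_vector(2); sum_norms_tuple(2)  # warm up (compile)
println("vector: ", @allocated(sum_norms_vector(100)), " bytes")
println("tuple:  ", @allocated(sum_norms_tuple(100)), " bytes")
```

Tuples are immutable, so this only fits values that are built once and read afterwards. For small fixed-size arrays that need array arithmetic, the StaticArrays.jl package is a common alternative.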
