# How can I improve the performance of the code?

I’m new to Julia, and I want to improve the performance of code that involves a large number of iterations and calculations.

I have read the manual, but I was not able to apply all of its suggestions…

Here is a simplified version of the code.

``````
using DelimitedFiles
using Plots

global main=reshape(main,150,27,55)

function Iteration()
#Temporary parameter
Pres=760; T1=500; T2=500; T3=5000;
experiment=ifelse.(isnan.(experiment), 0, experiment)
main_raw=experiment[:,1]; I_raw=experiment[:,2];
main_start=findmin(main_raw); main_end=findmax(main_raw);
main_start=main_start[1]; main_end=main_end[1]
if main_start<300; main_start=300; end
if main_end>1200; main_end=1200; end
main_div=trunc(Int,(100*(main_end-main_start)))

#call parameters
a1=para_A2C[:,1]; Ya1=para_A2C[:,2]; Ba1=para_A2C[:,3]; Da1=para_A2C[:,4]; wa1=para_A2C[:,5]; wb1=para_A2C[:,6]; wc1=para_A2C[:,7]; Ea1=para_A2C[:,8]; JG1=para_A2C[:,9]; DAN1=para_A2C[:,10];
a2=para_A2B[:,1]; Ya2=para_A2B[:,2]; Ba2=para_A2B[:,3]; Da2=para_A2B[:,4]; wa2=para_A2B[:,5]; wb2=para_A2B[:,6]; wc2=para_A2B[:,7]; Ea2=para_A2B[:,8]; JG2=para_A2B[:,9]; DAN2=para_A2B[:,10];

main_div=1000; #temporary
#### sub-coefficient calculation ####
Rv=Array{Float64,2}(undef,5,11);
Gv=Array{Float64,2}(undef,5,11);
I_sharp=Array{Float64,2}(undef,150,27);

for A_2C=1:5
    for A_2B=1:11
        Rv[A_2C,A_2B]=(wa1[A_2C]*(a1[A_2C]+0.5)-wb1[A_2C]*(a1[A_2C]+0.5)^2+wc1[A_2C]*(a1[A_2C]+0.5)^3)-(wa2[A_2B]*(a2[A_2B]+0.5)-wb2[A_2B]*(a2[A_2B]+0.5)^2+wc2[A_2B]*(a2[A_2B]+0.5)^3);
        Gv[A_2C,A_2B]=wa1[A_2C]*(a1[A_2C]+0.5)-wb1[A_2C]*(a1[A_2C]+0.5)^2+((a1[A_2C]+0.5)^3);
    end
end

main_start=390; main_end=400; main_div=1000; #temporary
main_shift=zeros(55); #temporary
main_cal=LinRange(main_start,main_end,main_div); k=1:1:main_div

# 'crazy bottle neck'!!!
for A_2C=1:5  #A_2C 5
    for A_2B=1:5 #A_2B 11
        numb_v=(A_2B+11*(A_2C-1))
        for branch=1:27
            for J1=1:150 # 150
                I_sharp[J1,branch]= Re[A_2C,A_2B]^2+*(1/(main[J1,branch,numb_v].-main_shift[numb_v])^4)*CAR[A_2C,A_2B]*MainCar[J1*exp(-1/T2)
            end
        end
    end
end
end
``````

In particular, most of the time is spent in the section I marked as ‘crazy bottle neck’ (the four nested `for` loops).

The first thing is NOT to use global variables. Define them as `const` or, better, pass them as arguments to your function.


As @ufechner7 mentioned, the main issue with this code is the use of the global variable `main`. The easiest way to fix this is to pass it in as an argument to the function.

``````
function Iteration(main)
    # ... body unchanged, now using the argument `main` instead of the global ...
end
``````

And then call it

``````
using DelimitedFiles

main=readdlm("main.txt");
main=reshape(main,150,27,55)
Iteration(main)
``````

You can add `@inbounds` to the outermost `for` loop to turn off bounds checking, which speeds up the code. I would only recommend this once you are sure the code works, because an out-of-bounds access will then no longer raise an error.
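As a rough sketch of what that looks like (the function `sum_inv4` and the random stand-in data are made up for illustration, they are not from the original code):

```julia
# Hypothetical helper: sums 1/x^4 over a 3-D array, similar in spirit to the
# inner expression of the bottleneck loop.
function sum_inv4(A::Array{Float64,3})
    s = 0.0
    # @inbounds on the outermost loop covers every index expression inside it.
    @inbounds for k in axes(A, 3)
        for j in axes(A, 2)
            for i in axes(A, 1)      # first index innermost: column-major order
                s += 1 / A[i, j, k]^4
            end
        end
    end
    return s
end

A = rand(150, 27, 55) .+ 1.0         # stand-in data, shifted away from zero
println(sum_inv4(A))
```

Because the loop ranges come from `axes(A, ...)`, the indices cannot go out of bounds here, which is the situation where dropping the checks is safe.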

There are also packages like `LoopVectorization.jl` that provide macros for speeding up loops, which might be worth checking out.

There are likely other issues but the biggest problem is the global variable.

The reason global variables are bad for performance is that compilation is done on a function basis. When your `Iteration` function is compiled, the compiler assumes the global variable `main` can change its type at any time. So in your case, every time `main` is accessed in the inner loop, a check is done to find the type of the variable, and the right way to index it is looked up. This kills performance very efficiently in your otherwise quite clean loop.
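A small sketch that should make the difference visible (the names `data`, `total_global` and `total_arg` are hypothetical, and `data` is just random stand-in data for `main`). The two functions are identical except for where the array comes from:

```julia
data = rand(150, 27, 55)   # non-const global: its type could change at any time

# Reads the global directly, so each access has to be resolved at run time.
function total_global()
    s = 0.0
    for x in data
        s += x
    end
    return s
end

# Receives the array as an argument, so the compiler knows its concrete type.
function total_arg(A)
    s = 0.0
    for x in A
        s += x
    end
    return s
end

total_global(); total_arg(data)    # run once so compilation is not measured
println("global:   ", @allocated(total_global()), " bytes allocated")
println("argument: ", @allocated(total_arg(data)), " bytes allocated")
```

On a typical setup the version that reads the global should report far more allocated memory than the one that takes an argument, although the exact numbers depend on the Julia version.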

I had previously included the variable `main` inside the function Iteration(). Compared to that version, the code I uploaded is about 20% faster.
Could you give me an example?

Can you share the file main.txt? Then we could play a bit with your code and try to reproduce your issue.

Sorry for the late reply. I could not find a way to upload the text file. However, from a programming point of view it would be no different from rand(222750). Over 5 million allocations occur in my code, and 4 million of them occur in the ‘crazy bottle neck’.
In addition, I have another question: does ideal code show a low number of allocations (close to 0)?
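For anyone who wants to reproduce this without the file, the stand-in data could be built roughly like this (a sketch, assuming that uniformly random values are an acceptable replacement for the real contents of main.txt):

```julia
# Stand-in for main.txt: 222750 random values, reshaped as in the original code.
# Assumption: the actual values do not matter for timing and allocation tests.
main = reshape(rand(222750), 150, 27, 55)
println(size(main))
```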

Zero allocations are usually hard to achieve and only in rare cases worth the effort, but reducing allocations by a factor of 5 to 10 is often not too hard by pre-allocating arrays and using views, in-place operators and in-place functions.
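A minimal sketch of those three techniques together (the functions `colsums_alloc` and `colsums!`, the buffer names and the stand-in data are all hypothetical, not taken from the original code):

```julia
# Allocating version: `A[:, j]` copies the column, and the broadcast
# expression creates another temporary array on every iteration.
function colsums_alloc(A, shift)
    out = zeros(size(A, 2))
    for j in axes(A, 2)
        col = A[:, j]
        out[j] = sum(1 ./ (col .- shift) .^ 4)
    end
    return out
end

# In-place version: `out` and `buf` are pre-allocated by the caller,
# `@view` avoids the column copy, and `.=` writes into `buf`.
function colsums!(out, buf, A, shift)
    for j in axes(A, 2)
        col = @view A[:, j]
        buf .= 1 ./ (col .- shift) .^ 4
        out[j] = sum(buf)
    end
    return out
end

A = rand(150, 27) .+ 2.0             # stand-in data, kept away from `shift`
out = zeros(27); buf = zeros(150)    # allocated once, reused on every call
colsums_alloc(A, 1.0); colsums!(out, buf, A, 1.0)   # warm-up runs
println("allocating: ", @allocated(colsums_alloc(A, 1.0)), " bytes")
println("in-place:   ", @allocated(colsums!(out, buf, A, 1.0)), " bytes")
```

The in-place version should report far fewer allocated bytes. Whether it reaches exactly zero depends on the Julia version and on how it is called.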

Did you try to compress it first and then upload it as a gist at https://gist.github.com?
Documentation: Creating gists - GitHub Docs

Maybe I should pre-allocate the arrays and use views.