How I can change script for speed up

Hi Experts ,I’m just start studing Julia. But I’m in truble.
Berrow 2D turblence simulation code is that I translated from Matlab. But It is not working well, very slow and use a lot of memory so I couldn’t run 3D version.
I would like to borrow your help.

using FFTW
using Printf
using LinearAlgebra
using Plots
using MAT

set parameters

N = 128 ; # N*N grid,
dt = 0.02 ; # timestep
simutimeSeconds = 10 ;

HowOftenSave = 10 ; # save velocity field per ‘HowOftenSave’ timesteps
HowOftenVisu = 10 ; # show illustration per ‘HowOftenVisu’ timesteps
simutimeSteps = round(simutimeSeconds/dt);

nu = 1/1000 ; # kinematic viscosity [nu] = m^2/s
L = 2*pi ; # domain side length
dx = L/N ; # grid spacing assumed dx=dy

Umax = 1 ;
F_0 = 0.03 ; ;

create fields

x = (L-dx)(0:(N-1))/(N-1) ; # to be precise, dx is subtracted here
y = (L-dx)
(0:(N-1))/(N-1) ; # otherwise the derivative of e.g. sin(x) wouldnt be differentiable on this periodic grid
X=zeros(N,N);
Y=zeros(N,N);
for jx=1:N
for ix=1:N
X[ix,jx]=x[jx]
Y[ix,jx]=y[ix]
end
end

wavenumbers

kx1 = mod.(1/2 .+ (0:(N-1))/N, 1) .- 1/2;
ky1 = mod.(1/2 .+ (0:(N-1))/N, 1) .- 1/2;
kx = kx1*(2pi/dx); # wavenumbers
ky = ky1
(2*pi/dx); # wavenumbers

KX=zeros(N,N);
KY=zeros(N,N);
for jkx=1:N
for ikx=1:N
KX[ikx,jkx]=kx[jkx]
KY[ikx,jkx]=ky[ikx]
end
end

Anti-aliasing filter based on the 2/3*kNyq rule

AA = (abs.(KX) .< (2/3) * maximum(kx)) .* (abs.(KY) .< (2/3) * maximum(ky))

Runge-Kutta 4 coefficients

a=[1/6 1/3 1/3 1/6]; b=[1/2 1/2 1 1];

#-- Taylor-Green --#
kmax = 4;
U = -Umax .* cos.(kmax .* X) .* sin.(kmax .* Y)
V = Umax .* sin.(kmax .* X) .* cos.(kmax .* Y)
U = U + randn(N,N)*F_0;
V = V + randn(N,N)*F_0;

Uhat = fft(U)
Vhat = fft(V)

Uhat = Uhat - (KX .* Uhat + KY .* Vhat) .* KX ./ (KX .^ 2 + KY .^ 2)
Uhat[isnan.(Uhat)] .= 0

Vhat = Vhat - (KX .* Uhat + KY .* Vhat) .* KY ./ (KX .^ 2 + KY .^ 2)
Vhat[isnan.(Vhat)] .= 0

matwrite(“uv0000.mat”, Dict(“U” => U, “V” => V));

Navier-Stokes simulation begins

@elapsed begin
for t=1:simutimeSteps
@printf(“| Time step # %0.4d |\n”,t)

#SolveNavierStokes2D;
Uold = Uhat; 
Vold = Vhat;
Uc   = Uhat; 
Vc   = Vhat;

for rk in 1:4
    U = real.(ifft(AA .* Uc))
    V = real.(ifft(AA .* Vc))

    dUdx = real.(ifft(1im * KX .* AA .* Uc));
    dUdy = real.(ifft(1im * KY .* AA .* Uc));
    dVdx = real.(ifft(1im  *KX .* AA .* Vc));
    dVdy = real.(ifft(1im * KY .* AA .* Vc));

    dUconv = fft(dt*(-U .* dUdx -V .* dUdy));
    dVconv = fft(dt*(-U .* dVdx -V .* dVdy));

    dUdiff = nu * dt * (-KX .* KX .* Uc - KY .* KY .* Uc);
    dVdiff = nu * dt * (-KX .* KX .* Vc - KY .* KY .* Vc);

    dU = dUconv + dUdiff;
    dV = dVconv + dVdiff;

    Uhat = Uhat-(KX.*Uhat + KY.*Vhat).*KX./(KX.^2 + KY.^2); Uhat[isnan.(Uhat)] .= 0;
    Vhat = Vhat-(KX.*Uhat + KY.*Vhat).*KY./(KX.^2 + KY.^2); Vhat[isnan.(Vhat)] .= 0;

    if rk < 4
        Uhat = Uold + b[rk] * dU
        Vhat = Vold + b[rk] * dV
    end        

    Uc = Uc + a[rk] * dU
    Vc = Vc + a[rk] * dV


end

Uhat = Uc
Vhat = Vc

Uhat = Uhat - (KX .* Uhat + KY .* Vhat) .* KX ./ (KX .^ 2 + KY .^ 2)
Uhat[isnan.(Uhat)] .= 0

Vhat = Vhat - (KX .* Uhat + KY .* Vhat) .* KY ./ (KX .^ 2 + KY .^ 2)
Vhat[isnan.(Vhat)] .= 0

U = real(ifft(Uhat));  
V = real(ifft(Vhat)); 


# SaveInstantField2D
if mod(t, HowOftenSave) == 0
filename = "uv$(@sprintf("%04d", t)).mat"
matwrite(filename, Dict("U" => U, "V" => V))
end

end
end

It appears that your code is in global scope which is very slow. Put your performance critical code inside a function as recommended in the Performance tips section of the manual.

Appart from that please format your code by putting it inside a code block

```julia
code
```

7 Likes

don’t roll your own rk4. use an integrated from DifferentialEquations.jl

5 Likes

What can I do DifferentialEquations.jl? What is difference own rk4 and DifferentialEquations.jl rk4. If you know something about Navier-Stokes quation with DifferentialEquations.jl,Please teach me about them.

the big difference is that by using a library it Will be a single line of code to switch to a good algorithm (e.g. tsit5).

I don’t understand your first question.

2 Likes