# Finite difference stencils in 2D with mixed orders

Apologies if this topic has been covered or if there is a good guide for this (I have been looking for a while, including at source code for more advanced packages but I can’t wrap my head around if/how they do this sort of thing). I’m not “new” to using Julia (been using it for a few years), but I am new to using Julia “properly”, i.e. writing good code instead of just solving small problems.

I am writing a finite difference code for 2D problems, ideally that other people would be able to use. At the moment I have a loop that computes the second order FD stencil in 2D by looping over the internal nodes:

``````function SecondDerivative!(uₓₓ::Array,u::Array,Δx::Real,Δy::Real,Index::CartesianIndices)

offx = CartesianIndex(1,0)
offy = CartesianIndex(0,1)

for I in Index
uₓₓ[I] =
(u[I-offx] + 2*u[I] + u[I+offx])/(2.0Δx^2) +
(u[I-offy] + 2*u[I] + u[I+offy])/(2.0Δy^2)
end
uₓₓ
end
``````

The way I’ve written it so far is that when the solver is called, I generate a new function,

``````D_xx(cache,U) = SecondDerivative!(cache,U,dx,dy,Index)
``````

for calling inside the time loop (since the grid size and number of nodes doesn’t change). But I won’t always want to compute a 5 point stencil. What if I want to be able to use a 2nd order in x and a 4th order in y? For instance I might have these two functions that I want to combine:

``````function SecondDerivative_Order2(u,Δx)
return (u[j-1] - 2u[j] + u[j+1])/Δx^2
end
function SecondDerivative_Order4(u,Δx)
(1/12*u[j-2] + -4/3*u[j-1] + 5/2*u[j] - 4/3u[j+1] + 1/12*u[j+2])/Δx^2
end
``````

This seems like a job for metaprogramming but I can’t work out how this actually works to combine code snippets.

You can call the desired functions for a specific slice of the array, e.g.:

``````function D_xx(cache, U)
for iy=1:Ny
@views SecondDerivative_Order2(cache[:,iy], dx)
end
for ix=1:Nx
@views SecondDerivative_Order4(cache[ix,:], dy)
end
return nothing
end
``````

The general approach, I guess, would be with pre and pos CartesianIndices:

``````daxis = 2    # axis to calculate the derivative

s = size(u)
Ipre = CartesianIndices(s[1:daxis-1])
Ipos = CartesianIndices(s[daxis+1:end])

for j=2:s[daxis]-1
u[Ipre,j,Ipos] = (u[Ipre,j-1,Ipos] - 2u[Ipre,j,Ipos] + u[Ipre,j+1,Ipos]) / dx^2
end
``````

Thanks for the reply. Wouldn’t the `D_xx` function be slow given it loops over the array twice rather than in a single loop? Or is this something the compiler will deal with?

A previous approach of mine was to call the stencils in a for loop (although this was before I was creating new functions with `dx` already defined), and the result was much slower than looping over the stencil.

In your original code the loop goes over 2D CartesianIndex which corresponds to a nested loop over first and second axis. Therefore, the question is about the speed of a single nested loop versus two separate loops along each of the axes. The number of operations seems the same. Though, some slowdown can be due to sub-optimal memory access.
You can always compare the speed experimentally using BenchmarkTools package.

I tested (and retested in 1 or 2 cases) a number of different methods only using the 2nd order 5 point stencil below is a snippet of a few of them. Unless I’m calling `@benchmark` incorrectly, then by far the fastest method is the original, double nested loop with the 5 point stencil, which at least with the implementation below seems to be about 5x faster with a mean time of 4.012μs versus the second method with a mean time of 25.544μs. Having said that I’m sure I’ve probably written things sub-optimally.

Thanks for your help so far @fedoroff .

``````using BenchmarkTools

n = 41
D = [0.0,1.0]
dx = D/(n-1)
grid = range(0.0,1.0,length=n)

u = [x^2 for x in grid, y in grid]

#===== 2D STENCIL ======#
function SecondDerivative!(uₓₓ::Array,u::Array,Δx::Real,Δy::Real,Index::CartesianIndices)

offx = CartesianIndex(1,0)
offy = CartesianIndex(0,1)

for I in Index
uₓₓ[I-offx-offy] =
(u[I-offx] + 2*u[I] + u[I+offx])/(2.0Δx^2) +
(u[I-offy] + 2*u[I] + u[I+offy])/(2.0Δy^2)
end
return nothing
end

Index = CartesianIndices(u)

InternalIndex = CartesianIndices((Index.indices[2:end-1],Index.indices[2:end-1]))

FixedSecondDerivative(uxx,u) = SecondDerivative!(uxx,u,dx,dx,InternalIndex)

#===== LOOP PER DIRECTION ======#
function SecondDerivative_Order2(cache,u,Δx)
cache .= (u[1:end-2] .- 2u[2:end-1] .+ u[3:end])./Δx^2
end

FixedSecondDerivative_Order2(cache,u) = SecondDerivative_Order2(cache,u,dx)

function D_xx(cache, U,Nx,Ny)
for iy=1:Ny-2
@views FixedSecondDerivative_Order2(cache[:,iy],U[:,iy+1])
end
for ix=1:Nx-2
@views FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])
end
return nothing
end

FixedD_xx(uxx,u) = D_xx(uxx,u,n,n)

#===== PER DIRECTION WITHOUT CALLING FUNCTIONS ======#
function D_xx_loops(cache, U,Nx,Ny,dx,dy)
for iy=1:Ny-2
cache[:,iy] .=  (u[1:end-2,iy] .- 2u[2:end-1,iy] .+ u[3:end,iy])./dx^2
end
for ix=1:Nx-2
cache[ix,:] .+=  (u[ix,1:end-2] .- 2u[ix,2:end-1] .+ u[ix,3:end])./dy^2
FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])
end
return nothing
end

FixedD_xx_loops(uxx,u) = D_xx_loops(uxx,u,n,n,dx,dx)

#===== FUNCTION PER NODE ======#
function SecondDerivativeSingle(u,dx)
return (u - 2u + u)/dx^2
end

FixedSecondDerivativeSingle(u) = SecondDerivativeSingle(u,dx)

function D_xx_stencilloop(cache,U,Nx,Ny,stencil1,stencil2)
for i = 1:Nx-2
for j = 1:Nx-2
cache[i,j] = stencil1(u[i,j:j+2]) + stencil2(u[i:i+2,j])
end
end
end

Fixed_D_xx_stencilloop(uxx,u) = D_xx_stencilloop(uxx,u,n,n,FixedSecondDerivativeSingle,FixedSecondDerivativeSingle)

#= Allocate solution vectors n-2 for internal nodes only =#
uxx1 = zeros(n-2,n-2);
uxx2 = zeros(n-2,n-2);
uxx3 = zeros(n-2,n-2);
uxx4 = zeros(n-2,n-2);

#= Benchmarking =#
@benchmark FixedSecondDerivative(uxx1,u)

@benchmark FixedD_xx(uxx2,u)

@benchmark FixedD_xx_loops(uxx3,u)

@benchmark Fixed_D_xx_stencilloop(uxx4,u)
``````

If you run your tests in global scope, your need to interpolate the variables to avoid measuring the compilation time, e.g. (see the `\$` symbols),

``````@benchmark FixedSecondDerivative(\$uxx1, \$u)
``````

Also, to avoid creation of temporary arrays, when you take the slice of an array, it useful to take its view instead, e.g.

``````cache[i,j] = @views stencil1(u[i,j:j+2]) + stencil2(u[i:i+2,j])
``````
``````@views FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])
``````

It will greatly increase the speed.

1 Like