Finite difference stencils in 2D with mixed orders

Apologies if this topic has been covered or if there is a good guide for this (I have been looking for a while, including at source code for more advanced packages but I can’t wrap my head around if/how they do this sort of thing). I’m not “new” to using Julia (been using it for a few years), but I am new to using Julia “properly”, i.e. writing good code instead of just solving small problems.

I am writing a finite difference code for 2D problems, ideally that other people would be able to use. At the moment I have a loop that computes the second order FD stencil in 2D by looping over the internal nodes:

function SecondDerivative!(uₓₓ::Array,u::Array,Δx::Real,Δy::Real,Index::CartesianIndices)

    offx = CartesianIndex(1,0)
    offy = CartesianIndex(0,1)

    for I in Index
    uₓₓ[I] = 
            (u[I-offx] + 2*u[I] + u[I+offx])/(2.0Δx^2) +
            (u[I-offy] + 2*u[I] + u[I+offy])/(2.0Δy^2)
    end
    uₓₓ
end

The way I’ve written it so far is that when the solver is called, I generate a new function,

D_xx(cache,U) = SecondDerivative!(cache,U,dx,dy,Index)

for calling inside the time loop (since the grid size and number of nodes doesn’t change). But I won’t always want to compute a 5 point stencil. What if I want to be able to use a 2nd order in x and a 4th order in y? For instance I might have these two functions that I want to combine:

function SecondDerivative_Order2(u,Δx)
   return (u[j-1] - 2u[j] + u[j+1])/Δx^2
end
function SecondDerivative_Order4(u,Δx)
  (1/12*u[j-2] + -4/3*u[j-1] + 5/2*u[j] - 4/3u[j+1] + 1/12*u[j+2])/Δx^2
end

This seems like a job for metaprogramming but I can’t work out how this actually works to combine code snippets.

You can call the desired functions for a specific slice of the array, e.g.:

function D_xx(cache, U)
    for iy=1:Ny
        @views SecondDerivative_Order2(cache[:,iy], dx)
    end
    for ix=1:Nx
        @views SecondDerivative_Order4(cache[ix,:], dy)
    end
    return nothing
end

The general approach, I guess, would be with pre and pos CartesianIndices:

daxis = 2    # axis to calculate the derivative

s = size(u)
Ipre = CartesianIndices(s[1:daxis-1])
Ipos = CartesianIndices(s[daxis+1:end])

for j=2:s[daxis]-1
    u[Ipre,j,Ipos] = (u[Ipre,j-1,Ipos] - 2u[Ipre,j,Ipos] + u[Ipre,j+1,Ipos]) / dx^2
end 

Thanks for the reply. Wouldn’t the D_xx function be slow given it loops over the array twice rather than in a single loop? Or is this something the compiler will deal with?

A previous approach of mine was to call the stencils in a for loop (although this was before I was creating new functions with dx already defined), and the result was much slower than looping over the stencil.

In your original code the loop goes over 2D CartesianIndex which corresponds to a nested loop over first and second axis. Therefore, the question is about the speed of a single nested loop versus two separate loops along each of the axes. The number of operations seems the same. Though, some slowdown can be due to sub-optimal memory access.
You can always compare the speed experimentally using BenchmarkTools package.

I tested (and retested in 1 or 2 cases) a number of different methods only using the 2nd order 5 point stencil below is a snippet of a few of them. Unless I’m calling @benchmark incorrectly, then by far the fastest method is the original, double nested loop with the 5 point stencil, which at least with the implementation below seems to be about 5x faster with a mean time of 4.012μs versus the second method with a mean time of 25.544μs. Having said that I’m sure I’ve probably written things sub-optimally.

Thanks for your help so far :slightly_smiling_face: @fedoroff .

using BenchmarkTools

n = 41
D = [0.0,1.0]
dx = D[2]/(n-1)
grid = range(0.0,1.0,length=n)

u = [x^2 for x in grid, y in grid]



#===== 2D STENCIL ======#
function SecondDerivative!(uₓₓ::Array,u::Array,Δx::Real,Δy::Real,Index::CartesianIndices)

    offx = CartesianIndex(1,0)
    offy = CartesianIndex(0,1)

    for I in Index
    uₓₓ[I-offx-offy] = 
            (u[I-offx] + 2*u[I] + u[I+offx])/(2.0Δx^2) +
            (u[I-offy] + 2*u[I] + u[I+offy])/(2.0Δy^2)
    end
    return nothing
end

Index = CartesianIndices(u)

InternalIndex = CartesianIndices((Index.indices[1][2:end-1],Index.indices[2][2:end-1]))

FixedSecondDerivative(uxx,u) = SecondDerivative!(uxx,u,dx,dx,InternalIndex)


#===== LOOP PER DIRECTION ======#
function SecondDerivative_Order2(cache,u,Δx)
   cache .= (u[1:end-2] .- 2u[2:end-1] .+ u[3:end])./Δx^2
end

FixedSecondDerivative_Order2(cache,u) = SecondDerivative_Order2(cache,u,dx)

function D_xx(cache, U,Nx,Ny)
    for iy=1:Ny-2
        @views FixedSecondDerivative_Order2(cache[:,iy],U[:,iy+1])
    end
    for ix=1:Nx-2
        @views FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])
    end
    return nothing
end

FixedD_xx(uxx,u) = D_xx(uxx,u,n,n)


#===== PER DIRECTION WITHOUT CALLING FUNCTIONS ======#
function D_xx_loops(cache, U,Nx,Ny,dx,dy)
    for iy=1:Ny-2
		cache[:,iy] .=  (u[1:end-2,iy] .- 2u[2:end-1,iy] .+ u[3:end,iy])./dx^2
    end
    for ix=1:Nx-2
		cache[ix,:] .+=  (u[ix,1:end-2] .- 2u[ix,2:end-1] .+ u[ix,3:end])./dy^2
        FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])
    end
    return nothing
end

FixedD_xx_loops(uxx,u) = D_xx_loops(uxx,u,n,n,dx,dx)


#===== FUNCTION PER NODE ======#
function SecondDerivativeSingle(u,dx)
	return (u[1] - 2u[2] + u[3])/dx^2
end

FixedSecondDerivativeSingle(u) = SecondDerivativeSingle(u,dx)

function D_xx_stencilloop(cache,U,Nx,Ny,stencil1,stencil2)
	for i = 1:Nx-2
		for j = 1:Nx-2
			cache[i,j] = stencil1(u[i,j:j+2]) + stencil2(u[i:i+2,j])
		end
	end
end

Fixed_D_xx_stencilloop(uxx,u) = D_xx_stencilloop(uxx,u,n,n,FixedSecondDerivativeSingle,FixedSecondDerivativeSingle)


#= Allocate solution vectors n-2 for internal nodes only =#
uxx1 = zeros(n-2,n-2);
uxx2 = zeros(n-2,n-2);
uxx3 = zeros(n-2,n-2);
uxx4 = zeros(n-2,n-2);


#= Benchmarking =#
@benchmark FixedSecondDerivative(uxx1,u)

@benchmark FixedD_xx(uxx2,u)

@benchmark FixedD_xx_loops(uxx3,u)

@benchmark Fixed_D_xx_stencilloop(uxx4,u)

If you run your tests in global scope, your need to interpolate the variables to avoid measuring the compilation time, e.g. (see the $ symbols),

@benchmark FixedSecondDerivative($uxx1, $u)

Also, to avoid creation of temporary arrays, when you take the slice of an array, it useful to take its view instead, e.g.

cache[i,j] = @views stencil1(u[i,j:j+2]) + stencil2(u[i:i+2,j])
@views FixedSecondDerivative_Order2(cache[ix,:],U[ix+1,:])

It will greatly increase the speed.

1 Like