Fast Interpolations

I need linear interpolation on a grid to be fast and to work with ForwardDiff. The Interpolations.jl package appears to be extremely fast (especially for a large number of interpolations); however, it is not compatible with ForwardDiff duals. Since my case is extremely simple, I was hoping to write a simple grid interpolation function with similar performance that would accept duals, but I can't come anywhere close to matching the performance of Interpolations.jl. Does anyone see what I'm doing wrong that makes my version perform so poorly by comparison?

For a trivial example, the respective performances for my own code and Interpolations.jl are:

my code:            26.290 ms (200002 allocations: 3.81 MiB)
Interpolations.jl:   2.776 ms (4 allocations: 781.36 KiB)

Code below:

using Interpolations

function make_my_interp(ms,lb,ub,ys)
    xs=lb:ms:ub
    function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        return ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
    end
    return x -> fxn(x)
end

lb=-10.0
ub=10.0
ms = .1
yvals = [xx^2 for xx in lb:ms:ub]
my_interp = make_my_interp(ms,lb,ub,yvals)

function vector_interp(N)
	x=zeros(N)
    @inbounds for ii=1:N
        x[ii] = my_interp(3.05)
    end
    return x
end

pkg_interp = LinearInterpolation(lb:ms:ub,yvals)

N=100_000
v_input=3.05 .* ones(N)
@btime vector_interp(N)
@btime pkg_interp(v_input)

Any advice on how I can approach Interpolations.jl performance and/or use ForwardDiff duals with Interpolations.jl would be greatly appreciated. Thank you!

You use a lot of non-const global variables; using those will skew your benchmark:

lb=-10.0
ub=10.0
ms = .1
yvals = [xx^2 for xx in lb:ms:ub]
my_interp = make_my_interp(ms,lb,ub,yvals)

Set all of those to const, and the timings go from

julia> @btime vector_interp(N)
  7.489 ms (199491 allocations: 3.81 MiB)

julia> @btime pkg_interp(v_input)
  981.095 μs (2 allocations: 781.33 KiB)

to

julia> @btime vector_interp(N);
  583.640 μs (2 allocations: 781.33 KiB)

julia> @btime pkg_interp(v_input);
   977.252 μs (2 allocations: 781.33 KiB)
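For reference, the const version of that setup would be (a sketch, reusing `make_my_interp` exactly as posted above):

```julia
function make_my_interp(ms, lb, ub, ys)
    xs = lb:ms:ub
    function fxn(x)
        l_ind = 1 + floor(Int, (x - lb) / ms)
        return ys[l_ind] + (ys[l_ind + 1] - ys[l_ind]) * (x - xs[l_ind]) / ms
    end
    return x -> fxn(x)
end

# const globals have a fixed type, so functions using them can be fully specialized
const lb    = -10.0
const ub    = 10.0
const ms    = 0.1
const yvals = [xx^2 for xx in lb:ms:ub]
const my_interp = make_my_interp(ms, lb, ub, yvals)
```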

Just a small comment: the statement `return x -> fxn(x)` is equivalent to just `return fxn`.

@baggepinnen, thanks!

I was (perhaps incorrectly) trying to create a closure so that I could produce many different interpolation functions. Is there a way to do that given your approach of declaring everything const? Thanks!

If you define all the constants inside the make_my_interp function, but outside the closure, that would be just as fast as having them as global consts.

@baggepinnen, thanks for your response. That is a bit surprising to me. I thought const declarations inside functions aren't supposed to affect performance. Moreover, I thought the point of a closure was that the current values of ms, lb, ub, and yvals get hard-coded into fxn and cannot be modified later…

No, if you define the variables inside the function, you do not have to declare them const. const is only needed when they are global, since Julia cannot otherwise know whether they will be modified. A closure does remember the variables it closes over, but that does not change the fact that those variables here are globals. The closure can still modify them and will see changes made to them from elsewhere, at least when they are reference types.
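A minimal illustration of that last point (my own toy example, not from the benchmark code): a function "closing over" a global really just reads the global binding at call time, so it sees later modifications.

```julia
# A "closure" over a global is just a reference to the global binding
ys = [1.0, 2.0, 3.0]
first_y() = ys[1]      # reads the global ys each time it is called

first_y()              # 1.0
ys[1] = 42.0           # mutate the array the global refers to
first_y()              # 42.0 — the change is visible through the closure
```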


@baggepinnen, thank you. I learn something new every day.

@baggepinnen, I must either be misunderstanding or doing something wrong. I tried this on v0.6.4 with the following make_my_interp2 function and still got poor performance.

function make_my_interp2(ms_in,lb_in,ub_in,ys_in)
	const ms=ms_in
	const lb=lb_in
	const ub=ub_in
	const ys=ys_in
    const xs=lb:ms:ub
    function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        return ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
    end
    return x -> fxn(x)
end

Was I misunderstanding what you meant by declaring the constants inside the function but outside the closure?

Ah, maybe I suggested adding them to the wrong function; try this:

using BenchmarkTools
using Interpolations

function make_my_interp(ms,lb,ub,ys)
    xs = lb:ms:ub
    function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        return ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
    end
    return x -> fxn(x)
end


function vector_interp(N)
	x         = zeros(N)
    lb        = -10.0
    ub        = 10.0
    ms        = .1
    yvals     = [xx^2 for xx in lb:ms:ub]
    my_interp = make_my_interp(ms,lb,ub,yvals)
    @inbounds for ii=1:N
        x[ii] = my_interp(3.05)
    end
    return x
end

lb        = -10.0
ub        = 10.0
ms        = .1
yvals     = [xx^2 for xx in lb:ms:ub]
pkg_interp = LinearInterpolation(lb:ms:ub,yvals)

N = 100_000
v_input = 3.05 .* ones(N)

@btime vector_interp($N)
  1.516 ms (4 allocations: 783.17 KiB)
@btime $(pkg_interp)($v_input)
  802.037 μs (2 allocations: 781.33 KiB)

Now, the benchmark for vector_interp includes setting up yvals, so it’s a bit slower.

Add @inline and @inbounds as follows:

function make_my_interp(ms,lb,ub,ys)
    xs = lb:ms:ub
    @inline function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        return @inbounds ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
    end
    return fxn
end

and get
@btime vector_interp($N) = 375.324 μs (4 allocations: 783.13 KiB)

Also add @fastmath:

function make_my_interp(ms,lb,ub,ys)
    xs = lb:ms:ub
    @inline function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        return @inbounds @fastmath ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
    end
    return fxn
end

and get
@btime vector_interp($N) = 92.623 μs (4 allocations: 783.13 KiB)

Note that some of these modifications might trigger SIMD instructions, which I believe requires starting Julia with julia -O3.

What's wrong with Interpolations and ForwardDiff? If the package accepts arbitrary scalar types, it should work fine. Otherwise, I'm sure a bug report to Interpolations.jl would be appreciated.

That’s already there: #213

@baggepinnen, thanks for the suggestions! I was not able to get your level of performance improvement with vector_interp, but I was able to get some performance improvement (to a bit more than 2x faster than pkg_interp) by using:

function make_my_interp3(ms_in,lb_in,ub_in,ys_in)
	const ms=ms_in
	const lb=lb_in
	const ub=ub_in
	const ys=ys_in
    const xs=lb:ms:ub
    function fxn(x)
        l_ind = 1+floor(Int,(x-lb)/ms)
        @inbounds @fastmath out= ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
		return out
    end
    return fxn
end

function vector_interp(f,N)
	x=zeros(N)
	@inbounds for ii=1:N
		x[ii] = f(3.05)
	end
	return x
end

my_interp = make_my_interp3(ms,lb,ub,yvals)
@btime vector_interp($my_interp,$N)

I get a mean performance of 1.122 ms from this version, compared to a mean performance of 2.798 ms from pkg_interp. (I don't know why my processor is slower now than yesterday.) Thus, my version now takes about 40% of the time of the package version, which is great, but not nearly the 9x speedup you managed to get.

What Julia version are you using? One thing I noticed is that you were able to use

        return @inbounds @fastmath ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms

When I do that, I get a returned Void and therefore need to use instead:

@inbounds @fastmath out= ys[l_ind] + (ys[l_ind+1] - ys[l_ind])*(x-xs[l_ind])/ms
		return out

This makes me suspect that we are using different Julia versions, since that syntax doesn't behave the same on mine. If you're using v0.7.0 and getting these huge performance gains, maybe it's time for me to switch :-).

@antoine-levitt, as @shorty66 pointed out, the incompatibility with ForwardDiff is a long-known issue, and I don't know of any plans to modify Interpolations to allow the use of ForwardDiff. One solution that comes to mind: the gradients of interpolated functions are well defined and easy to compute (except at the nodes). With a bit of familiarity with dual-number math, one could probably overload the interpolation function so that it performs the correct dual operations given an input of type ::ForwardDiff.Dual or ::Vector{<:ForwardDiff.Dual}. This would require some careful thought, but I suspect it wouldn't require any actual structural changes to Interpolations.jl. Moreover, it could actually perform better than running true forward-mode differentiation through the interpolation function if many of the parameters are known to have zero gradient for the interpolated function.
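To sketch the idea (my own toy code, not Interpolations.jl's API — dualsafe_interp and the primal helper are made-up names): a linear grid interpolation accepts duals almost for free if the only non-generic step, computing the grid index, is done on the primal value; the arithmetic itself is already overloaded for Dual.

```julia
using ForwardDiff

primal(x) = x                                   # plain numbers pass through
primal(x::ForwardDiff.Dual) = ForwardDiff.value(x)  # strip the dual part for indexing

function dualsafe_interp(ms, lb, ys)
    function fxn(x)
        # index from the primal part only — floor of a Dual is not needed
        l_ind = 1 + floor(Int, (primal(x) - lb) / ms)
        xl = lb + (l_ind - 1) * ms              # left grid node
        # +, -, *, / propagate derivatives through Dual automatically
        return ys[l_ind] + (ys[l_ind + 1] - ys[l_ind]) * (x - xl) / ms
    end
    return fxn
end

f = dualsafe_interp(0.1, -10.0, [xx^2 for xx in -10.0:0.1:10.0])
f(3.05)                          # interpolated value of x^2 at 3.05
ForwardDiff.derivative(f, 3.05)  # slope of the active grid segment
```

As noted above, the derivative is exact for the piecewise-linear interpolant everywhere except at the nodes themselves.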

I’m using julia v0.7, might be worth trying :wink:

const inside a function still does not change the performance at all.
