Efficient pushforward (Jacobian vector product) with multiple vectors using automatic differentiation


I have written some code that computes the pushforward of a time-stepping function with multiple vectors:

julia> function my_step_pushforward!(u, Ψ, Δt)
           rank = size(Ψ)[2] # Ψ is an n X rank matrix
           Ψ_new = similar(Ψ)
           perturb(r) = my_step!(u + Ψ*r, Δt)
           result = DiffResults.DiffResult(u, Ψ_new)
           result = ForwardDiff.jacobian!(result, perturb, zeros(rank))
           Ψ .= Ψ_new
my_step_pushforward! (generic function with 1 method)

where my_step! is some nonlinear time-stepping function for a system of ODEs. For some reason, there is a big degradation in performance when Ψ goes from having one column to two: (for zero columns I default to just using my_step!)

rank Ψ   simulation time
0       | 3.117320
1       | 3.389685
2       | 5.733118
3       | 8.761905

So the first column of Ψ only adds 0.3 seconds to the total run time, but the second and third columns each add around 3 seconds! I suspect that when the Jacobian is a column vector, it is only doing one function evaluation, but when the Jacobian has more than one column it is doing extra function evaluations to compute the other columns after the first one. From what I understand about forward mode differentiation, this should be possible with only one function evaluation, right?

I have two questions: Am I right in assuming that extra function evaluations are at play here? And, is there a better way to compute the JVP with multiple vectors, using only a single function evaluation?

I suspect that when the Jacobian is a column vector, it is only doing one function evaluation, but when the Jacobian has more than one column it is doing extra function evaluations to compute the other columns after the first one.

Your diagnosis is right, with forward mode AD you need as many function calls as there are columns in the Jacobian (ie the input dimension). Whereas with reverse mode AD you need as many function calls as there are rows in the Jacobian (ie the output dimension).

1 Like

So there’s no way around this then? I thought this was why the ForwardDiff.Dual can track multiple partials? The function being differentiated accepts a vector of Duals, each with multiple partials; I assumed that meant the vector dimension was the rows of the Jacobian and the partials dimension was the columns…

1 Like

From what I understand, ForwardDiff partially mitigates the issue with a kind of “batch evalution” of all directional derivatives. So you’re right, that’s what the multiple partials in the type allow. But in the end you still have to make n calls to the function, even if they are somehow fused or parallelized (I’m unsure about the internals)
EDIT: Steven explained it better :wink:

1 Like

It’s still n times as expensive for n inputs — technically, it’s only doing one function call, but the function is much more expensive because it’s doing arithmetic on “numbers” with n+1 real components.


Sure, but why is the first column so cheap compared to subsequent columns in my implementation?

That is really hard to say without a full reproducible example, could you provide one?

Yes, I’ll get to work on one. I really want to figure this out, it has been bugging me for months lol.
Edit: I may have to give up on this. I can’t seem to reproduce the behavior on a smaller code, and I’ve been working on this all day now!

1 Like

By the way this has nothing to do with your problem and it’s mostly nitpicking, but Julia syntax conventions would have you start with the arguments being mutated for functions with a ! (so in this case \Psi)

Is your function conditional free? Can you tolerate a one time analysis step to generate a fast jacobian vector executable? If so then you may be able to use FastSymbolicDifferentiation.jl. I just added the Jacobian vector product function today.

It’s still in beta so not something to use in production software. I’m looking for new functions to test and benchmark the software on. If you want to try it out I’ll help you get going.

1 Like

If so then you may be able to use FastSymbolicDifferentiation.jl.

This is interesting… My function involves interpolation with moving interpolation nodes, so I’m not sure if that is conditional free. I have tried tracing the function with Symbolics.jl, but it fails at the interpolation. I’ll give this a try and let you know how it works!

If something breaks or the documentation isn’t clear let me know and I’ll fix it.

1 Like

It looks like tracing the function fails due to Interpolations.jl. Not really sure what I can do about that, since if I wrote my own interpolation function it would surely require conditionals. The second part of the function not shown here involves FFTs, which is likely even harder to trace…

using FastSymbolicDifferentiation, Symbolics
using ForwardDiff, DiffResults, PreallocationTools, Interpolations
using BenchmarkTools

struct SimApp
    # variables

    # coordinates
SimApp(nx, Δt) = SimApp(nx, Δt, 2π / nx, range(0., 2π*(1-1/nx), nx))

function step(y, app::SimApp)
    # initialize interpolation
    itp = interpolate(y, BSpline(Linear(Periodic())))
    sitp = scale(itp, app.x)
    extp = extrapolate(sitp, Periodic())

    # interpolate solution at departure points
    x_back = app.x .- app.Δt * y
    y_intp = extp.(x_back)

    return y_intp
julia> @variables u[1:64];

julia> nu = [Node(u_i) for u_i ∈ u];

julia> app = SimApp(64, 0.02);

julia> fs = step!(nu, app)
TypeError: non-boolean (SymbolicUtils.BasicSymbolic{Bool}) used in boolean context

 [1] is_zero
   @ ~/Projects/FastSymbolicDifferentiation/FastSymbolicDifferentiation.jl/src/ExpressionGraph.jl:113 [inlined]
 [2] simplify_check_cache(#unused#::typeof(*), na::Float64, nb::Node{SymbolicUtils.BasicSymbolic{Real}, 0}, cache::IdDict{Any, Any})
   @ Main.FastSymbolicDifferentiation ~/Projects/FastSymbolicDifferentiation/FastSymbolicDifferentiation.jl/src/ExpressionGraph.jl:161
 [3] *(a::Float64, b::Node{SymbolicUtils.BasicSymbolic{Real}, 0})
   @ Main.FastSymbolicDifferentiation ~/.julia/packages/SymbolicUtils/H684H/src/methods.jl:54
 [4] Interpolations.BSplineInterpolation(#unused#::Type{Float64}, A::OffsetArrays.OffsetVector{Node{SymbolicUtils.BasicSymbolic{Real}, 0}, Vector{Node{SymbolicUtils.BasicSymbolic{Real}, 0}}}, it::BSpline{Linear{Periodic{OnCell}}}, axs::Tuple{OffsetArrays.IdOffsetRange{Int64, Base.OneTo{Int64}}})
   @ Interpolations ~/.julia/packages/Interpolations/nDwIa/src/b-splines/b-splines.jl:104
 [5] interpolate
   @ ~/.julia/packages/Interpolations/nDwIa/src/b-splines/b-splines.jl:166 [inlined]
 [6] interpolate
   @ ~/.julia/packages/Interpolations/nDwIa/src/b-splines/b-splines.jl:189 [inlined]
 [7] step!(y::OffsetArrays.OffsetVector{Node{SymbolicUtils.BasicSymbolic{Real}, 0}, Vector{Node{SymbolicUtils.BasicSymbolic{Real}, 0}}}, app::SimApp)
   @ Main ~/Projects/FastSymbolicDifferentiation/trace_kflow_step.ipynb:22
 [8] top-level scope
   @ ~/Projects/FastSymbolicDifferentiation/trace_kflow_step.ipynb:4

This happens if I try to trace with just the symbols from Symbolics.jl:

julia> fs = step(collect(u), app)
MethodError: no method matching unsafe_trunc(::Type{Int64}, ::Num)

Closest candidates are:
  unsafe_trunc(::Type{T}, !Matched::Integer) where T<:Integer
   @ Base int.jl:604
  unsafe_trunc(::Type{T}, !Matched::BigFloat) where T<:Integer
   @ Base mpfr.jl:321
  unsafe_trunc(::Type{Int64}, !Matched::Union{Float16, Float32, Float64})
   @ Base float.jl:335

  [1] fast_trunc(#unused#::Type{Int64}, x::Num)
    @ Interpolations ~/.julia/packages/Interpolations/nDwIa/src/utils.jl:40
  [2] positions(deg::Linear{Periodic{OnCell}}, ax::Base.OneTo{Int64}, x::Num)
    @ Interpolations ~/.julia/packages/Interpolations/nDwIa/src/b-splines/linear.jl:47
  [3] weightedindex_parts(fs::Tuple{typeof(Interpolations.value_weights)}, deg::Linear{Periodic{OnCell}}, ax::Base.OneTo{Int64}, x::Num)
    @ Interpolations ~/.julia/packages/Interpolations/nDwIa/src/b-splines/indexing.jl:149
  [4] weightedindex_parts
    @ ~/.julia/packages/Interpolations/nDwIa/src/b-splines/indexing.jl:145 [inlined]
  [5] map3argf
    @ ~/.julia/packages/Interpolations/nDwIa/src/b-splines/indexing.jl:70 [inlined]
  [6] weightedindexes
    @ ~/.julia/packages/Interpolations/nDwIa/src/b-splines/indexing.jl:66 [inlined]
  [7] BSplineInterpolation

I’ll hack on it tomorrow. My guess is there is a conditional in interpolation somewhere so this probably isn’t something FastSymbolicDifferentiation can handle, yet.