# State of reverse mode AD tools

(Opening a new topic instead of resurrecting some older ones.)

I am wondering about the current state of libraries for reverse mode AD, specifically for differentiating $\mathbb{R}^n \to \mathbb{R}$ functions that

1. accept an `AbstractVector` and return something `Real`,
2. may contain branches (in my experience, everything nontrivial does),
3. have an n too large for a single `SVector` (think 1–6k), though the input is broken up into smaller pieces (groups of parameters, hyperparameters, etc.) that may contain `SVector`s.

While I follow developments with interest, at the moment I am mainly interested in libraries that either work or can be made to work with reasonable effort (e.g. reporting an issue and getting an answer within a few days, with a solution or hints on how to make a PR).

My experience is the following:

1. ForwardDiff is of course forward mode, but super-robust and a good fallback. Surprisingly, it is quite competitive for large n, with some tricks (chunk size, etc, detailed in the manual).
2. ReverseDiff is reliable, but pre-compiling the tape quickly gets one in trouble unless the code path is 100%-and-I-mean-it deterministic. Recreating a tape each time slows it down.
3. Flux is reliable, but a bit more picky about code being really generic.
4. Nabla requires that code work with types that are not necessarily `<: AbstractVector`, which makes it difficult to use.
5. Zygote is fast when it works, but I found that breaking issues are not fixed for months, so in practice it is not usable.
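To make point 1 concrete: the chunk-size tuning mentioned in the manual amounts to fixing the number of partials carried per forward pass and preallocating the work buffers once. This is only a sketch; the objective `f` and the chunk size 8 are arbitrary illustrations, not recommendations.

```julia
using ForwardDiff

# A toy branchy R^n -> R objective (as in point 2, nontrivial code branches).
f(x) = sum(abs2, x) > 1 ? sum(abs2, x) : 2 * sum(abs2, x)

x = randn(1000)

# Default call: ForwardDiff picks a chunk size heuristically.
g_default = ForwardDiff.gradient(f, x)

# Tuned call: fix the chunk size and preallocate buffers in a config,
# then reuse the config across many gradient evaluations.
cfg = ForwardDiff.GradientConfig(f, x, ForwardDiff.Chunk{8}())
g_tuned = ForwardDiff.gradient(f, x, cfg)
```

Reusing one `GradientConfig` across calls is where most of the benefit comes from when the gradient is evaluated many times at the same size n.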

I have not tried Yota, or the other libraries. I am curious to hear stories from AD users — maybe I missed something obvious.


Can you provide a couple of examples of branching you meet in practice? I have a couple of ideas for how to add support for dynamic graphs to Yota (albeit with lower performance), but without real use cases they might be a waste of time.

My experience is that it is very difficult not to run into (insidious, well-hidden) branches in any sufficiently complex, nonlinear calculation of otherwise (mathematically) continuous functions.

A simple example would be anything calling `StatsFuns.log1pexp`.
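To show why this function is a problem for taping, here is a hedged sketch of the piecewise implementation behind `log1pexp`; the thresholds below are illustrative, not StatsFuns' exact values. The function is mathematically smooth, but the accurate implementation switches formulas depending on x:

```julia
# Sketch of the branching inside an accurate log(1 + exp(x)):
function log1pexp_sketch(x::Real)
    if x < 18          # small x: direct formula is fine
        return log1p(exp(x))
    elseif x < 33      # medium x: rearranged for accuracy
        return x + exp(-x)
    else               # large x: log(1 + exp(x)) == x to machine precision
        return x
    end
end
```

A tape recorded at, say, x = 1.0 bakes in the first branch; replaying that tape at x = 800.0 evaluates `log1p(exp(800.0))`, where `exp` overflows and the replay returns `Inf` instead of 800.0.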

Even if branches are not handled, it would be great if the user got an error when a path different from the taped one is taken. Perhaps this could be done by turning branches into assertions that throw a `BranchOutsideTapeError`, which the user could catch before recompiling the tape.

Of course, the other workaround is to write AD primitives for all of these functions. But they are so easy to miss in practice that the error above would be better than silently getting an incorrect result.

My goal is to have automatic differentiation support using multivariable dual numbers with Grassmann.jl

Once that’s implemented, things will get much more interesting for me…

Although I don’t work with `AbstractVector` but with `TensorAlgebra` instead, and instead of `Real` I have a separate basis for scalar values (from which you can extract the real values contained inside).


Can you point me to your issue / bump it on GitHub? We’re definitely still in a beta stage so I can’t make any promises, but I can often prioritise stuff that’s blocking people’s work.
