Hi all,
I am currently struggling with three (competing?) goals in code design. I want code that (a) is fast, (b) is differentiable and (c) is differentiable by current (and future) AD libraries (at least ForwardDiff.jl and ReverseDiff.jl). Last but not least, easy maintainability (which is closely connected to readable code) would be nice to have.
I am running into some issues when thinking about how to write good code with those goals in mind. I am just sharing my thoughts here, and I am happy about corrections, comments or updates on them.
(1) Using Buffers
One of the core design patterns for making code fast (independent of Julia) is using buffers that are allocated once and reused, e.g. during computations inside a loop. Often, this doesn’t add much complexity to the code. In Julia, I want statically typed buffers (for performance), so I need to preallocate with a given element type (like Float64); in an AD run, however, I need the corresponding AD primitives (e.g. ForwardDiff.Dual or ReverseDiff.TrackedReal). What is the “nice” way of doing this?
(1a) I know PreallocationTools.jl, but as I understand it, it is for ForwardDiff only. However, it now has ReverseDiff as an optional dependency… is ReverseDiff support planned, maybe?
EDIT: ReverseDiff.jl works with PreallocationTools.jl!
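To make the buffer question in (1a) concrete, here is a minimal sketch of the PreallocationTools.jl pattern with ForwardDiff. The function name `sum_of_squares!` and the input values are made up for illustration; `DiffCache` and `get_tmp` are the calls I understand the package to provide, and the exact behavior may differ between package versions.

```julia
using ForwardDiff
using PreallocationTools

# Hypothetical example function: sums the squares of `x`
# using a preallocated scratch buffer instead of a fresh array.
function sum_of_squares!(cache, x)
    # get_tmp returns a buffer whose element type matches `x`:
    # Float64 for a plain call, ForwardDiff.Dual during an AD run.
    tmp = get_tmp(cache, x)
    @. tmp = x^2
    return sum(tmp)
end

x = [1.0, 2.0, 3.0]
cache = DiffCache(zeros(length(x)))  # allocated once, reused for every call

value = sum_of_squares!(cache, x)
grad = ForwardDiff.gradient(z -> sum_of_squares!(cache, z), x)
```

The same `cache` object serves both the plain Float64 call and the ForwardDiff call, so the function body itself stays free of AD-specific code.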
(1b) Zygote.jl seems complicated with buffers, because it does not support mutating array operations. However, there is the new Zygote.Buffer, but this feels like additional code is necessary that is written specifically for Zygote.
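For illustration, a minimal sketch of what the Zygote.Buffer pattern from (1b) looks like. The function `squares_via_buffer` is a made-up example, not code from any library; it only shows the write-then-`copy` shape that Zygote needs.

```julia
using Zygote

# Hypothetical example function: fills a buffer element by element.
# A plain `similar(x)` with in-place writes would fail under Zygote,
# so the writable array is replaced by a Zygote.Buffer.
function squares_via_buffer(x)
    buf = Zygote.Buffer(x)        # write-only stand-in for `similar(x)`
    for i in eachindex(x)
        buf[i] = x[i]^2
    end
    return sum(copy(buf))         # `copy` turns the buffer back into an array
end

grad = Zygote.gradient(squares_via_buffer, [1.0, 2.0, 3.0])[1]
```

This is the extra, Zygote-only code mentioned above: the buffer type and the final `copy` have no counterpart in a ForwardDiff or ReverseDiff version of the same function.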
(2) Common AD-interface: ChainRules.jl
I really like the idea of ChainRules.jl and that providing one forward and one backward rule (frule and rrule) is mathematically enough to build an interface to further AD “backends”. Adding custom rules is not just for non-differentiable foreign calls, but is also necessary for performance (e.g. AD shortcuts over iterative procedures). However, I see multiple (big!) libraries that don’t use ChainRules.jl and instead implement dedicated dispatches for AD primitives. Is this because ChainRules adds overhead compared to a pure AD dispatch (like for ForwardDiff.Dual)?
Finally, it might be helpful to discuss this with an example:
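As a small illustration of such an "AD shortcut over an iterative procedure", here is a sketch using ChainRulesCore.jl. The function `newton_sqrt` and the constant `NEWTON_ITERS` are hypothetical names chosen for this example. The rules replace differentiation through the loop by the closed-form derivative 1/(2*sqrt(a)), and they are called directly here rather than through a specific AD backend.

```julia
using ChainRulesCore

const NEWTON_ITERS = 20  # hypothetical fixed iteration count

# Hypothetical iterative procedure: Newton's method for sqrt(a), a > 0.
function newton_sqrt(a::Real)
    x = float(a)
    for _ in 1:NEWTON_ITERS
        x = (x + a / x) / 2
    end
    return x
end

# Forward rule: propagate the input tangent `da` without
# differentiating through the loop.
function ChainRulesCore.frule((_, da), ::typeof(newton_sqrt), a::Real)
    y = newton_sqrt(a)
    return y, da / (2 * y)
end

# Reverse rule: return the primal and a pullback for the cotangent `dy`.
function ChainRulesCore.rrule(::typeof(newton_sqrt), a::Real)
    y = newton_sqrt(a)
    newton_sqrt_pullback(dy) = (NoTangent(), dy / (2 * y))
    return y, newton_sqrt_pullback
end

y_fwd, dy_fwd = frule((NoTangent(), 1.0), newton_sqrt, 4.0)
y_rev, pullback = rrule(newton_sqrt, 4.0)
da_rev = pullback(1.0)[2]
```

Whether a given backend picks these rules up automatically depends on that backend; Zygote uses rrule definitions, while others may need additional glue code.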
# a struct that "lives" for some time during the application
# and stores values and buffers that are reused (it's mutable)
mutable struct LongLifeStruct
    # the type of `a` will change during the application
    # `AbstractArray{<:Real}` is bad, but works with AD, because AD-primitives are
    # subtypes of Real. I assume PreallocationTools is the way to go here?
    a::AbstractArray{<:Real}
end

function doSomething(str::LongLifeStruct)
    str.a[:] = ... # whatever
end

# a struct that is allocated for a special calculation and
# freed afterwards (immutable)
struct ShortLifeStruct{T}
    # the type of `a` will not change during its "lifespan"
    # therefore we can allocate it "typed"
    # is this correct?
    a::AbstractArray{T}
end
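Regarding the "is this correct?" comment above: `AbstractArray{T}` is still an abstract field type, because the array type itself and its dimension are left open. A minimal sketch of a fully parameterized variant follows; `TypedShortLifeStruct` and `typed_short_life` are hypothetical names introduced only for this comparison, and this is one possible option rather than a definitive answer.

```julia
# Original definition from above, repeated so the snippet is self-contained.
struct ShortLifeStruct{T}
    a::AbstractArray{T}
end

# Hypothetical variant: the concrete array type is a type parameter,
# so the field type is concrete for every instance.
struct TypedShortLifeStruct{T, A<:AbstractArray{T}}
    a::A
end

# Hypothetical helper that fills in both type parameters from the argument.
typed_short_life(a::AbstractArray{T}) where {T} =
    TypedShortLifeStruct{T, typeof(a)}(a)

s_abstract = ShortLifeStruct{Float64}(zeros(3))
s_concrete = typed_short_life(zeros(3))

# Check whether the compiler knows the exact type of field `a`.
abstract_field_is_concrete = isconcretetype(fieldtype(typeof(s_abstract), :a))
concrete_field_is_concrete = isconcretetype(fieldtype(typeof(s_concrete), :a))
```

Since `T` is taken from the array that is passed in, constructing the struct from an array of AD primitives (e.g. ForwardDiff.Dual elements) would give a differently parameterized but still concretely typed instance.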
What would be (special cases neglected) the “correct” (=fast) way of implementing AD-support (ForwardDiff, ReverseDiff, Zygote, …) for the little code example above?
Thank you all!