I am working with finite element method code. In this context, it is (very!!!) useful to apply forward automatic differentiation to the function that computes, from the degrees of freedom, the element's contribution to the residual. Anyway: a function of the form
r = foo(x)
where x, r, and everything in between are static arrays.
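To make the setting concrete, here is a toy stand-in (foo, x and the values below are made up for illustration; the real element routine is of course much larger, but has this kind of signature):

using StaticArrays, ForwardDiff

# Toy "element residual": 3 dofs in, 3 residual components out, everything static.
foo(x::SVector{3}) = SVector(x[1]^2 + x[2], x[2]*x[3], x[3] - x[1])

x = SVector(1.0, 2.0, 3.0)
J = ForwardDiff.jacobian(foo, x)   # 3×3 Jacobian of the toy residual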
I started early: ForwardDiff.jl did not exist yet, so I have my own homebrew package that does essentially the same thing - I'll refer to it as Adiff.
At the start of a differentiation process, the partials of a dual number are populated with a vector (an SVector in Adiff, a Tuple in ForwardDiff) of zeros and a single one, and some of those zeros can be propagated quite far into the computations, depending on the structure of foo.
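To illustrate the seeding, here is a hand-rolled version with made-up values (just to show the partials, not how either package seeds them internally):

using ForwardDiff

# Seed three dofs: each dual carries partials that are all zero except a single one.
x1 = ForwardDiff.Dual(1.0, 1.0, 0.0, 0.0)
x2 = ForwardDiff.Dual(2.0, 0.0, 1.0, 0.0)
x3 = ForwardDiff.Dual(3.0, 0.0, 0.0, 1.0)

ForwardDiff.partials(x1 * x2)   # (2.0, 1.0, 0.0): the zero w.r.t. x3 survives the product,
                                # but it was computed by multiplying and adding actual zeros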
Things only get worse when differentiating to higher orders. I don't know about you, but I feel that adding zeros and multiplying by ones is a waste of time. So I messed around a little:
struct Nix end # structurally zero
const nix = Nix()
@inline Base.:*(::Nix, ::Number) = nix
@inline Base.:*(::Number, ::Nix) = nix
@inline Base.:*(::Nix, ::Nix)    = nix
@inline Base.:+(::Nix, b::Number) = b
@inline Base.:+(a::Number, ::Nix) = a
@inline Base.:+(::Nix, ::Nix)     = nix
# ... you get the gist
Base.show(io::IO, ::Nix) = print(io, "⋅")
# make a few "sparse static" vectors
a = (nix, nix, 2, nix, nix, 1, nix, nix, nix, nix, nix, nix)
b = (1, nix, nix, 2, nix, 3, nix, 4, nix, 5, nix, 6)
e = sum(a .* b)   # == 3: only the products of two stored numbers compute anything
# ...
Both @code_native and BenchmarkTools suggest that this technique reduces the number of operations and the computing time, and that there is no penalty when full vectors are involved in the computations.
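For reference, the comparison was roughly of this kind (continuing from the snippet above; mydot, a0 and b0 are just names for this test, and the dense tuples hold the same values with literal zeros):

using BenchmarkTools

a0 = (0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0)   # dense counterparts of a and b
b0 = (1, 0, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6)

mydot(x, y) = sum(x .* y)

@btime mydot($a, $b)     # "sparse static" version with Nix entries
@btime mydot($a0, $b0)   # plain dense version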
The flip side is that the compiler must now produce specialised machine code for every combination of sparsity patterns that appears in the operations, which will obviously require longer compilation times.
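To spell out what "every combination" means: each sparsity pattern is its own tuple type, so every pattern (or pair of patterns) that actually occurs triggers its own compilation:

typeof((nix, 1, nix))   # Tuple{Nix, Int64, Nix}
typeof((1, nix, nix))   # Tuple{Int64, Nix, Nix}
# sum(x .* y) gets specialised once for each combination of such types it encounters.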
But my concern, and question, is this: faced with generating a large number of specialised instances, will the compiler end up throwing in the sponge and generating generic code, with much worse performance than just leaving well alone in Adiff and ForwardDiff? Of course the best way to find out is simply to try, but at least in Adiff that would be quite some work, so I am checking whether we can tell outright that this is not a good idea.