[ForwardDiff] How to treat an expression as a constant

See the following code. Inside the function f, g(x) is called, and I need the result to be used both as a variable and as a constant.
Of course I could calculate x20 = g(x) outside and pass it as an extra argument to f, but then g(x) is calculated twice. In my real problem, g(x) is an expensive function and I would like to avoid calling it twice. How can I achieve this?

using ForwardDiff

g = x->x^2

function f(x, y0)
    x2 = g(x)
    x20 = x2 # how to treat this as a constant?

    return (x2 - x20*y0)^2
end

ForwardDiff.derivative(x->f(x, 2), 1)

ForwardDiff will make x2 a Dual, which has the fields value and partials. So simply accessing the field should work, e.g. x20 = x2.value. This relies on internals that might change (though I doubt they will change soon), and it also means f can no longer be called with a normal value.

Looking at the source, there is a function ForwardDiff.value(d) that returns a normal value unchanged and returns the value field of a Dual. It is also undocumented, so it may be just as “unreliable” as accessing the value field directly, but it lets your function f be called both normally and through ForwardDiff.

Can’t seem to find a documented way of doing this, maybe ForwardDiff.value should be documented as API?
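For reference, a minimal sketch of the scalar case using ForwardDiff.value, assuming it behaves as described above (pass-through for plain numbers, value field for a Dual). The evaluation point 1.0 and the variable name d are just for illustration.

```julia
using ForwardDiff

g(x) = x^2

function f(x, y0)
    x2 = g(x)                      # a Dual when called through ForwardDiff
    x20 = ForwardDiff.value(x2)    # same number, but without the partials
    return (x2 - x20 * y0)^2
end

# With x20 held constant, d/dx (x^2 - x20*y0)^2 = 2*(x^2 - x20*y0)*2x,
# which is -4 at x = 1, y0 = 2.
d = ForwardDiff.derivative(x -> f(x, 2), 1.0)

# f can still be called with a plain number: (1 - 2)^2 = 1
plain = f(1.0, 2)
```

Without the value call, f reduces to x^4 for y0 = 2 and the derivative at 1 would be 4 instead of -4.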

I have run into a small problem when x is an array. Using ForwardDiff.value.(x2) gives the correct result in this case, but I’m not sure whether this is the right solution.
I would also expect ForwardDiff.value(array) to raise an error instead of silently giving an unexpected result.

using ForwardDiff

g = x->x^2

function f(x, y0)
    x2 = g.(x)
    # x20 = ForwardDiff.value(x2) # this does not work
    x20 = ForwardDiff.value.(x2) # this works

    return sum((x2 - x20*y0).^2)
end

ForwardDiff.derivative(x->f(x, 2), 1.f0) # returns -4.f0
ForwardDiff.gradient(x->f(x, 2), [1.f0]) # returns array [-4.f0]

Yeah, it seems the implementation only handles Dual and not Vector{Dual}, so a vector hits the generic fallback and is returned unchanged, partials included.

@inline value(x) = x               # Vector{Dual} falls back to this one
@inline value(d::Dual) = d.value
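A small check of that fallback behaviour, as a sketch. It assumes the two methods quoted above are the only ones that apply to a Vector of Duals, and it builds the Duals by hand with the (non-exported) Dual constructor purely for illustration.

```julia
using ForwardDiff
using ForwardDiff: Dual

# Two hand-built duals: a value followed by a single partial
d = [Dual(1.0, 1.0), Dual(2.0, 0.0)]

same_array = ForwardDiff.value(d) === d   # generic fallback: the array itself comes back
stripped = ForwardDiff.value.(d)          # broadcasting hits the Dual method per element
```

Here same_array is true and stripped is a plain Vector{Float64}, which is why only the broadcast version treats x2 as a constant.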

Maybe this is not the way to go then since it does not seem like the internal function value was intended for this. Though you could easily just create your own function, based on that implementation, and extend it to handle vectors.

Haven’t tested it, but something like this could probably work.

using ForwardDiff: Dual   # Dual is not exported, so bring it into scope

@inline value(x) = x
@inline value(d::Dual) = d.value
@inline value(d::Vector{<:Dual}) = value.(d)

Here is the updated MWE, based on your suggestion. Works as expected.

[Re-edited code]:

using ForwardDiff

value(x) = ForwardDiff.value(x) # scalar Dual or plain value; needed for the derivative call below
@inline value(d::Vector{<:ForwardDiff.Dual}) = ForwardDiff.value.(d)

g = x->x^2

function f(x, y0)
    x2 = g.(x)
    x20 = value(x2) # treat this as a constant

    return sum((x2 - x20*y0).^2)
end

ForwardDiff.derivative(x->f(x, 2), 1.f0)
ForwardDiff.gradient(x->f(x, 2), [1.f0, 2.f0])

Yes, that should work.

There is an argument for creating your own function instead of extending the one in ForwardDiff with a new method. IIUC the latter would be considered type piracy, since you extend a function you don’t own with types you don’t own.
If you are only writing local scripts for yourself this doesn’t really matter, but if you want to make a package and share the code it is usually frowned upon, since it can change other people’s code in unexpected ways.
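To make the distinction concrete, here is a sketch of the non-pirating variant. The helper name stripdual is made up for this example; it is a function you own that only calls ForwardDiff.value internally, so no method is added to ForwardDiff itself.

```julia
using ForwardDiff

# Type piracy would be adding a method to a function you do not own
# for types you do not own, e.g.
#   ForwardDiff.value(d::Vector{<:ForwardDiff.Dual}) = ForwardDiff.value.(d)

# Own helper instead (hypothetical name):
stripdual(x) = ForwardDiff.value(x)            # Dual -> value, plain number -> itself
stripdual(x::AbstractArray) = stripdual.(x)    # arrays: elementwise

g(x) = x^2

function f(x, y0)
    x2 = g.(x)
    x20 = stripdual(x2)    # same numbers as x2, no derivative information
    return sum((x2 .- x20 .* y0) .^ 2)
end

d1 = ForwardDiff.derivative(x -> f(x, 2), 1.0f0)
grad = ForwardDiff.gradient(x -> f(x, 2), [1.0f0, 2.0f0])
```

With x20 held constant and y0 = 2, each component of the derivative is -4x^3, so d1 should be -4 and grad should be [-4, -32].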
