Difficulty when differentiating with DifferentiationInterface.jl and Mooncake.jl

Hello everyone. I am stuck on the following problem, for which I have created an MWE below. It concerns automatic differentiation using DifferentiationInterface.jl together with Mooncake.jl. From the error message I get, I am guessing that I am doing something wrong with the signatures of my functions, but I can’t seem to find the problem. This is not a problem with either DifferentiationInterface or Mooncake themselves.

Basically, in the MWE below, I have an objective function that depends on the model passed to it. The model implements a simple calculation. In my actual application, I will have multiple models that I want to pass to the objective function. Here I show two contrived models, Model1 and Model2.

  • The function outerfunction is the flexible implementation: the objective it computes depends on the model passed to it. Unfortunately, it doesn’t work as I expected, and this is my problem.
  • The function outerfunction_cos has the model “hard coded” into it (in this case Model2) and works as expected.
using DifferentiationInterface
import Mooncake

abstract type AbstractModel end

# contrived model definitions
struct Model1<:AbstractModel end
calculate(::Model1, x) = sin(x)

struct Model2<:AbstractModel end
calculate(::Model2, x) = cos(x)


# This function represents the intended functionality.
# It should be flexible as to the chosen model.
function outerfunction(y, model::AbstractModel)

    function mock_objective(x)
        
        sum(abs2.(y .- calculate.(model, x)))

    end

    backend = AutoMooncake(; config=nothing)

    x = randn(size(y))

    prep = prepare_gradient(mock_objective, backend, x)

    gradient(mock_objective, prep, backend, x)

end


# This function hard codes the model
function outerfunction_cos(y)

    function mock_objective(x)
        
        sum(abs2.(y .- cos.(x)))

    end

    backend = AutoMooncake(; config=nothing)

    x = randn(size(y))

    prep = prepare_gradient(mock_objective, backend, x)

    gradient(mock_objective, prep, backend, x)

end

This is how I call the above code:

# call above
y = randn(10)
outerfunction_cos(y)       # works as expected
outerfunction(y, Model1()) # fails
outerfunction(y, Model2()) # fails

I see the error message in my terminal, but I can’t properly interpret it.
In brief, it’s a MooncakeRuleCompilationError.

Could somebody please help with this? Many thanks in advance.

Never mind. I think I found my error. Sorry for taking people’s time. I will delete this post in a few minutes. Sorry!

Chances are (as you may have already discovered) you need something like calculate.(Ref(model), x), so that broadcasting treats model as a scalar and only iterates over x:

julia> model = Model1(); calculate.(model, rand(2))
ERROR: MethodError: no method matching length(::Model1)
The function `length` exists, but no method is defined for this combination of argument types.
...

julia> model = Model1(); calculate.(Ref(model), rand(2))
2-element Vector{Float64}:
 0.6703962401999668
 0.02555859497947358

julia> y = randn(10); outerfunction(y, Model1())  # after changing mock_objective to use calculate.(Ref(model), x)
10-element Vector{Float64}:
 -2.0454384710671905
  0.7536118906802644
  0.02433927361845278
 -0.6611984902692523
  2.4331966348683114
  0.32002204681925306
 -0.25881730669622205
 -0.2741795111891973
  0.08683262936815833
  0.7966843475386596
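
As a side note, Ref is not the only way to keep an argument out of the broadcast. The sketch below reuses the model definitions from the MWE and compares a few plain-Julia equivalents; the variable names r1 to r4 are just for illustration. Only the Ref form was run through Mooncake above, so treat the other variants as untested under AD.

```julia
abstract type AbstractModel end

struct Model1 <: AbstractModel end
calculate(::Model1, x) = sin(x)

struct Model2 <: AbstractModel end
calculate(::Model2, x) = cos(x)

x = [0.0, 0.5, 1.0]
model = Model2()

# Option 1: wrap the model in Ref so that broadcast treats it as a scalar.
r1 = calculate.(Ref(model), x)

# Option 2: a 1-tuple is a single-element collection, which broadcast
# expands along x in the same way.
r2 = calculate.((model,), x)

# Option 3: avoid broadcasting over the model altogether by closing over it.
r3 = map(xi -> calculate(model, xi), x)

# Option 4: declare once that every AbstractModel behaves as a scalar in
# broadcasts. After this, the original calculate.(model, x) works unchanged.
Base.broadcastable(m::AbstractModel) = Ref(m)
r4 = calculate.(model, x)

# All four variants compute the same values.
@assert r1 == r2 == r3 == r4 == cos.(x)
```

Option 4 is convenient when many call sites broadcast over the model, since it fixes them all in one place. The explicit Ref at the call site keeps the intent visible where it matters.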

In general I would always suggest posting your fix instead of deleting your post. You never know who will find it useful 🙂

Thank you very much for your answer @eldee. You are spot on: this was one problem with the original code. There is, however, another, deeper problem which my MWE doesn’t replicate. Maybe someone will find the above useful…
