ForwardDiff of a function depending on a mask vector

Hi all,

this is my first post here!

I’m still very new to Julia and I’m enjoying learning it a lot. I have a simple question for which I couldn’t find an answer; I hope it hasn’t already been posted somewhere else.

I’ll make a simple example that shows the problem (the actual code is much more complex, but this is the basic idea): I’m trying to differentiate the following function wrt x

function my_fun(x, xvals, ids)
    out = copy(xvals)
    out[ids] = x[ids]
    return out
end

Basically, the function returns a vector: some elements of that vector are constant (taken from xvals), while others (selected by the mask ids) depend on the input x.

If I have, for example, x = [0 0 0 0], xvals = [1 2 3 4] and ids = [1 2], the function seems to work correctly, but when I try to call ForwardDiff to differentiate it wrt x

zz = ForwardDiff.jacobian(x -> my_fun(x, xvals, ids), x)

I get MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Int64}, Int64, 4})

I can’t really understand what the problem is, since if I try with another function that only returns the selected components:

function my_fun(x, ids)
    out = x[ids]
    return out
end

the differentiation works without problems.

Can someone explain to me what’s wrong?

Thanks

Not at the computer right now, but I think the problem is that you create your array “out” as containing floats and then try to write dual numbers into it.

Try adding

out = convert.(eltype(x), out)

between the first two lines of your function.
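
From memory, the patched function would then look roughly like this (untested):

function my_fun(x, xvals, ids)
    out = copy(xvals)
    out = convert.(eltype(x), out)   # give out the same (possibly dual) element type as x
    out[ids] = x[ids]
    return out
end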

Hi vettert,

your reply helped me a lot. It was not quite exact, though: the type mismatch happens right when I assign the dual values x[ids] into the Float64 vector out, so I figured the conversion has to happen directly when the vector out is created.

For further reference, this function can be correctly differentiated wrt x:

function my_fun(x, xvals, ids)
    out = convert.(eltype(x), xvals)
    out[ids] = x[ids]
    return out
end

I find it a bit ugly, however. Does anyone have a suggestion for improving it?

Thanks

Welcome to the Julia forum, Lorenzo!

I think this is as pretty as it gets. The only way to improve on this is to look at the places where you use this function. Semantically, it seems a bit odd that you may want to copy x values into xvals if you cannot be sure that x and xvals have the same eltype.

Hi Lorenzo,
Great. It was a little late when I typed my post, but you managed to arrive at the right solution from it. Anyhow, I agree with ettersi that it looks a little odd to do this at all. Maybe context on why you are doing this would be illuminating?

Hi ettersi and vettert,

thanks to both of you.

As I was saying, this is a much simplified function that shows the problem I was getting.
What I’m actually doing is rewriting my old Matlab code in Julia; it solves optimal control problems using the direct finite elements transcription method.
In my old code, I had the possibility of specifying boundary conditions either as simple constant values (e.g. final velocity = 0) or as general nonlinear functions of the boundary states (e.g. xf[2] = xi[1]^3/xf[4]).
Since I don’t want redundant variables for the boundary states that are imposed (the first case), my solution vector does not include them, and I pass the constant imposed terms as part of a user-defined type that holds several other pieces of information (the transcription).
Thus, when I want to impose the general nonlinear boundary constraints, I need to reconstruct the boundary states: some of those components are constants and come from the user-defined type, while others are stored inside the solution vector.
To solve the problem I then need the Jacobian of those nonlinear boundary terms wrt the solution vector.

Hope this makes it a bit clearer.
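
In code, the pattern looks roughly like this (a made-up sketch with dummy names and values, not the actual transcription code):

using ForwardDiff

# Dummy setup: 4 boundary states; states 1 and 2 are imposed constants stored
# in the transcription type, states 3 and 4 live in the solution vector.
fixed    = [0.0, 1.0, 0.0, 0.0]   # imposed constant boundary values
free_ids = [3, 4]                 # components that come from the solution vector

function boundary_states(sol, fixed, free_ids)
    states = convert.(eltype(sol), fixed)   # same fix as in my_fun above
    states[free_ids] = sol[free_ids]        # (index mapping simplified for the sketch)
    return states
end

# A made-up nonlinear boundary constraint of the xf[2] = xi[1]^3/xf[4] kind
g(s) = [s[2] - s[1]^3 / s[4]]

sol = [0.0, 0.0, 2.0, 3.0]
J = ForwardDiff.jacobian(s -> g(boundary_states(s, fixed, free_ids)), sol)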

If I understand you correctly, your problem is somewhat like the following:

# Dummy input
c = 1
f = x->x^2
n = 5
idx = rand(1:5,3)

# Assemble the boundary condition
b = fill(c,n)
b[idx] .= f.(idx)

If this is approximately representative, then I would recommend to allocate b with the correct eltype right from the start. Perhaps the best way to do this would be the following:

T = promote_type(typeof(c), typeof(f(idx[1])))
b = fill(convert(T,c),n)

If @ettersi’s example is correct, you may also want to think about doing this without mutating b, i.e. simply building it correctly the first time. Something like b = T[i ∈ idx ? f(i) : c for i in 1:n] – unfortunately without the T you get different promotion behaviour, which will tend to produce an abstract element type. (E.g. with f = sqrt the dummy input will make a Vector{Real}, which will be slow.) If f is cheap, you could also do something like b = [ifelse(i ∈ idx, promote(f(i), c)...) for i in 1:n].
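
To see the difference concretely, a quick sketch with the dummy input from above and f = sqrt:

c   = 1
f   = sqrt
n   = 5
idx = [1, 3, 4]

T  = promote_type(typeof(c), typeof(f(idx[1])))
b1 = T[i in idx ? f(i) : c for i in 1:n]                    # Vector{Float64}
b2 = [i in idx ? f(i) : c for i in 1:n]                     # Vector{Real}: abstract eltype, slow
b3 = [ifelse(i in idx, promote(f(i), c)...) for i in 1:n]   # Vector{Float64}, but evaluates f(i) for every i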

The situation is more like

b = fill(c, n)
b[idx] = x[idx]
f(b)

where f is nonlinear and accepts a vector (actually multiple vectors, each built like the two lines above).

If I understand correctly, you’re computing the type x will be converted to and creating out directly with the correct type. I guess this has a performance benefit over my a posteriori conversion, right?

OK, then a non-mutating solution might look like mask = eachindex(x) .∈ Ref(idx); b = ifelse.(mask, x, oftype.(x,c)). Not super pretty, though. Maybe defining a function like promote_ifelse(c, x, y) = ifelse(c, promote(x,y)...) and then broadcasting that, promote_ifelse.(mask, x, c), is tidier? (Both variants are written out after the example below.) Or define some utility like this (here complex x would have the same difficulty of not fitting into b=fill(13, 5)):

julia> choose(ind, x::AbstractVector, y::Number) = map(eachindex(x)) do i
         xp, yp = promote(x[i],y)
         i in ind ? xp : yp
       end;

julia> choose([1,3], rand(5).+im, 13)
5-element Vector{ComplexF64}:
 0.23760044132852687 + 1.0im
                13.0 + 0.0im
 0.25750298249695736 + 1.0im
                13.0 + 0.0im
                13.0 + 0.0im
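
And written out, the broadcast variants mentioned above (again just a sketch, with the same kind of complex x):

x   = rand(5) .+ im
c   = 13
idx = [1, 3]

mask = eachindex(x) .∈ Ref(idx)
b1 = ifelse.(mask, x, oftype.(x, c))          # Vector{ComplexF64}

promote_ifelse(c, x, y) = ifelse(c, promote(x, y)...)
b2 = promote_ifelse.(mask, x, c)              # same result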

Your non-mutating version is O(n * length(idx)) while my mutating version is O(n + length(idx)). I’m assuming the reason why you want to avoid mutation is because this may lead to issues with reverse mode differentiation?

That’s true. None of my variants end up very neat; having to call promote_type etc. yourself always feels a bit low-level, so I just wondered if there was some tidier high-level way… but perhaps there isn’t.

I tried another approach, much simpler than the ones you suggested, but in the same spirit of creating the variable with the right type straight away:

function my_fun(x, xvals, ids, ids2)
    out = zeros(eltype(x), length(xvals))
    out[ids] = x[ids]
    out[ids2] = xvals[ids2]
    return out
end

Not sure if it’s more readable than my first option, but it probably has better performance, since it creates the variable with the right type straight away instead of converting it afterwards.
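
For completeness, a quick check that this version differentiates as expected (with inputs similar to the original toy example, plus a second mask for the constant part):

using ForwardDiff

xvals = [1.0, 2.0, 3.0, 4.0]
ids   = [1, 2]                  # components taken from x
ids2  = [3, 4]                  # components taken from xvals
x0    = zeros(4)

ForwardDiff.jacobian(x -> my_fun(x, xvals, ids, ids2), x0)
# 4×4 matrix with ones at (1,1) and (2,2) and zeros elsewhere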