# Incorporating derivative term into loss for Flux.jl model

Hello!
I am trying to incorporate a derivative term of the outputs with respect to one of the inputs in the loss of a neural network via `Flux.jl`, and I was hoping to get some help…
I looked into a few posts online, but I believe a lot of PINN-related questions involve models with a single input (e.g. time) and multiple outputs (e.g. states).
Meanwhile, what I would like to do is to train a model with multiple inputs & outputs, and a loss function that combines the MSE of the outputs & the sum of the gradients (technically speaking a jacobian?) of the outputs with respect to one of the inputs.

Below is a test code I’ve modified from this previous question, derived from `DiffEqFlux.jl`.
I am trying to train a model that takes in 7 inputs and returns 7 outputs, where the loss is based on the MSE of the 7 outputs + the sum of the derivatives of the 7 outputs with respect to the 6th input.
I am still having dimension issues, and am still unable to incorporate the MSE into the loss…
I’d appreciate any help!

``````using Flux
using ReverseDiff

#Create modified Flux NN type that carries destructured parameters
struct ModNN{M, R, P}
model::M
re::R
p::P

function ModNN(model; p = nothing)
_p, re = Flux.destructure(model)
if p === nothing
p = _p
end
return new{typeof(model), typeof(re), typeof(p)}(model, re, p)
end
end

#function to compute the derivatives of NN outputs with respect to 6th input
function diffdestruct(re, p, xs)
H = [Flux.jacobian(w -> re(p)(w), [xs[:,i]])[1] for i = 1:size(xs,2)]
return H
end

#Create an instance of modified type
mod1 = ModNN(
Chain(Dense(7 => 32, relu), Dense(32 => 32, relu), Dense(32 => 7))
)

#Record parameter values to check that they're updating correctly
checkP = copy(mod1.p);

# sample dataset
N_data = 10
xtest = rand(7,N_data)
ytest = rand(7,N_data)
@show diffdestruct(mod1.re, mod1.p, xtest);

# Loss function (simple aggregation of NN outputs)
loss(re, p, xs, ys) = sum(diffdestruct(re, p, xs)) + Flux.Losses.mse(hcat([re(p)([xs[:,i]]) for i = 1:size(xs,2)]...), ys)

#Get the gradients using Reverse diff
gs = ReverseDiff.gradient(p -> loss(mod1.re, p, xtest, ytest), mod1.p)

#Update the NN
Flux.Optimise.update!(opt,mod1.p,gs)  # optimizer, model parameters, gradients

@show maximum(mod1.p .- checkP);
``````

yields following error

``````ERROR: LoadError: DimensionMismatch("matrix A has dimensions (32,7), vector B has length 1")
``````

Actually, I’ve been able to work it out - but the way I got it to work still feels clumsy (especially with all the `hcat([something]...)` that I am having to do… I’d appreciate any suggestions to improve this!

``````using Flux
using ReverseDiff

#Create modified Flux NN type that carries destructured parameters
struct ModNN{M, R, P}
model::M
re::R
p::P

function ModNN(model; p = nothing)
_p, re = Flux.destructure(model)
if p === nothing
p = _p
end
return new{typeof(model), typeof(re), typeof(p)}(model, re, p)
end
end

#function to compute the derivatives of NN outputs with respect to 6th input
function diffdestruct(re, p, xs)
H = hcat([
Flux.jacobian(w -> re(p)(w), xs[:,i])[1][:,6] for i = 1:size(xs,2)
]...)
return H
end

#Create an instance of modified type
mod1 = ModNN(
Chain(Dense(7 => 32, relu), Dense(32 => 32, relu), Dense(32 => 7))
)

#Record parameter values to check that they're updating correctly
checkP = copy(mod1.p);

# sample dataset
N_data = 10
xtest = rand(7,N_data)
ytest = rand(7,N_data)
# Flux.jacobian(el -> mod1.re(mod1.p)(el), x)[1]
@show diffdestruct(mod1.re, mod1.p, xtest);

# Loss function (simple aggregation of NN outputs)
loss(re, p, xs, ys) = sum(diffdestruct(re, p, xs)) + Flux.Losses.mse(hcat([re(p)(xs[:,i]) for i = 1:size(xs,2)]...), ys)

#Get the gradients using Reverse diff
gs = ReverseDiff.gradient(p -> loss(mod1.re, p, xtest, ytest), mod1.p)