How can I differentiate a subset of the outputs of a neural network in Flux or Lux?

Is there a way in Flux or Lux to differentiate only a subset of the outputs of a neural network model? The reason I am asking is that I have a target matrix with 6 outputs and 3000 observations (very little data), but unfortunately there are quite a few missing values scattered through the target matrix. So in any given batch I would like to backpropagate the errors from all the known targets while ignoring the missing ones. Is there a way to do that? Happy to provide some dummy code for it if it helps.
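For example, dummy data in that shape could look like this (X, Y, the Float32 element type, and the 10 input features are just placeholder choices):

X = rand(Float32, 10, 3000)                                 # 10 input features, 3000 observations
Y = Matrix{Union{Float32,Missing}}(rand(Float32, 6, 3000))  # 6 targets per observation
Y[rand(eachindex(Y), 2000)] .= missing                      # knock out ~2000 of the 18000 targets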

I guess you’re going to have to write the loss function yourself and let Zygote differentiate it, defining it as the sum of the errors over all non-missing outputs.

I’ll give that a go. Thanks. :slight_smile:

Could you share an example if you succeed?
At first I thought it was trivial, but then I couldn’t find a solution at a glance.

You may be interested in some of the previous discussions here and on GitHub about masking in losses. Those threads should have code examples.
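The basic shape of the masking approach is something like this (a rough sketch of my own, not code from those threads, assuming a target matrix Y with missing entries as above):

mask = .!ismissing.(Y)        # true where the target is known
Y0   = coalesce.(Y, 0f0)      # missing targets replaced by 0 (the mask ensures they are never used)

# Squared error over the known targets only; masked positions contribute
# zero loss and zero gradient.
masked_mse(ŷ) = sum(abs2, (ŷ .- Y0) .* mask) / sum(mask)

Precomputing mask and Y0 outside the loss means the differentiated code is just plain broadcasts, e.g. Flux.gradient(m -> masked_mse(m(X)), model).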

So this seems to work, but my loss function is a horrible hack and will certainly run extremely slowly on a GPU.

using Flux

function loss2(y, ŷ)
    # Squared error for a single entry.
    l(x, x̂) = abs2(x - x̂)
    totloss = 0.0
    for j in 1:size(y, 1)
        for i in 1:size(y, 2)
            # Only known targets contribute; missing entries are skipped.
            if !ismissing(y[j, i])
                totloss += l(y[j, i], ŷ[j, i])
            end
        end
    end
    # Note: this averages over all entries, including the missing ones.
    totloss / length(y)
end

model = Chain(Dense(size(X, 1) => 10), Dense(10 => size(Y, 1)))
opt_state = Flux.setup(AdamW(0.005), model)
@info loss2(Y, model(X))
for e in 1:10
    # Calculate the gradient of the objective
    # with respect to the parameters within the model:
    grads = Flux.gradient(model) do m
        result = m(X)
        loss2(Y, result)
    end
    # Update the parameters so as to reduce the objective,
    # according to the chosen optimisation rule:
    Flux.update!(opt_state, model, grads[1])
    @info loss2(Y, model(X))
end

Did you try something like
Flux.Losses.mse(model(x), y; agg = mean ∘ skipmissing)
?

No, but I tried this:

loss(y, ŷ) = sum(skipmissing(y - ŷ) .^ 2)

which did not work. I will try your suggestion.

I think they are basically the same.
Sorry for the inconvenience, I just can’t try this right now :frowning:
But yeah, give it a try if you don’t mind :>

You could also try something with coalesce, like

loss(ŷ, y) = mse(coalesce.(y, ŷ), ŷ)
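Spelled out (a sketch of my own, reusing model, X, Y, and opt_state from the snippet above and assuming Flux.Losses.mse): coalesce.(y, ŷ) fills every missing target with the corresponding prediction, so those positions get zero error and zero gradient, while the known targets are fitted as usual.

using Flux
using Flux.Losses: mse

# Missing targets are replaced by the predictions themselves, so they add
# zero loss and zero gradient. Note that mse still averages over all
# entries, including the filled-in ones.
loss(ŷ, y) = mse(coalesce.(y, ŷ), ŷ)

grads = Flux.gradient(model) do m
    ŷ = m(X)
    loss(ŷ, Y)
end
Flux.update!(opt_state, model, grads[1])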

So this was precisely what I hoped to achieve with skipmissing. I did not know about coalesce. Very neat. This solves my problem. Thanks guys. :slight_smile: