I’m trying to run a straightforward regression problem through Flux. I’ve adapted the loss function from the example, replacing the simple mse with something like a chi-square. The problem comes from the calibration of a calorimeter, where the error term of the chi-square is proportional to the square root of the energy.
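Concretely, with $\hat y$ the network output and $y$ standing in for the energy, the quantity the code below computes is

$$L(x, y) = \sqrt{\sum_i \frac{(\hat y_i - y_i)^2}{\sqrt{y_i}}}$$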
The following code exemplifies the problem:
```julia
using Flux

model = Chain(
    Dense(250, 64),
    Dense(64, 1)
)

x = rand(250, 1000)  # 1000 inputs of 250 features each
y = rand(1000)       # 1000 targets (standing in for the energies)

function loss(x, y)
    error = sqrt.(y)  # per-point error term, proportional to sqrt of the energy
    sqrt(sum(((model(x) .- y).^2) ./ error))
end

@time Flux.Tracker.gradient(() -> loss(x, y), params(model))
```
This code results in the following output on my machine.
```
15.787468 seconds (211.02 M allocations: 5.623 GiB, 4.70% gc time)
```
Removing the error term from the loss function gives me:
```
0.117279 seconds (15.27 k allocations: 59.252 MiB, 14.24% gc time)
```
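For clarity, the fast variant is literally the same loss with the division by the error term dropped (it reuses `model` from the snippet above; `loss_no_error` is just a name for this post):

```julia
# same loss with the per-point error term removed -- this is the fast version
function loss_no_error(x, y)
    sqrt(sum((model(x) .- y).^2))
end
```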
In my actual problem (where my input vector is of size 200k), I simply can’t train with the error term in the loss at all; without it, training runs but the result is suboptimal, and I think I need this term.
Does anybody have a suggestion for how to get this loss function to work?