Flux chain type unstable when broadcasting inside gradient

filchristou · June 17, 2024, 9:27am

I’ve been fighting for a couple of days to get my Flux/Zygote autodiff code type stable.
I am not sure what the problem is, but after coming up with an MWE, it looks like broadcasting a Flux.Chain is problematic?

using Flux, Zygote
import Statistics: mean

function internfunc_nobroad(m, x, y)
    modelvals = m(x)
    Flux.mse(modelvals, y)
end

function internfunc_broad(m, x, y)
    modelvals = m.(x)
    mses = Flux.mse.(modelvals, y)
    return mean(mses)
end

function wrapfunc(model, xdata, ydata, func)
    grad = let xdata=xdata, ydata=ydata
        Zygote.gradient(m -> func(m, xdata, ydata), model)
    end
    return grad
end

Run the following in the REPL:

julia> fc = Flux.Chain(Flux.Dense(5=>3, Flux.relu), Flux.Dense(3=>3, Flux.relu), Flux.Dense(3=>1))
julia> fx = [fill(5f0, 5) for _ in 1:10]
julia> fy = fill(2f0, 10)

julia> @code_warntype wrapfunc(fc, fx, fy, internfunc_broad) # type unstable

julia> @code_warntype wrapfunc(fc, fx[1], fy[1], internfunc_nobroad) # type stable

I also opened a similar issue in Flux.jl.

filchristou · June 17, 2024, 11:57am

Okay, I think I got it… I should pass the input as a matrix rather than a Vector of Vectors. Flux then handles it nicely.

fobs_ar = fill(5f0, 5, 10)
labels_ar = fill(2f0, 1, 10)

@code_warntype wrapfunc(fc, fobs_ar, labels_ar, internfunc_nobroad)
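
For data that already exists as a Vector of Vectors, as in the first post, the conversion could be sketched as follows. This is a minimal sketch that assumes all inner vectors have the same length; the names xmat and ymat are hypothetical and only used for illustration.

```julia
# Sample data in the same layout as the first post:
# 10 observations with 5 features each, and one label per observation.
fx = [fill(5f0, 5) for _ in 1:10]
fy = fill(2f0, 10)

# Stack the observations as columns: features × batch, i.e. a 5×10 Matrix{Float32}.
xmat = reduce(hcat, fx)

# Reshape the labels into a 1×10 row so they line up with the
# 1×batch output of the final Dense(3 => 1) layer.
ymat = reshape(fy, 1, :)

@assert size(xmat) == (5, 10)
@assert size(ymat) == (1, 10)
```

xmat and ymat can then be passed to wrapfunc in place of fobs_ar and labels_ar.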

filchristou · June 17, 2024, 2:19pm

Well… after switching from Flux.mse to Flux.huber_loss, I get type-unstable code again…

function internfunc_nobroad_huberloss(m, x, y)
    modelvals = m(x)
    Flux.huber_loss(modelvals, y)
end

@code_warntype wrapfunc(fc, fobs_ar, labels_ar, internfunc_nobroad_huberloss)

This definitely looks like a bug.
I made an issue. Feel free to drop some hints if you know why that is and how it could be tackled.
