Flux assumes only trainable parameters should be moved to GPU

I think you can specify which parameters are trainable separately from which fields count as the layer's “children”. The children are what `gpu` moves, while the trainable set is what the optimiser updates. E.g. for BatchNorm: https://github.com/FluxML/Flux.jl/blob/29a96b961badea84a3bb323257c1374ffebca2e7/src/layers/normalise.jl#L265.
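
For a custom layer, a minimal sketch of the same idea might look like the following. The `ScaledShift` type and its `scale`/`offset` fields are made up for illustration, and this assumes a Flux version where `Flux.@functor` is available and `Flux.trainable` accepts a `NamedTuple` (older releases returned a plain tuple, and I believe newer ones prefer `Flux.@layer` with a `trainable=` option), so check it against the version you have installed.

```julia
using Flux

# Hypothetical layer: `scale` is learned, `offset` is a fixed buffer
# that should still follow the model to the GPU.
struct ScaledShift{A<:AbstractArray}
    scale::A
    offset::A
end

# Register every field as a child, so `gpu`/`cpu` move both arrays.
Flux.@functor ScaledShift

# Restrict the trainable subset to `scale` only.
Flux.trainable(m::ScaledShift) = (; scale = m.scale)

# Forward pass: elementwise scale, then add the fixed offset.
(m::ScaledShift)(x) = m.scale .* x .+ m.offset

layer = ScaledShift(ones(Float32, 3), zeros(Float32, 3))

# Moves both `scale` and `offset` when a GPU backend is loaded;
# without one, the layer stays on the CPU.
layer_gpu = gpu(layer)
```

With this, `offset` should travel with the model but stay out of the set of parameters the optimiser touches.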
