Flux.jl Inconsistent Training on Custom Architecture

Please see "Please read: make it easier to help you". We can try to work out the issue by speculating about it, but the best way to get a resolution is to post an MWE that demonstrates it. For Flux stuff, that includes dummy data, the library versions used, and a full stacktrace of any relevant errors.
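For reference, here is a rough sketch of what such an MWE could look like. Everything in it is a placeholder: the data shapes, the `Chain` standing in for your custom architecture, the loss, and the optimiser settings are all assumptions, and it assumes a reasonably recent Flux with the explicit `Flux.setup` / `Flux.update!` API. Swap in whatever actually reproduces the inconsistency.

```julia
using Flux, Random, Pkg, InteractiveUtils

Random.seed!(0)  # fixed seed, so differences between runs are not just RNG noise

# Dummy data: 4 features, 32 samples (placeholder shapes)
x = rand(Float32, 4, 32)
y = rand(Float32, 1, 32)

# Stand-in model (hypothetical); replace with the custom architecture
model = Chain(Dense(4 => 8, relu), Dense(8 => 1))
opt_state = Flux.setup(Adam(1f-3), model)

# A few training steps, recording the loss at each one
losses = Float32[]
for epoch in 1:5
    loss, grads = Flux.withgradient(m -> Flux.mse(m(x), y), model)
    Flux.update!(opt_state, model, grads[1])
    push!(losses, loss)
end
println(losses)

# Version information to paste alongside the code
versioninfo()
Pkg.status("Flux")
```

If the script errors, paste the complete stacktrace rather than just the last line.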