I’m not really sure I understand the training objective (sounds like a good candidate for using libraries from one of Julia’s PPL ecosystems), but if what you need is literally Dense with a diagonal weight matrix, we have Flux.Scale.
1 Like