Loss function with scalable constraints

Assume a loss function of the form L1 + a * L2, where L1 and L2 depend on the model parameters. I wish to adjust the coefficient `a` so that the two terms have equal magnitude; however, I do not want the coefficient to be differentiated by Zygote. How is this achieved? Thanks.
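For concreteness, a minimal sketch of the setup (ℓ1 and ℓ2 are hypothetical stand-ins for my real terms); the commented line is where I want Zygote to stop differentiating:

```julia
using Zygote

ℓ1(p) = sum(abs2, p)   # stand-in for my first loss term
ℓ2(p) = sum(abs, p)    # stand-in for my second loss term

function loss(p)
    a = ℓ1(p) / ℓ2(p)  # balancing coefficient -- I want Zygote to
                       # treat this as a constant, not differentiate it
    return ℓ1(p) + a * ℓ2(p)
end

Zygote.gradient(loss, randn(5))  # currently `a` gets differentiated too
```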

Sounds like you may want to solve a minimax problem, i.e. you want \min_p \max \{ \ell_1(p), \ell_2(p) \}, so that the two terms are typically balanced at the minimum?

If so, this can be turned into a differentiable NLP with an “epigraph” transformation: \min_{p,t} t subject to t \ge \ell_1(p),\; t \ge \ell_2(p).
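For example, a minimal sketch of the epigraph form using JuMP and Ipopt (neither is mentioned above; they are just one way to pose the NLP, with two hypothetical smooth losses standing in for \ell_1 and \ell_2):

```julia
using JuMP, Ipopt

model = Model(Ipopt.Optimizer)
@variable(model, p[1:2])
@variable(model, t)
@objective(model, Min, t)                        # minimize the epigraph variable
@constraint(model, t >= (p[1] - 1)^2 + p[2]^2)   # t ≥ ℓ1(p)
@constraint(model, t >= (p[1] + 1)^2 + p[2]^2)   # t ≥ ℓ2(p)
optimize!(model)
value.(p), value(t)
```

At the solution p ≈ (0, 0), both constraints hold with equality, which is the sense in which the two terms end up balanced at the minimum.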

Thanks. I am minimizing an L2 norm subject to a sparsity-inducing L1 penalty. Since I do not know the optimal value of the Lagrange multiplier, I wanted to try equating the magnitudes of the two norms.
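Concretely, what I plan to try: recompute the multiplier from the current ratio of the two norms, but wrap it in `ChainRulesCore.ignore_derivatives` so Zygote holds it constant (A and b are placeholder data; this is just a sketch of the idea, not a settled implementation):

```julia
using LinearAlgebra, Zygote, ChainRulesCore

A, b = randn(20, 10), randn(20)  # placeholder problem data

function loss(x)
    fit = norm(A * x - b)   # L2 data-fit term
    reg = norm(x, 1)        # sparsity-inducing L1 penalty
    # Recomputed every call so the terms stay equal in magnitude,
    # but hidden from AD so the multiplier itself is not differentiated.
    λ = ChainRulesCore.ignore_derivatives(fit / reg)
    return fit + λ * reg
end

Zygote.gradient(loss, randn(10))  # no gradient flows through λ
```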