DiffOpt / JuMP: zero gradient for variable fixed by equality constraint

Hi all,

I’m trying to train a neural network that is followed by a differentiable optimization layer in Julia. Conceptually it’s:

Flux NN → Economic Dispatch layer (ED layer) → Loss

The ED layer is implemented as a JuMP optimization problem (continuous Unit Commitment relaxation), and I wrote a custom rrule so that gradients can flow back to the NN outputs.

However, I’m running into an issue where gradients either do not flow at all (all zeros) or the rrule does not seem to be called as I expect. After some debugging, I suspect the problem is related to a variable that is fixed via an equality constraint.

I’d really appreciate help checking:

  1. Whether this modeling pattern is fundamentally incompatible with DiffOpt’s reverse-mode sensitivity, and
  2. How I should remodel the problem to get meaningful gradients.

Setup (simplified)

In my ED layer, I have a neural network output u_pred[i,t] (continuous relaxation of on/off decisions). Inside the JuMP model, I introduce a variable U[i,t] and then fix it using an equality constraint to u_pred[i,t]:

# U is declared as a variable and then fixed via equality constraints

@variable(diff_model, U[i=1:layer.Ngen, t=1:layer.Nhour])

@constraint(diff_model, [i=1:layer.Ngen, t=1:layer.Nhour], U[i,t] == u_pred[i,t])

# Later, U is used in operational constraints, e.g., capacity limits:
@constraint(diff_model, Pg[i,t] <= PgM[i] * U[i,t])

I’m using DiffOpt.jl in reverse mode to get sensitivities and then pass those back through a custom rrule so that gradients can flow to the NN parameters.


Observed problem

Conceptually, I want to get the gradient of the optimal objective (or of some function of the optimal solution) with respect to u_pred[i,t]. Since U[i,t] is constrained to equal u_pred[i,t], I was expecting that:
\frac{\partial \text{obj}}{\partial u_{\text{pred}}} would be linked to
\frac{\partial \text{obj}}{\partial U} as given by DiffOpt.ReverseVariablePrimal() (or similar).

But what I actually see is that when I query

MOI.get(diff_model, DiffOpt.ReverseVariablePrimal(), U[i,t])

I consistently get 0.0 for all i,t.

So it looks like DiffOpt is not giving any nontrivial gradient for a variable that is fully fixed by equality constraints.


Questions

  1. Is it expected that DiffOpt.ReverseVariablePrimal() returns zero for variables that are fully fixed by equality constraints? In other words, from an MOI / DiffOpt perspective, is a variable that’s fixed by U[i,t] == constant “non-differentiable” w.r.t. the constant right-hand side?

  2. What is the correct way to model this if I want gradients w.r.t. u_pred?


Thanks for the help! :grinning_face:

@mbesancon ?

I have no idea what you want to do overall, but I do have some limited knowledge about unit commitment and JuMP, so here is a guess.

You probably want to use JuMP.fix instead (and possibly JuMP.FixRef if you want to get its dual subsequently).

1 Like

Hi @yeomoon, welcome to the forum :smile:

Can you provide a reproducible example? I can’t tell exactly what the problem is from your code snippet. This is really a question for @joaquimg; I can ask him in a couple of hours when he comes to my house for lunch :laughing:

If you want to get \frac{\partial obj}{\partial U} you can do:

@constraint(diff_model, con_U[i=1:layer.Ngen, t=1:layer.Nhour], U[i,t] == u_pred[i,t])
optimize!(diff_model)
d_obj_d_U = dual.(con_U)
2 Likes

Hi @yeomoon ,

It is indeed hard to give a proper answer with no MWE.

If you just need the derivative of the objective function wrt a rhs you can just grab the dual as @odow suggested.

Note that it’s also possible for some derivatives to be zero very frequently if the problem is fully linear. Some quadratic regularization might help, but I’d say that’s more advanced stuff.

3 Likes

Hi @odow, @joaquimg,

Thanks a lot for your replies and pointers! :smile:
(Unfortunately I can only mention two people per post, but I really appreciate all of your answers.)

They were very helpful for me to clarify what I actually want.

I prepared the following MWE to make my question more concrete.
Here, U is fixed via an equality constraint, and it appears in the
capacity constraint of Pg.

using JuMP
using DiffOpt
using Gurobi
using MathOptInterface
const MOI = MathOptInterface

Ngen  = 1
Nhour = 1

u_pred = [0.5]     # "NN output"
PgM   = [10.0]

model = Model(() -> DiffOpt.diff_optimizer(Gurobi.Optimizer))
set_silent(model)

@variable(model, Pg[1:Ngen, 1:Nhour] >= 0)
@variable(model, U[1:Ngen, 1:Nhour])

@constraint(model, [i=1:Ngen, t=1:Nhour], U[i,t] == u_pred[1])
@constraint(model, [i=1:Ngen, t=1:Nhour], Pg[i,t] <= PgM[i] * U[i,t])

@objective(model, Min, (Pg[1,1] - 10.0)^2)

println("=== Forward solve ===")
optimize!(model)
println("Pg = ", value.(Pg))
println("U  = ", value.(U))

DiffOpt.empty_input_sensitivities!(model)
MOI.set(model, DiffOpt.ReverseVariablePrimal(), Pg[1,1], 1.0)

DiffOpt.reverse_differentiate!(model)

grad_Pg = MOI.get(model, DiffOpt.ReverseVariablePrimal(), Pg[1,1])
grad_U  = MOI.get(model, DiffOpt.ReverseVariablePrimal(), U[1,1])

println("=== Reverse-mode sensitivities ===")
println("grad_Pg = ", grad_Pg)
println("grad_U  = ", grad_U)

For this toy problem, the forward solution is

  • unconstrained minimizer of (Pg - 10)^2 is Pg = 10,
  • but the second constraint is Pg ≤ PgM * U = 10 * 0.5 = 5,
  • so the optimum is Pg^* = 5 with a binding constraint.

If I treat U as a parameter, the optimal value as a function of U is

f(U) = (PgM \cdot U - 10)^2 = (10U - 10)^2,

so the derivative is

f’(U) = 2(10U - 10)\cdot 10.

At the current point (U = 0.5),

f’(0.5) = 2(5 - 10)\cdot 10 = -100.

Equivalently, using the KKT conditions and a dual variable (\lambda) for

g(Pg,U) = Pg - PgM\cdot U \le 0,

we have at the optimum

  • Pg^* = 5,
  • \lambda = 10 from stationarity: 2(Pg^* - 10) + \lambda = 0,
  • and by the envelope theorem
\frac{\partial f}{\partial U} = \lambda \frac{\partial g}{\partial U} = \lambda (-PgM) = 10 \cdot (-10) = -100.
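As a quick sanity check on this hand derivation, the same number comes out of a finite-difference approximation of the closed-form optimal value (plain Julia, no solver needed; `f` below is the closed-form optimum, assuming 0 ≤ U ≤ 1 so the capacity constraint binds):

```julia
# Closed-form optimal value of min (Pg - 10)^2 s.t. 0 <= Pg <= 10U,
# valid for 0 <= U <= 1 where the capacity constraint binds: Pg* = 10U.
f(U) = (10.0 * U - 10.0)^2

# Central finite difference at U = 0.5
h = 1e-6
fd = (f(0.5 + h) - f(0.5 - h)) / (2h)
println(fd)  # ≈ -100.0, matching the envelope-theorem value
```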

However, when I run the code above, I get:

=== Forward solve ===
Pg = [5.0;;]
U  = [0.5;;]
=== Reverse-mode sensitivities ===
grad_Pg = 1.0
grad_U  = nothing

So it seems that I may be misunderstanding the intended usage of DiffOpt here. In this kind of setup, where U is fixed via an equality constraint to an “external” value u_pred, what is the recommended way to obtain \partial f / \partial U or \partial f / \partial u_{\text{pred}}?

Again, thank you very much for your help!

Although I have no idea about DiffOpt.jl, I guess you probably don’t need it, JuMP.dual should be sufficient.

I notice you want this value:

\frac{\partial f}{\partial U}

Using JuMP.dual should return what you want.


I have no knowledge about DiffOpt.jl, but I guess you possibly won’t need it. It seems to be some misconception.


Generally speaking, I think you need a subgradient at a trial solution. The function should be defined via a convex optimization problem. Differentiability should be an added property on top of that.

1 Like

No, U is a copy variable within the lower-level program. I have never encountered a need to take a derivative with respect to it; this is probably a misconception.

1 Like

Wanting to use DiffOpt.jl is not a misconception. There’s an entire field of research dedicated to obtaining derivatives of problem solutions with respect to cost or constraint parameters, and DiffOpt.jl is a prominent package allowing people to do just that.
However, Walter is right that U is not an input to the problem, it’s a free variable: the real input is the parameter u_pred (along with other parameters), and the outputs are the optimal values U^*, P_g^*.

3 Likes

What I will say though is that the documentation of DiffOpt.jl is a bit confusing as to the nature of the objects you obtain. Here, I think you might be misusing the library: what you seem to want is the gradient of the objective value with respect to the parameter u_pred. If that’s the case, indeed you only need the dual. DiffOpt.jl is useful when you need to differentiate the solution vector with respect to problem parameters.

1 Like

Hi @yeomoon,

You can achieve what you are after without using DiffOpt. Here’s an example:

julia> using JuMP

julia> using Gurobi

julia> function main()
           Ngen  = 1
           Nhour = 1
           u_pred = [0.5]     # "NN output"
           PgM   = [10.0]
           model = Model(Gurobi.Optimizer)
           set_silent(model)
           @variable(model, Pg[1:Ngen, 1:Nhour] >= 0)
           @variable(model, U[1:Ngen, 1:Nhour])
           @constraint(model, c_fix[i=1:Ngen, t=1:Nhour], U[i,t] == u_pred[1])
           @constraint(model, [i=1:Ngen, t=1:Nhour], Pg[i,t] <= PgM[i] * U[i,t])
           @objective(model, Min, (Pg[1,1] - 10.0)^2)
           optimize!(model)
           return dual.(c_fix)
       end
main (generic function with 1 method)

julia> function main_fix()
           Ngen  = 1
           Nhour = 1
           u_pred = [0.5]     # "NN output"
           PgM   = [10.0]
           model = Model(Gurobi.Optimizer)
           set_silent(model)
           @variable(model, Pg[1:Ngen, 1:Nhour] >= 0)
           @variable(model, U[1:Ngen, 1:Nhour])
           for i in 1:Ngen, t in 1:Nhour
               fix(U[i,t], u_pred[1])
           end
           @constraint(model, [i=1:Ngen, t=1:Nhour], Pg[i,t] <= PgM[i] * U[i,t])
           @objective(model, Min, (Pg[1,1] - 10.0)^2)
           optimize!(model)
           return reduced_cost.(U)
       end
main_fix (generic function with 1 method)

julia> main()
Set parameter WLSAccessID
Set parameter WLSSecret
Set parameter LicenseID to value 722777
WLS license 722777 - registered to JuMP Development
1×1 Matrix{Float64}:
 -100.0

julia> main_fix()
Set parameter WLSAccessID
Set parameter WLSSecret
Set parameter LicenseID to value 722777
WLS license 722777 - registered to JuMP Development
1×1 Matrix{Float64}:
 -100.0

You can either fix the variable to a value and then use reduced_cost, or you can explicitly query the dual of the fixing constraint.

3 Likes

Certainly, the solution is what @odow posted.

I will add some extra information for completeness and future reference, in case someone reads this later.

Also, as @gdalle highlighted, DiffOpt is useful for computing derivatives of the (primal) solution of an optimization problem with respect to the problem’s coefficients or parameters.

Forward mode in DiffOpt, as in most AD packages, does not mean just solving the problem. It means computing derivatives in forward mode. In the case of optimization problems this is:

  1. select a direction in which the coefficients (or parameters) will move
  2. set it with DiffOpt.set_forward_parameter or MOI.set(model, DiffOpt.ForwardConstraintFunction(), index, func)
  3. call DiffOpt.forward_differentiate!(model)
  4. query results with DiffOpt.get_forward_variable(model, x) or MOI.get(model, DiffOpt.ForwardVariablePrimal(), x)

Reverse mode is analogous:

  1. select the direction in which the solution is moving
  2. set it with DiffOpt.set_reverse_variable(model, variable, sensib) or MOI.set(model, DiffOpt.ReverseVariablePrimal(), variable, sensib)
  3. call DiffOpt.reverse_differentiate!(model)
  4. query results with DiffOpt.get_reverse_parameter(model, p) or MOI.get(model, DiffOpt.ReverseConstraintFunction(), index)

Note that MOI.get(model, DiffOpt.ReverseVariablePrimal(), variable) is just a method to get back exactly what you set with MOI.set(model, DiffOpt.ReverseVariablePrimal(), variable, sensib); no computation is done. If you did not set it, you will get nothing.
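A minimal sketch of the four forward-mode steps above, on the toy problem from this thread. This assumes a recent DiffOpt version that supports the parameter API (DiffOpt.set_forward_parameter / DiffOpt.get_forward_variable) and substitutes HiGHS for Gurobi as a freely available QP-capable solver:

```julia
using JuMP, DiffOpt, HiGHS

model = Model(() -> DiffOpt.diff_optimizer(HiGHS.Optimizer))
set_silent(model)
# u_pred enters as a true parameter, not as a fixed decision variable
@variable(model, p in Parameter(0.5))
@variable(model, Pg >= 0)
@constraint(model, Pg <= 10.0 * p)
@objective(model, Min, (Pg - 10.0)^2)
optimize!(model)

# Steps 1-2: pick the direction dp = 1 for the parameter
DiffOpt.set_forward_parameter(model, p, 1.0)
# Step 3: propagate the direction through the solution map
DiffOpt.forward_differentiate!(model)
# Step 4: query dPg*/dp (should be ≈ 10, since Pg* = 10p while the cap binds)
dPg = DiffOpt.get_forward_variable(model, Pg)
```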

4 Likes

Hi all, :grinning_face:

Thank you so much for all your help, and sorry for the late reply — I was busy refactoring my code and checking everything carefully.

The good news is: the issue is now resolved, and gradients are finally flowing through my layer as expected.

In hindsight, the main mistakes on my side were:

  1. I was modeling U as a variable with U == u_pred and then trying to read sensitivities via ReverseVariablePrimal(U), which (as you explained) is not how DiffOpt exposes parameter sensitivities.
  2. I was effectively asking DiffOpt for the wrong thing: instead of sensitivities w.r.t. the parameter u_pred in the constraint, I was trying to treat U as if it were that parameter.

Following your suggestions, I changed the modeling to treat u_pred as a true parameter inside the capacity constraint, and then used DiffOpt.ReverseConstraintFunction on that constraint to backpropagate through the layer.

Here is a minimal working example that now behaves exactly as I expect:

using JuMP
using DiffOpt
using Gurobi
using MathOptInterface
const MOI = MathOptInterface

Ngen  = 1
Nhour = 1

u_pred = 0.5     # "NN output"
PgM   = [10.0]

model = Model(() -> DiffOpt.diff_optimizer(Gurobi.Optimizer))
set_silent(model)

@variable(model, Pg[1:Ngen, 1:Nhour] >= 0)
@constraint(model, con_cap[i=1:Ngen, t=1:Nhour], Pg[i,t] <= PgM[i] * u_pred)
@objective(model, Min, (Pg[1,1] - 10.0)^2)

optimize!(model)

DiffOpt.empty_input_sensitivities!(model)
MOI.set(model, DiffOpt.ReverseVariablePrimal(), Pg[1,1], 1.0)
DiffOpt.reverse_differentiate!(model)

sens_fun = MOI.get(model, DiffOpt.ReverseConstraintFunction(), con_cap[1,1])
println("ReverseConstraintFunction = ", sens_fun)

which prints

ReverseConstraintFunction = -5 Pg[1,1] - 1

Special thanks to @odow and @joaquimg for the detailed explanations!

3 Likes