Problem with Zygote gradient

Hi all,

I am working with Julia 1.7. Zygote fails to compute the AD gradient in the code below (the loss function itself evaluates fine). The error message is:

ERROR: LoadError: Need an adjoint for constructor DiffEqArray{Float64, 2, Vector{Vector{Float64}}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Vector{Float64}}. Gradient is of type VectorOfArray{Float64, 2, Vector{Vector{Float64}}}

I guess this has something to do with how I select states from the solution inside the loss function, but since the loss function itself works, is there another way I should be doing this?

Thanks for help,

S.

function syst_AM2(state,theta,t)
    # unpack parameters and states
    Y_x1,Y_x2,mu1_max,mu2_max,Ks1,Ks2,KI,mu0 = theta;
    X1,X2,S0,S1,S2,CH4 = state
    # reaction rates: first-order in S0 (r0), Monod (r1), Haldane with substrate inhibition (r2)
    r0=mu0*S0;
    r1=mu1_max*S1*X1/(S1+Ks1);
    r2=mu2_max*S2*X2/(S2+Ks2+(S2^2/KI));
    # time derivatives of [X1, X2, S0, S1, S2, CH4]
    [Y_x1*r1, Y_x2*r2, -r0, -r1+r0, (1-Y_x1)*r1-r2, (1-Y_x2)*r2]
end

function loss(theta)
    prob = ODEProblem(syst_AM2,state0,(t_CH4[1],t_CH4[end]),theta)
    sol = solve(prob, Rodas5())
    state = sol(t_CH4)
    S2 = state[5,:]
    CH4 = state[6,:]
    deb_CH4 = CH4[2:end]-CH4[1:end-1]
    e1 = CH4[1:end-1]-Exp_CH4
    e2 = deb_CH4-Exp_deb_CH4
    e3 = S2[1:end-1]-Exp_AGV
    sum(e1.^2)+sum(e2.^2)+sum(e3.^2)
end

using Plots, DifferentialEquations, DiffEqSensitivity, Zygote

H=0.39;   
R=0.195; 
Surf=pi*R^2; 
Exp_CH4 = [0,0.146695751,0.452562719,0.85245185,1.390508363,1.88574014,2.333176518,2.79329662,3.417909766,4.299654989,6.515952412,10.29002283,13.99900971,16.46767139,18.77271153,21.23159231,23.82660177,26.81067853,30.00135498,32.59923757,34.18509462,35.5158513,36.74890559,37.86754592,38.79872368,39.69511074,40.56481051,41.3865614,42.13182338,42.84323204,43.49045536];

t_CH4=0.0:size(Exp_CH4)[1]; # time (days)
Exp_AGV = [0.6307493,1.8081079,3.0002554,3.4371393,4.1710245,4.306228,4.4414315,4.576635,5.0225553,5.2529314,4.6852442,3.6354049,3.1327115,2.6300181,2.1273247,1.9948686,1.8574543,1.5407059,1.2099629,1.1136956,1.0174283,0.9211824,0.9121245,0.9195297,0.8298519,0.9236508,0.9041827,0.8846932,0.8652251,0.8372571,0.808326];

Exp_deb_CH4 = 0*Exp_CH4;
Exp_deb_CH4[2:end]= Exp_CH4[2:end]-Exp_CH4[1:end-1];

state0=[0.283861214/(H*Surf), 0.094620405/(H*Surf), 5.180333444/(H*Surf), 0/(H*Surf), 0.029385961/(H*Surf), 0]; 
theta_c0 = [0.25, 0.43, 0.5, 3.1, 21, 23, 4, 0.08]


loss(theta_c0)
Zygote.gradient(loss,theta_c0)

That’s not the error it should be throwing :sweat_smile:. I’ll look into that. What it should be telling you is that the sol(t_CH4) post-solution interpolation is not compatible with adjoint techniques. The solution here is… well, you don’t need to do that. Just do state = solve(prob, Rodas5(), saveat=t_CH4) and you’re good.
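
For concreteness, a sketch of the loss from the first post with just that one change (state0 and the data arrays as defined above):

function loss(theta)
    prob = ODEProblem(syst_AM2,state0,(t_CH4[1],t_CH4[end]),theta)
    state = solve(prob, Rodas5(), saveat=t_CH4)   # save at the data times, no post-solve interpolation
    S2 = state[5,:]
    CH4 = state[6,:]
    deb_CH4 = CH4[2:end]-CH4[1:end-1]
    e1 = CH4[1:end-1]-Exp_CH4
    e2 = deb_CH4-Exp_deb_CH4
    e3 = S2[1:end-1]-Exp_AGV
    sum(e1.^2)+sum(e2.^2)+sum(e3.^2)
end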

Hi Chris,

Thank you. What package would you then use to fit the data (now that the loss function is AD-compatible)?

S.

Depends on what you’re doing, but DiffEqFlux is the package with all of the fitting utilities.

https://diffeqflux.sciml.ai/dev/

I generally recommend using the multiple shooting tooling.

https://diffeqflux.sciml.ai/dev/examples/multiple_shooting/
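
As a rough sketch of what a multiple-shooting loss looks like with DiffEqFlux (following the multiple_shoot doc example linked above; ode_data, tsteps, group_size and continuity_term are placeholders taken from that example, and it assumes full-state data at the save times, which is not quite the situation here):

using DiffEqFlux

group_size = 5           # number of save points per shooting segment
continuity_term = 200.0  # weight of the segment-matching penalty

# per-segment data mismatch
segment_loss(data, pred) = sum(abs2, data .- pred)

function loss_multiple_shooting(p)
    # splits tsteps into groups, solves prob on each segment, and adds a penalty
    # forcing consecutive segments to match at their junctions
    multiple_shoot(p, ode_data, tsteps, prob, segment_loss, Rodas5(), group_size;
                   continuity_term = continuity_term)
end

result = DiffEqFlux.sciml_train(loss_multiple_shooting, theta_c0, ADAM(0.05), maxiters = 300)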

OK, I will experiment with DiffEqFlux. By the way, does this “multiple shooting” have something to do with the multiple shooting method used to solve the direct+adjoint system in optimal control?

S.

Yes, it’s the same thing. Optimal control is just a parameter estimation / inverse problem, so it’s the same machinery. The “adjoint system” is the same thing reverse-mode AD is doing: it generates the adjoint system code and then solves it for the gradient.
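
In code terms (just a sketch, not from this thread): the adjoint that gets solved can be selected explicitly through the sensealg keyword that DiffEqSensitivity adds to solve, for example:

using DiffEqSensitivity  # InterpolatingAdjoint, ReverseDiffVJP, ...

function loss_adjoint(theta)
    prob = ODEProblem(syst_AM2, state0, (t_CH4[1], t_CH4[end]), theta)
    # explicitly request a continuous adjoint; Zygote's reverse pass through
    # solve then integrates this adjoint ODE to obtain d(loss)/d(theta)
    sol = solve(prob, Rodas5(), saveat = t_CH4,
                sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)))
    CH4 = sol[6,:]
    sum(abs2, CH4[1:end-1] .- Exp_CH4)
end

Zygote.gradient(loss_adjoint, theta_c0)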

Here’s optimal control in the docs:

https://diffeqflux.sciml.ai/dev/examples/optimal_control/

OK, thanks. But I have another problem: bounds are not honored by sciml_train. The updated script follows, and this error is thrown at the end of the optimization:

4.455452613111223
4.454489350679541
4.453604660656758
4.452731382256159
4.452731382256159
ERROR: LoadError: ArgumentError: Initial x[(8,)]=0.07975926992249756 is outside of [0.04, 0.04]

S.

function syst_AM2(state,theta,t) 
    Y_x1,Y_x2,mu1_max,mu2_max,Ks1,Ks2,KI,mu0,S00,S10,S20,coeff = theta;
    X1,X2,S0,S1,S2,CH4 = state
    r0=mu0*S0;
    r1=mu1_max*S1*X1/(S1+Ks1);
    r2=mu2_max*S2*X2/(S2+Ks2+(S2^2/KI));
    [Y_x1*r1, Y_x2*r2, -r0, -r1+r0, (1-Y_x1)*r1-r2, (1-Y_x2)*r2]
end

function loss(theta)
    state0 = [X10;X20;theta[9:11];0];
    coeff = theta[12];
    prob = ODEProblem(syst_AM2,state0,(t_CH4[1],t_CH4[end]),theta)
    state = solve(prob, Rodas5(),saveat=t_CH4)
    S2 = state[5,:]
    CH4 = state[6,:]
    deb_CH4 = CH4[2:end]-CH4[1:end-1]
    e1 = CH4[1:end-1]-Exp_CH4*1
    e2 = (deb_CH4-Exp_deb_CH4)*4
    e3 = (S2[1:end-1]-coeff*Exp_AGV)*1
    (sum(e1.^2)+sum(e2.^2)+sum(e3.^2))/length(t_CH4);
end

function mycallback(p, l, pred)
    display(l)
    return false
end

using Plots, DifferentialEquations, DiffEqSensitivity, Zygote,  DiffEqFlux

H=0.39;   
R=0.195; 
Surf=pi*R^2; 
Exp_CH4 = [0,0.146695751,0.452562719,0.85245185,1.390508363,1.88574014,2.333176518,2.79329662,3.417909766,4.299654989,6.515952412,10.29002283,13.99900971,16.46767139,18.77271153,21.23159231,23.82660177,26.81067853,30.00135498,32.59923757,34.18509462,35.5158513,36.74890559,37.86754592,38.79872368,39.69511074,40.56481051,41.3865614,42.13182338,42.84323204,43.49045536];

t_CH4=0.0:size(Exp_CH4)[1]; # time (days)
Exp_AGV = [0.6307493,1.8081079,3.0002554,3.4371393,4.1710245,4.306228,4.4414315,4.576635,5.0225553,5.2529314,4.6852442,3.6354049,3.1327115,2.6300181,2.1273247,1.9948686,1.8574543,1.5407059,1.2099629,1.1136956,1.0174283,0.9211824,0.9121245,0.9195297,0.8298519,0.9236508,0.9041827,0.8846932,0.8652251,0.8372571,0.808326];

Exp_deb_CH4 = 0*Exp_CH4;
Exp_deb_CH4[2:end]= Exp_CH4[2:end]-Exp_CH4[1:end-1];

X10 = 0.283861214/(H*Surf);   
X20 = 0.094620405/(H*Surf);
S00 = 5.180333444/(H*Surf);
S10 = 0/(H*Surf); 
S20 = 0.029385961/(H*Surf); 

theta_c0 = [0.25, 0.43, 0.5, 3.1, 21.0, 23.0, 4.0, 0.04, S00, S10, S20, 1.0]

cstr = [
0 1;    #Y_X1
0 1;    #Y_X2
0 Inf; #mu1max
0 Inf; #mu2max
1e-3 Inf; #Ks1
1e-3 Inf; #Ks2
1e-3 Inf; #KI
0.04  0.04; # mu0
0  Inf; # S00
0  Inf; # S10
0  Inf; # S20
0  Inf]; # AGV coefficient


result = DiffEqFlux.sciml_train(loss, theta_c0, cb=mycallback, lower_bounds=cstr[:,1],upper_bounds=cstr[:,2])

Which optimizer? Flux optimizers will ignore them.
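
For example (untested sketch): with sciml_train the bounds only bite if the optimizer itself is box-constrained, e.g. Optim's Fminbox. Note also that a parameter fixed by lower == upper, like mu0 here, is probably better removed from theta than written as a degenerate bound.

using Optim  # BFGS, Fminbox

# box-constrained BFGS so lower_bounds / upper_bounds are actually enforced
result = DiffEqFlux.sciml_train(loss, theta_c0, Fminbox(BFGS());
                                cb = mycallback,
                                lower_bounds = cstr[:,1],
                                upper_bounds = cstr[:,2])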

I tried to use GalacticOptim.jl directly. I replaced

result = DiffEqFlux.sciml_train(loss, theta_c0, cb=mycallback, lower_bounds=cstr[:,1],upper_bounds=cstr[:,2])

by

fun = OptimizationFunction(loss, GalacticOptim.AutoForwardDiff());
prob = GalacticOptim.OptimizationProblem(fun, theta_c0,lcons = cstr[:,1], ucons = cstr[:,2]);
sol = solve(prob, IPNewton())

but I have the following error:

ERROR: LoadError: Use OptimizationFunction to pass the derivatives or automatically generate them with one of the autodiff backends

S.
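
(Side note, untested: in GalacticOptim, lcons / ucons are for general constraint functions attached to the OptimizationFunction, and the objective is expected to take two arguments (x, p); plain box bounds would instead go through the lb / ub keywords of OptimizationProblem together with a box-constrained optimizer, roughly:)

using GalacticOptim, Optim

# wrap the one-argument loss into the (x, p) form GalacticOptim expects
fun = OptimizationFunction((x, p) -> loss(x), GalacticOptim.AutoForwardDiff())
# box bounds via lb / ub, solved with a box-constrained Optim method
prob = GalacticOptim.OptimizationProblem(fun, theta_c0; lb = cstr[:,1], ub = cstr[:,2])
sol = solve(prob, Fminbox(BFGS()))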

IIRC this got an issue and was solved?

Yes!

S.