I’m trying to estimate initial values and parameters in a DAE using Turing. The structure of the DAE is as follows:
I have available known inputs u(t) and measured outputs y where, in general,
The unknowns (“descriptor” in DAE speak) are x,z , or could also be denoted differential variables (x; 3 temperatures) and algebraic variables (z; 3 other temperatures). The DAE is index 1, which means that the system has 3 states. So I need 3 initial values to solve the DAE. One could think of the x variables as states.
The following Julia code works for simulating the DAE:
function system!(dx,x,p,t) Tr,Ts,TFe,Tac,Tad,Tah = x mr,ms,mFe = 9260.,6827.,71200. Rr,Rs = 0.127e-3,1.95e-6 chpCu,chpFe,chpa,chpw = 0.385,0.465,1.15,4.18 mda,mdw = 49.2,54.2 QdFes,Qdfs = 212.,422.4 UAr2d,UAs2Fe,UAFe2a,UAx,QdFes,Qdfs = p Nstw = UAx/chpw/mdw Nsta = UAx/chpa/mda Nstd = Nstw - Nsta # equations dx= (1.1*Rr*If(t)^2-UAr2d*(Tr-Tad))/(mr*chpCu) dx= (3*Rs*It(t)^2-UAs2Fe*(Ts-TFe))/(ms*chpCu) dx= (UAs2Fe*(Ts-TFe)-UAFe2a*(TFe-Tah)+QdFes)/(mFe*chpFe) dx = mda*chpa*(Tac-Tad)+UAr2d*(Tr-Tad)+Qdfs dx = mda*chpa*(Tad-Tah)+UAFe2a*(TFe-Tah) dx = Tac*(Nstw-Nsta*exp(-Nstd))-Nstd*Tah-Nsta*(1-exp(-Nstd))*Twc(t) end
To set up the model and solve it, I do as follows:
x0 = [28., 28., 28., 14.,18.,22.] tspan = (0.0,583*60.) UAr2d,UAs2Fe,UAFe2a,UAx,QdFes,Qdfs = 2.7,20.,14.3,44.4,212.,422.4 p = [UAr2d,UAs2Fe,UAFe2a,UAx,QdFes,Qdfs] probfunc = ODEFunction(system!, mass_matrix=Diagonal([1.,1.,1.,0.,0.,0.])) # DAEs to ODEs using Mass matrix reduction technique prob = ODEProblem(probfunc, x0, tspan, p) sol = solve(prob,saveat=60.)
This set-up works, and produces the solution.
Question 1: One thing I’m not 100% sure of is: what happens if there is inconsistency between the initail values of x and z (i.e.,
x in the Julia code)? Will the solver make adjustments so that the algebraic equations are satisfied at every time-step, including at initial time?
For data fitting, I have available outputs y consisting of 2 temperatures out of 3 from x (differential), and 2 temperatures out of 3 from z (algebraic).
Question 2: Is the above DAE implementation suitable for AD computations with Turing?
I’m assuming that all initial descriptor variables are unknown, i.e., I specify a prior distribution for every element in the descriptor. Of course, if x is known, then z is known from the algebraic equations, but I’m assuming it is possible to postulate uncertainty in both x and z…
Question 3: Does it make sense to postulate priors for both differential (x) and algebraic variables (z)? The answer to this question may depend on how the DAE solver handles satisfying the algebraic equations, I guess?
Here is how I set up the Turing model:
Turing.setadbackend(:forwarddiff) # Automatic Differentiation tool for Jacobian and Hessian # @model function fit_system(data,prob,tdata) # Measurement variance prior V ~ InverseGamma(2,3) # measurement variance https://en.wikipedia.org/wiki/Inverse-gamma_distribution # Initial differential variable priors Tr0 ~ truncated(Normal(30.,3),25,35) # Initial value prior Ts0 ~truncated(Normal(30.,3),25,35) # "" TFe0 ~truncated(Normal(30.,2),25,35) # "" # Initial algebraic variable priors Tac0 ~ truncated(Normal(12.,3),1.,30) # "" Tad0 ~truncated(Normal(18.,5),1,30) # "" Tah0 ~truncated(Normal(22.,3),1,60) # "" # Parameter priors UAr2d ~ truncated(Normal(2.7,1),0.2,5) # heat transfer prior UAs2Fe ~ truncated(Normal(20,5),1,50) # "" UAFe2a ~ truncated(Normal(15,2),0.5,40) # "" UAx ~ truncated(Normal(44,5),1,100) # "" QdFes ~ truncated(Normal(212.,40),20,400) # "" Qdfs ~ truncated(Normal(422.,20),200,500) # "" # Initial descriptor x0 and parameter p x0 = [Tr0,Ts0,TFe0,Tac0,Tad0,Tah0] p = [UAr2d,UAs2Fe,UAFe2a,UAx,QdFes,Qdfs] # Problem for differential equation -> solution prob_turing = remake(prob,u0=x0,p=p) # u0 and p should be of same type for autodiff sol = solve(prob_turing,saveat=60.,save_idxs=[2,3,4,6]) # Output data model for i = 1:Nd data[:,i] ~ MvNormal(sol(tdata[i]), V*I) end end # model = fit_system(data,prob,tdata); # model
Observe that I assume the same data uncertainty (variance
V) in all 4 measured variables.
Question 4: Does this make sense? Perhaps… since all measurements are temperatures of similar size. But in general, this is a problem, I guess (unless data are normalized).
Finally, I run the data fitting:
Ni = 100 p_est = sample(model,NUTS(0.65),Ni)
Question 5: I’ve seen
Ni referred to as the number of iterations. Is this the same as the number of outcomes/realizations in the posterior distribution?
I’ve also tried to run the code with multiple chains, either:
Nc = 3 # number of chains p_est = sample(model,NUTS(.65),MCMCThreads(),Ni,Nc)
which seems to run… but forever… so I just have to stop it, or
p_est = sample(model,NUTS(.65),Ni,Nc)
which simply crashes…
Question 6: I assume that when I use multiple chains, this is just running the same sampling for
Nc chains independently… and that the purpose is to see if I get the same posterior distributions for all chains?? Sort of to check that the found posterior distributions are representative of the true distributions?
Since MCMC is based on Monte Carlo methods, is should be possible to run samples in parallel – or – at least, chains in parallel. I assume that is what
MCMCThreads() does. I have issued command
Threads.nthreads() in Jupyter, and the answer is
Question 7: Does this mean that I have
8 cores available in the current session? Or does it mean that my computer has
8 cores in total, and that I need to specify the number of cores I want to use when I start up Julia?
OK – sorry for many questions… when I run a single chain, typical results are that the posterior distributions are suspiciously narrow:
Although my code runs (for a single chain), it is relatively slow (which is a relative concept). And if I re-run the notebook, many times I don’t get results at all.