I use Turing for MCMC on a simple DAE, and the computation time is on the order of 4.6 hours, with apparently extreme memory allocation.

My dynamic model has 3 differential equations and 3 linear algebraic equations – with real experimental data (i.e., I don’t know how good the model is – it seems to be decent). I measure 2 of the 3 differential variables, and 2 of the 3 algebraic variables.

By eliminating the 3 algebraic equations and implementing the model using `DifferentialEquations`, the solution is found approximately twice as fast as when implementing the original DAE using `ModelingToolkit` – possibly because MTK pushes 2 of the AEs to `observed` but keeps one AE, and hence uses a solver with a mass matrix.

I then construct a Turing model and attempt to solve the MCMC problem. I use `1_000` burn-in iterations, followed by `2_000` iterations from the posterior distribution, and the results appear to be OK.

*HOWEVER*, it takes some 4.6 *hours* to find the solution with 3 chains on 3 threads. Using the `@time` macro, the result is on the order of 193 G allocations: 12 TiB, 63% gc time…

If I run the identical code in a single cell of a Jupyter notebook in VS Code, the allocation is similar, but the run seems to take twice as long (I restarted the computer between these runs).
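To localize the cost before committing to the full run, a much shorter chain can be timed the same way. This is a minimal sketch, assuming the model constructor defined later in this post and a NUTS sampler (the sampler choice is an assumption, not taken from the post):

```julia
# Hedged sketch: a short single-threaded run to estimate per-iteration cost
# and allocations before launching the full 3 chains x 3_000 iterations.
# `fit_thermal`, `y_data`, `prob`, and `t_data` are as defined in this post.
model = fit_thermal(y_data, prob, t_data)
@time chain = sample(model, NUTS(), 100)   # sampler choice is an assumption
```

If the allocations per iteration are roughly constant, this gives a quick estimate of where the 12 TiB comes from without waiting 4.6 hours.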

**Question**: any thoughts on why the allocation is so horribly big?

- I have checked the `eltype` of the simulation results (`DifferentialEquations`), and they are `Float64`.
- I have checked the `eltype` of what is computed in the `Turing` model, and these seem to be either:
  - `Float64`, or
  - `ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 7}`

The Turing model looks as follows:

```julia
@model function fit_thermal(y_data, prob, t_data)
    # Measurement variance priors
    V_Ts_d ~ d_V_Ts_d
    V_TFe_d ~ d_V_TFe_d
    V_Tac_d ~ d_V_Tac_d
    V_Tah_d ~ d_V_Tah_d
    #
    # Initial differential variable priors
    Tr0 ~ d_Tr
    Ts0 ~ d_Ts
    TFe0 ~ d_TFe
    #
    # Heat transfer priors
    UAr2d ~ d_UAr2d
    UAs2Fe ~ d_UAs2Fe
    UAFe2a ~ d_UAFe2a
    UAx ~ d_UAx
    #
    # Heat source priors
    QdFes ~ d_QdFes
    Qdfs ~ d_Qdfs
    u0_rand = [Tr0, Ts0, TFe0]
    p_rand = [UAr2d, UAs2Fe, UAFe2a, UAx, QdFes, Qdfs]
    #
    prob_turing = remake(prob, u0=u0_rand, p=p_rand)
    sol = solve(prob_turing)
    # Output data model
    Nd = length(t_data)
    for i = 1:Nd
        Ts = sol(t_data[i])[2]
        TFe = sol(t_data[i])[3]
        Tah, Tac = thermal_genODE_outputs([sol(t_data[i])], p_rand, t_data[i])
        y_data[:, i] ~ MvNormal(vec([Ts TFe Tac Tah]), [V_Ts_d, V_TFe_d, V_Tac_d, V_Tah_d])
    end
end
```
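Independently of the allocation question, two patterns in the likelihood loop are worth flagging: `sol(t_data[i])` is interpolated three times per data point, and `solve(prob_turing)` saves a dense solution that is never needed between measurement times. A hedged sketch of the usual alternative follows – this is a fragment for the body of the model, and the solver choice and `saveat` usage are assumptions, not taken from the post (for the mass-matrix DAE form, a solver such as `Rodas5` would be needed instead of `Tsit5`):

```julia
# Hedged sketch (model-body fragment): solve once, saving only at the
# measurement times, and reuse each saved state instead of interpolating
# three times per data point.
sol = solve(prob_turing, Tsit5(); saveat = t_data)   # solver is an assumption
for i in 1:length(t_data)
    u = sol.u[i]                 # state at t_data[i], no interpolation
    Ts, TFe = u[2], u[3]
    Tah, Tac = thermal_genODE_outputs([u], p_rand, t_data[i])
    y_data[:, i] ~ MvNormal(vec([Ts TFe Tac Tah]),
                            [V_Ts_d, V_TFe_d, V_Tac_d, V_Tah_d])
end
```

Fixing the solver explicitly also avoids the default-algorithm dispatch inside every model evaluation.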

Here, the prior distributions have been defined *outside* of the function, i.e., in the global scope. If I instead declare them inside the Turing model (as done in the Turing examples), they live in the local scope (an advantage?), but have to be re-created every time the Turing model function is called.
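One common workaround – a sketch, not tested against the full model – is to pass the priors into the model as a single argument, e.g. a `NamedTuple`. They are then local and concretely typed (non-`const` globals are untyped, which itself causes allocations), while still being constructed only once at the call site:

```julia
# Hedged sketch: bundle the priors into a NamedTuple and pass it in.
# Only two priors shown; the remaining ones follow the same pattern.
@model function fit_thermal(y_data, prob, t_data, priors)
    V_Ts_d ~ priors.V_Ts_d
    Tr0 ~ priors.Tr
    # ... remaining priors, remake, solve, and the likelihood loop as before ...
end

priors = (V_Ts_d = d_V_Ts_d, Tr = d_Tr)   # built once, in global scope
model = fit_thermal(y_data, prob, t_data, priors)
```

Alternatively, declaring the global distributions `const` gives the compiler their concrete types without changing the model signature.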

**Question**: Is there a syntax for re-using the memory location for some of the quantities within the Turing model?

- I could pass work arrays, etc., as input arguments to the function and modify them in place – at least for arrays of fixed size.
- I don’t know what the Turing macro does “behind the scenes”, so it is not clear to me how I can re-use memory without allocating anew each time the Turing model function is called.
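One caveat with pre-allocated buffers: a `Float64` work array cannot be reused during gradient-based sampling, because under ForwardDiff the sampled parameters are `Dual` numbers (as the `eltype` check above already shows), and writing a `Dual` into a `Float64` array errors. A minimal sketch of the usual pattern – any buffer built inside the model must take its element type from the sampled quantities (the stand-in values here are hypothetical):

```julia
# Hedged sketch: build buffers from the element type of the sampled
# parameters, so the same code works for Float64 and for Dual numbers.
Tr0, Ts0, TFe0 = 1.0, 2.0, 3.0          # stand-ins; Duals during AD
T = promote_type(typeof(Tr0), typeof(Ts0), typeof(TFe0))
u0_rand = Vector{T}(undef, 3)           # Float64 here, a Dual type under AD
u0_rand .= (Tr0, Ts0, TFe0)
```

This still allocates once per model call, but it allocates the *right* type; a fixed `Float64` scratch array passed in from outside would only work for non-gradient samplers.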