MATLAB -> Julia, Diffeq cannot solve large system over long time-span

atarias1 · January 22, 2024, 3:12pm

Hello,

I am working on a project to model dehydration of some rocks. I am planning on adapting this model https://github.com/rmskarbek/PorosityWaves to different conditions. However, before moving on to that I wanted to see if I could replicate the results of the original MATLAB code in Julia using DiffEq.

The original MATLAB model uses ode15s as the solver and takes around 1-2 hours solving for a timespan of 120,000 years.

I rewrote the model in Julia to the best of my ability, but I am unable to solve the system in a reasonable amount of time for the original timespan of 0-120,000 years. I have been able to solve it over the timespan of 0-12,000 years fairly rapidly using FBDF. But the moment I increase the tspan to 120,000 years the solver progress slows to a crawl and I haven’t had the patience to wait for it to complete. I have even tried running it over night without it finishing, but ideally it should be faster than the MATLAB implementation.

Some reports using @btime to show how the evaluation time behaves with increasing tspan:

tspan = (0, 120000) → Never Finishes
tspan = (0, 12000) → 25.844 s (484132372 allocations: 23.47 GiB)
tspan = (0, 1200) → 9.151 s (108983605 allocations: 9.39 GiB)
tspan = (0, 120) → 6.795 s (59317085 allocations: 7.22 GiB)
tspan = (0, 12) → 5.266 s (38895461 allocations: 5.89 GiB)

Also, at first I tried using QBDF and QNDF as solvers, but they both suggest it will take ~2e12 hours to complete on the tspan of 0-12,000 years. I am not sure why this is the case.

Here is the code:

using DifferentialEquations
using ProgressLogging, SparseArrays, Statistics, Plots, Polynomials, IfElse, BenchmarkTools

# DeltG of Rxn.
function DeltaG(T, P_t)
	## For Gibbs Calculation 

    a0 = 65945.7007021266
    a1 = -415.847466841169
    a2 = 0.654522687965514
    dT = 58
    a0 = a0 - a1*dT - 2*a2*dT + a2*dT^2    
    a1 = a1-2*a2*dT                        
    g0 = -643.932375104237

    return @. -g0*((1/(2*a2))*(-a1 + sqrt(a1^2-4*a2*(a0-(P_t/1e5))))-(T-273))
end

## Initial Conditions
function initial(phi_i, N, depth, depth_0, rho_s, rho_f) 

    # Plate Interface
    plateInt = [0.2709, -9.8967e3]
    distance_0 = (depth_0 - plateInt[2])/plateInt[1]

    # Geometry 
    betaT = atan(plateInt[1])
    v_plate = 0.037 /  (365*24*3600)
    v_x = v_plate * cos(betaT) # horizontal velocity 
    stressFit = [8693.11664571161, -365056544.708814]

    dsigmadt = stressFit[1] * cos(betaT) * v_plate

    #Temperature
    tempDepth = [200/30000, (400*50000 - 600*20000)/30000]
    dTdz = 500 / 25000 # geotherm
    dTempdt = tempDepth[1] * plateInt[1] * v_x
    const_vars = tempDepth[1] * (plateInt[1] * distance_0 + plateInt[2]) + tempDepth[2] + 273 
    Temp =  dTdz .* depth 
    Temp = Temp .+ const_vars

    # Stress overburden - polynomial coefficients are reverse in julia compared to matlab polyval
    stressPoly = Polynomial([stressFit[2], stressFit[1]])
    overburden = stressPoly(distance_0)

    # porosity 
    porosity = phi_i .* ones(N)

    # Stress in column 
    lithostatic = zeros(N)
    bulkDensity = @. rho_s - (rho_s - rho_f)*porosity
    lithostatic[1] = 9.81 * (depth[1])* bulkDensity[1]
    for i in range(2,N, step = 1)

        lithostatic[i] = 9.81 * (depth[i] - depth[i-1]) * bulkDensity[i] + lithostatic[i - 1]
    end

    hydrostatic = @. 9.81 * rho_f * (depth + depth_0)
    # Total pore pressure
    pressure_0 = (overburden * ones(N)) .+ lithostatic
    # Excess Pore Pressure above hydrostatic (flow pressure)
    pressure_i = pressure_0 .- hydrostatic

    # Calc overpressure gradient
    dp_lithdz = 9.81*(bulkDensity .- rho_f)
    dp_lithdz = mean(dp_lithdz)

    return pressure_i, pressure_0, hydrostatic, Temp, dTempdt, dsigmadt, dp_lithdz
end

## Diff Mats

function ddz(z0, zL, n, v)
    ## This is from MatLab MatMol group (2009) 
    ## Second-order directional differencing for MOL

    dz = (zL-z0)/(n-1)
    r2fdz = 1 / (2*dz)

    # Finite difference for positive v (indicates flow direction)
    if v > 0
    
        D = spdiagm(-2 => ones(n-2), -1 => -4*ones(n-1), 0 => 3 * ones(n))

        ## Boundaries 

        D[1, 1:3] = [-3, 4, -1]
        D[2, 1:3] = [-1, 0, 1]

    end

    if v < 0

        D = spdiagm(0 => -3*ones(n), 1 => 4*ones(n-1), 2 => -ones(n-2))

        ## Boundaries 

        D[n-1, (n-2):n] = [-1, 0, +1]
        D[n, (n-2):n] = [1, -4, 3]

    end

    D = r2fdz*D

    return D
end

function d2dz2(z0, zL, n)
    ## From Matlab MatMol group (2009)

    ## Four point second order approx. of the second derivative
    ## See the MatMol 2009 docs to see the derivation 

    dz = (zL-z0)/(n-1)
    rdzs = 1 / (2*dz)

    # interior
    D = spdiagm(-1 => ones(n-1), 0 => -2*ones(n), 1 => ones(n-1))

    # boundaries 
    D[1, 1:4] = [2, -5, 4, -1]
    D[n, (n-3):n] = [-1, 4, -5, 2]

    D = rdzs*D

    return D
end

## Sparsity Pattern
function sparsity_pattern(N)
    iD = 1:N
    iSup = 1:N-1
    iSupSup = 1:N-2

    # Pressure section
    P_p = sparse(iD, iD, ones(N)) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSup.+1, iSup, ones(N-1), N, N) +
        sparse(iSupSup, iSupSup .+ 2, ones(N-2), N, N)
    P_p[N, N-2] = 1
    P_p[N, N-3] = 1

    P_T = sparse(iD, iD, ones(N))
    P_phi = sparse(iD, iD, ones(N)) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSupSup, iSupSup .+ 2, ones(N-2), N, N)
    P_phi[N-1, N-2] = 1
    P_phi[N, N-1] = 1
    P_phi[N, N-2] = 1
    P_m = P_T

    Jac_P = sparse_hcat(P_p, P_T, P_phi, P_m)

    # Temperature equation
    T_p = sparse(iD, iD, [ones(N-1); 0]) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSup.+1, iSup, [ones(N-2); 0], N, N)
    T_T = T_p
    T_phi = P_phi
    T_m = sparse(iD, iD, [ones(N-1); 0])
    Jac_T = sparse_hcat(T_p, T_T, T_phi, T_m)

    # Porosity section is equal to the pressure section
    Jac_m = sparse_hcat(P_T, P_T, spzeros(N,N), P_T)

    # Combine all the sections
    S = sparse_vcat(Jac_P, Jac_T, Jac_P, Jac_m)
    
    S = convert(SparseMatrixCSC{Bool, Int64}, S)
    return S
end

# Model
function dudt!(du, u, p, t)
	
    # parameters 

    (m_i, M_H2O, A, c0, hydro, n_r, E_a, R, T0, litho, doubleSnake, 
    ρ_f, η, k_i, c_m, μ, n, dsigmadt, P_init_b, dz, dp_lithdz, alpha, 
    K, G, K_f, C_s, K_T, L, dTempdt, ϕ_0, D1, DD2) = p

    # variables
    p_ex  = @view u[:, 1]
    T     = @view u[:, 2]
    ϕ     = @view u[:, 3]
    m_d   = @view u[:, 4]
	
    dpdz  =  D1 * p_ex
    dpdzz = DD2 * p_ex
    dϕdz  =  D1 * ϕ
    dTdzz = DD2 * T

    ### Dehydration
    #dM_d/dt 
    @. du[:,4] .= (1/m_i) * M_H2O * A * c0 * (m_i - m_d) * 
    (IfElse.ifelse(DeltaG(T, p_ex + hydro) < 0, abs(DeltaG(T, p_ex + hydro))^n_r, 0)) *
    exp((-E_a/R)*(1/T - 1/T0))
    
    # last 1000 nodes do not react
    du[end-1000:end, 4] .= 0

    ### Pressure
    p_eff = @. litho + dsigmadt*t - (p_ex + hydro) # Effective Pressure
    dPHIdt = @. (doubleSnake/ρ_f) * du[:,4] - (3/4)*ϕ* p_eff / η 
    #dP/dt
    @. du[begin:end-1000, 1] = (k_i * c_m/μ) * ((ϕ[begin:end-1000] / ϕ_0)^n) * ((n/ϕ[begin:end-1000]) * dϕdz[begin:end-1000] * dpdz[begin:end-1000] + dpdzz[begin:end-1000]) +
        c_m*(du[begin:end-1000,4]/ρ_f - dPHIdt[begin:end-1000]) + dsigmadt  
    
    @. du[end-1000+1:end, 1] =  (k_i * c_m/μ) * ((ϕ[end-1000+1:end] / ϕ_0)^n) * ((n/ϕ[end-1000+1:end]) * dϕdz[end-1000+1:end] * dpdz[end-1000+1:end] + dpdzz[end-1000+1:end]) + dsigmadt  

    # Upper absorbing boundary

    du[1,1] = @. (k_i*c_m/μ) * ((ϕ[1]/ϕ_0)^n) * (n/ϕ[1]) * dϕdz[1] * dpdz[1] + (c_m/η)*(3/4)*ϕ[1]*p_eff[1] + dsigmadt

    # bottom boundary 
    pN = @. P_init_b + dz * dp_lithdz + dsigmadt*t 
    dpdzzN = (p_ex[end-1] - 2*p_ex[end] + pN)/ dz^2
    
    du[end, 1] = @. (k_i*c_m/μ) * ((ϕ[end]/ ϕ_0)^n) * ((n/ϕ[end]) * dϕdz[end] * dpdz[end] + dpdzzN) + 
    c_m*(du[end,4]/ ρ_f - dPHIdt[end]) + dsigmadt

    ## Porosity
    #dϕ/dt
    @.du[1:end-1000,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    (1/c_m - ϕ[1:end-1000]/K_f)*du[1:end-1000,1] +
    dPHIdt[1:end-1000]

    @. du[end-1000+1:end,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    (1/c_m - ϕ[end-1000+1:end]/K_f)*du[end-1000+1:end,1]

    ## Temperature

    # conduction only

    @. du[:,2] .= (1/C_s) * (K_T * dTdzz - L*du[:, 4])

    # Upper boundary
    du[1,2] = -(L/C_s) * du[1,4]

    # Lower boundary constant
    du[end,2] = dTempdt 

    #done
	nothing

end

### Parameters
m_i = 320         # Initial bound water
C_s = 2.5e6       # volumetric heat capacity  
k_i = 1e-20       # background permeability
K_f = 2.9e9       # fluid bulk modulus
K_T = 2.25        # thermal conductivity
n   = 3           # k-ϕ powerlaw exponent
v_p = 7000        # p-wave velocity
v_s = v_p / 2.35  # s-wave velocity
α   = 0.2         # Biot parameter
η   = 1e15        # bulk viscosity
μ   = 2e-4        # fluid viscosity
ν   = 0.3894      # Poisson's ratio
ϕ_i = 0.02        # initial porosity
ϕ_0 = 0.015       # reference porosity 
ρ_f = 1e3         # fluid density
ρ_s = 3e3         # solid density

## Computed Parameters
ρ_b = (1-ϕ_i)*ρ_s + ϕ_i*ρ_f   # Bulk Density
        
G = ρ_b * v_s^2               # Shear modulus

K_u = (2*(1+ν)*G)/(3*(1-2*ν)) # Undrained bulk modulus

# Drained bulk modulus
K = (1/(2*ϕ_i))*((K_f+K_u)*ϕ_i - α*(1+ϕ_i)*K_f + 
(4*(α-1)*(ϕ_i-α)*ϕ_i*K_f*K_u + 
(ϕ_i*(K_f+K_u)-α*(1+ϕ_i)*K_f)^2)^(1/2)) 

# Elastic Parameter
c_m = ((K_u - K)*(K + 4*G/3))/(α^2 * (K_u + 4*G/3))

## Dehydration Parameters
A     = 201             # Specific surface area of RLM
c0    = 1.29e-20        # Ref. reaction constant
E_a   = 2e4             # Activation energy
L     = 500e3           # latent heat of Rxn.
M_H2O = 0.018           # H2O molar mass
n_r   = 3.64            # Rxn. order 
R     = 1.987           # gas constant
T0    = 633             # Reference Temperature
doubleSnake = 0.1       # Plastic compaction parameter

### Discritization
N = 10000 # dehydrating nodes
NE = Int(0.1 * N) # Non dehydrating nodes
dz = 1000/N  # distance between nodes in meters
N = 2*N + NE # Number of nodes in the column
depth = collect(dz:dz:dz*N)

### Set up problem
# Initial conditions
P_i, litho, hydro, T_i, dTdt, dsigmadt, dP_lithdz = initial(ϕ_i, N, depth, 28.345e3, ρ_s, ρ_f)
Phi_i = ϕ_i * ones(N)
M_di = zeros(N)

u0 = cat(P_i, T_i, Phi_i, M_di; dims = 2)
v = -1 # upwards flow
D1  = ddz(depth[1], depth[N], N, v)
DD2 = d2dz2(depth[1], depth[N], N)
# Parameters
p = (m_i, M_H2O, A, c0, hydro, n_r, E_a, R, 
    T0, litho, doubleSnake, ρ_f, η, k_i, c_m, 
    μ, n, dsigmadt, P_i[end], dz, dP_lithdz, α, K, 
    G, K_f, C_s, K_T, L, dTdt, ϕ_0, D1, DD2)
# Time -- need to be able to run for 120000 years
tspan = (0.0, 31536000 * 120000) 
# Build sparsity pattern
jac_sparsity = sparsity_pattern(N)
f = ODEFunction(dudt! ; jac_prototype = float.(jac_sparsity))
# Make problem
dehydrate_prob = ODEProblem(f, u0, tspan, p)
sol = solve(dehydrate_prob, FBDF(), saveat = tspan[2]/2000, save_everystep = false, abstol = 1e-12, reltol = 1e-10, progress = true, maxiters = Int(1e6))

I would appreciate any help on how I can make the larger time spans possible to solve for. I figure if I cannot make the original model solve over these timespans, then the model I will develop after this will be likely worse.

Thanks!

mike.ingold · January 22, 2024, 4:47pm

Just based on a quick look, here’s my first impressions:

Based on those @btime results, that’s a huge number of memory allocations. That tells me that probably the number one thing you can do to address performance is work on reducing that number. Memory access is fast, but memory management is glacially-slow relative to actual computation.
Looks like your dudt! function already makes some use of @view and broadcasting.
You may have some minor issues related to the data types of certain vectors defined in your code. In Matlab, all numbers are treated as Float64 by default, so there’s no functional difference between something like 100 and 100.0 . But in Julia these are Int vs Float64, which can have performance implications. Let’s say you define a vector like [100, 202.4]. Since you don’t specify a type, Julia will allocate this as a Vector{Real}, a vector of abstract type that holds pointer references to the location of the actual contained values elsewhere in memory, vs a Vector{Float64}, a vector of concrete type that holds the actual values in a block of memory.

Edit: I stand corrected on that example. Vector types probably aren’t an issue in this case, then, though the general point should stand that for some construction like [a, b] Julia will need to infer a common type for a and b, and there are unfortunate cases out there where you can unexpectedly wind up with a Vector{Any}, which has performance implications.

Oscar_Smith · January 22, 2024, 5:12pm

Let’s say you define a vector like [100, 202.4]. Since you don’t specify a type, Julia will allocate this as a Vector{Real}

This is incorrect. The array constructor promotes its inputs.

[100,202.4]
2-element Vector{Float64}:
 100.0
 202.4

atarias1 · January 22, 2024, 5:25pm

Hi thanks for the reply.

I realized I had a bit of a silly bug in two parts of my model:

@. du[:,4] .= (1/m_i) * M_H2O * A * c0 * (m_i - m_d) * 
    (IfElse.ifelse(DeltaG(T, p_ex + hydro) < 0, abs(DeltaG(T, p_ex + hydro))^n_r, 0)) *
    exp((-E_a/R)*(1/T - 1/T0))


@. du[:,2] .= (1/C_s) * (K_T * dTdzz - L*du[:, 4])

I think since I put @. at the beginning of the line and before =, it prevented broadcasting (Not exactly sure if that is correct but seems like it).
This gives:
@btime for tspan = (0, 12000) → 21.250 s (1488246 allocations: 16.25 GiB).

I will see if this solves the problem for the longer timespan. And continue to try to reduce allocations.

mike.ingold · January 22, 2024, 5:27pm

I didn’t have a terminal handy earlier but just checked. I stand corrected. Prior post edited. I can’t seem to generate an abstractly-typed vector of numbers in v1.10 without some contrivances. So, as long as you don’t do something bizarre like [100, "a string"] you’re probably fine there.

PeterSimon · January 22, 2024, 5:46pm

The du[:, 4] on the RHS allocates. Try

@. du[:,2] = (1/C_s) * (K_T * dTdzz - L*@view du[:, 4])

mike.ingold · January 22, 2024, 5:58pm

atarias1:

    ## Porosity
    #dϕ/dt
    @.du[1:end-1000,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    (1/c_m - ϕ[1:end-1000]/K_f)*du[1:end-1000,1] +
    dPHIdt[1:end-1000]

    @. du[end-1000+1:end,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    (1/c_m - ϕ[end-1000+1:end]/K_f)*du[end-1000+1:end,1]

These could use an application of @views to prevent allocations on the RHS also.

tshort · January 22, 2024, 5:58pm

The following statements also allocate. You could try mul! to pre-allocated buffers.

    dpdz  =  D1 * p_ex
    dpdzz = DD2 * p_ex
    dϕdz  =  D1 * ϕ
    dTdzz = DD2 * T

The calls to DeltaG also allocate.

Your vectorized assignments are complicated enough, that I’d turn them into loops. I think that makes them easier to follow.

Profiling is another way to identify problem expressions.

baggepinnen · January 22, 2024, 6:14pm

julia> Real[1, 1.0]
2-element Vector{Real}:
 1
 1.0

mike.ingold · January 22, 2024, 6:23pm

That’s kind of what I meant by contrivances. You can definitely manually-specify the array type, but I was looking more for ways this could come about unintentionally for a casual user.

JonasWickman · January 22, 2024, 7:22pm

You may also try your luck with the CVODE_BDF() solver from the Sundials.jl package, it’s quite similar to ode15s in Matlab.

ChrisRackauckas · January 22, 2024, 9:02pm

Not really. QNDF is closer to ode15s. FBDF is closer to CVODE_BDF

atarias1 · January 22, 2024, 9:37pm

Unfortunately, I was testing earlier last week with CVODE_BDF() and it had the same issue of becoming infinitely slow whenever I tried a longer timespan. Actually, as I am testing it now CVODE_BDF() will not solve, and reports the system is unstable. However, the result from FBDF() is producing expected behavior at the smaller time scales with a porosity wave train propagating as expected.

I’ve ended up rewriting the code with for loops to try to speed it up.

Here is the code now:

using DifferentialEquations, LinearAlgebra
using ProgressLogging, SparseArrays, Statistics, Plots, Polynomials, IfElse, BenchmarkTools
# DeltG of Rxn.

## Initial Conditions
function initial(phi_i, N, depth, depth_0, rho_s, rho_f) 

    # Plate Interface
    plateInt = [0.2709, -9.8967e3]
    distance_0 = (depth_0 - plateInt[2])/plateInt[1]

    # Geomet. Parameters
    betaT = atan(plateInt[1])
    v_plate = 0.037 /  (365*24*3600)
    v_x = v_plate * cos(betaT) # horizontal velocity 
    stressFit = [8693.11664571161, -365056544.708814]

    dsigmadt = stressFit[1] * cos(betaT) * v_plate

    #Temperature
    tempDepth = [200/30000, (400*50000 - 600*20000)/30000]
    dTdz = 500 / 25000 # geotherm
    dTempdt = tempDepth[1] * plateInt[1] * v_x
    const_vars = tempDepth[1] * (plateInt[1] * distance_0 + plateInt[2]) + tempDepth[2] + 273 
    Temp =  dTdz .* depth 
    Temp = Temp .+ const_vars

    # Stress overburden - polynomial coefficients are reverse in julia compared to matlab polyval
    stressPoly = Polynomial([stressFit[2], stressFit[1]])
    overburden = stressPoly(distance_0)

    # porosity 
    porosity = phi_i .* ones(N)

    # Stress in column 
    lithostatic = zeros(N)
    bulkDensity = @. rho_s - (rho_s - rho_f)*porosity
    lithostatic[1] = 9.81 * (depth[1])* bulkDensity[1]
    for i in range(2,N, step = 1)

        lithostatic[i] = 9.81 * (depth[i] - depth[i-1]) * bulkDensity[i] + lithostatic[i - 1]
    end

    hydrostatic = @. 9.81 * rho_f * (depth + depth_0)
    # Total pore pressure
    pressure_0 = (overburden * ones(N)) .+ lithostatic
    # Excess Pore Pressure above hydrostatic (flow pressure)
    pressure_i = pressure_0 .- hydrostatic

    # Calc overpressure gradient
    dp_lithdz = 9.81*(bulkDensity .- rho_f)
    dp_lithdz = mean(dp_lithdz)

    return pressure_i, pressure_0, hydrostatic, Temp, dTempdt, dsigmadt, dp_lithdz
end

## Diff Mats

function ddz(z0, zL, n, v)
    ## This is from MatLab MatMol group (2009) 
    ## Second-order directional differencing for MOL

    dz = (zL-z0)/(n-1)
    r2fdz = 1 / (2*dz)

    # Finite difference for positive v (indicates flow direction)
    if v > 0
    
        D = spdiagm(-2 => ones(n-2), -1 => -4*ones(n-1), 0 => 3 * ones(n))

        ## Boundaries 

        D[1, 1:3] = [-3, 4, -1]
        D[2, 1:3] = [-1, 0, 1]

    end

    if v < 0

        D = spdiagm(0 => -3*ones(n), 1 => 4*ones(n-1), 2 => -ones(n-2))

        ## Boundaries 

        D[n-1, (n-2):n] = [-1, 0, +1]
        D[n, (n-2):n] = [1, -4, 3]

    end

    D = r2fdz*D

    return D
end

function d2dz2(z0, zL, n)
    ## From Matlab MatMol group (2009)

    ## Four point second order approx. of the second derivative
    ## See the MatMol 2009 docs to see the derivation 

    dz = (zL-z0)/(n-1)
    rdzs = 1 / (2*dz)

    # interior
    D = spdiagm(-1 => ones(n-1), 0 => -2*ones(n), 1 => ones(n-1))

    # boundaries 
    D[1, 1:4] = [2, -5, 4, -1]
    D[n, (n-3):n] = [-1, 4, -5, 2]

    D = rdzs*D

    return D
end

## Sparsity Pattern
function sparsity_pattern(N)
    iD = 1:N
    iSup = 1:N-1
    iSupSup = 1:N-2

    # Pressure section
    P_p = sparse(iD, iD, ones(N)) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSup.+1, iSup, ones(N-1), N, N) +
        sparse(iSupSup, iSupSup .+ 2, ones(N-2), N, N)
    P_p[N, N-2] = 1
    P_p[N, N-3] = 1

    P_T = sparse(iD, iD, ones(N))
    P_phi = sparse(iD, iD, ones(N)) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSupSup, iSupSup .+ 2, ones(N-2), N, N)
    P_phi[N-1, N-2] = 1
    P_phi[N, N-1] = 1
    P_phi[N, N-2] = 1
    P_m = P_T

    Jac_P = sparse_hcat(P_p, P_T, P_phi, P_m)

    # Temperature equation
    T_p = sparse(iD, iD, [ones(N-1); 0]) + sparse(iSup, iSup .+ 1, ones(N-1), N, N) +
        sparse(iSup.+1, iSup, [ones(N-2); 0], N, N)
    T_T = T_p
    T_phi = P_phi
    T_m = sparse(iD, iD, [ones(N-1); 0])
    Jac_T = sparse_hcat(T_p, T_T, T_phi, T_m)

    # Porosity section is equal to the pressure section
    Jac_m = sparse_hcat(P_T, P_T, spzeros(N,N), P_T)

    # Combine all the sections
    S = sparse_vcat(Jac_P, Jac_T, Jac_P, Jac_m)
    
    S = convert(SparseMatrixCSC{Bool, Int64}, S)
    return S
end

function dudt!(du, u, p, t)
	
    # parameters 

    (m_i, M_H2O, A, c0, hydro, n_r, 
    E_a, R, T0, litho, doubleSnake, 
    ρ_f, η, k_i, c_m, μ, n, dsigmadt, 
    P_init_b, dz, dp_lithdz, alpha, 
    K, G, K_f, C_s, K_T, L, dTempdt, 
    ϕ_0, D1, DD2, dpdz, dpdzz, dϕdz, 
    dTdzz, N, p_eff, dPHIdt,
    a0, a1, a2, g0) = p

    # variables
    p_ex  = @view u[:, 1]
    T     = @view u[:, 2]
    ϕ     = @view u[:, 3]
    m_d   = @view u[:, 4]
	
    mul!(dpdz, D1, p_ex)
    mul!(dpdzz, DD2, p_ex)
    mul!(dϕdz, D1, ϕ)
    mul!(dTdzz, DD2, T)

    ### Dehydration
    #dM_d/dt 
    # @timeit to "M_D" @. du[:,4] =  (1/m_i) * M_H2O * A * c0 * (m_i - m_d) * 
    # (IfElse.ifelse(DeltaG(T, p_ex + hydro) < 0, abs(DeltaG(T, p_ex + hydro))^n_r, 0)) *
    # exp((-E_a/R)*(1/T - 1/T0))
    
    @inbounds for i in 1:20000

        DeltaG = -g0*((1/(2*a2))*(-a1 + sqrt(a1^2-4*a2*(a0-((p_ex[i] + hydro[i])/1e5))))-(T[i]-273))

        du[i,4] = (1/m_i) * M_H2O * A * c0 * (m_i - m_d[i]) * 
        (IfElse.ifelse(DeltaG < 0, abs(DeltaG)^n_r, 0)) *
        exp((-E_a/R)*(1/T[i] - 1/T0))
    end
    
    @inbounds for i in 20001:N
    # last 1000 nodes do not react
        du[i, 4] = 0
    end

    ### Pressure
    # @timeit to "p_eff" p_eff = @. litho + dsigmadt*t - (p_ex + hydro) # Effective Pressure
    # @timeit to "dPhi" dPHIdt = @. (doubleSnake/ρ_f) * du[:,4] - (3/4)*ϕ* p_eff / η 

    #dP/dt
    # @timeit to "dPdt" @. du[begin:end-1000, 1] = (k_i * c_m/μ) * ((ϕ[begin:end-1000] / ϕ_0)^n) * ((n/ϕ[begin:end-1000]) * dϕdz[begin:end-1000] * dpdz[begin:end-1000] + dpdzz[begin:end-1000]) +
    #     c_m*(du[begin:end-1000,4]/ρ_f - dPHIdt[begin:end-1000]) + dsigmadt  
    
    #@timeit to "dPdt1000" @. du[end-1000+1:end, 1] =  (k_i * c_m/μ) * ((ϕ[end-1000+1:end] / ϕ_0)^n) * ((n/ϕ[end-1000+1:end]) * dϕdz[end-1000+1:end] * dpdz[end-1000+1:end] + dpdzz[end-1000+1:end]) + dsigmadt  

    # Upper absorbing boundary

    @inbounds for i in 1:20000
        p_eff[i] = litho[i] + dsigmadt*t - (p_ex[i] + hydro[i]) # Effective Pressure
        dPHIdt[i] = (doubleSnake/ρ_f) * du[i,4] - (3/4)*ϕ[i]* p_eff[i] / η 
        du[i,1] = (k_i * c_m/μ) * ((ϕ[i] / ϕ_0)^n) * ((n/ϕ[i]) * dϕdz[i] * dpdz[i] + dpdzz[i]) +
        c_m*(du[i,4]/ρ_f - dPHIdt[i]) + dsigmadt
        
    end

    @inbounds for i in 20001:N
        p_eff[i] = litho[i] + dsigmadt*t - (p_ex[i] + hydro[i]) # Effective Pressure
        dPHIdt[i] = (doubleSnake/ρ_f) * du[i,4] - (3/4)*ϕ[i]* p_eff[i] / η 
        du[i, 1] =  (k_i * c_m/μ) * ((ϕ[i] / ϕ_0)^n) * ((n/ϕ[i]) * dϕdz[i] * dpdz[i] + dpdzz[i]) + dsigmadt 
    end

    du[1,1] = (k_i*c_m/μ) * ((ϕ[1]/ϕ_0)^n) * (n/ϕ[1]) * dϕdz[1] * dpdz[1] + (c_m/η)*(3/4)*ϕ[1]*p_eff[1] + dsigmadt

    # bottom boundary 
    pN = P_init_b + dz * dp_lithdz + dsigmadt*t 
    dpdzzN = (p_ex[end-1] - 2*p_ex[end] + pN)/ dz^2
    
    du[end, 1] = (k_i*c_m/μ) * ((ϕ[end]/ ϕ_0)^n) * ((n/ϕ[end]) * dϕdz[end] * dpdz[end] + dpdzzN) + 
        c_m*(du[end,4]/ ρ_f - dPHIdt[end]) + dsigmadt

    ## Porosity
    #dϕ/dt
    # @timeit to "phi" @. du[1:end-1000,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    # (1/c_m - ϕ[1:end-1000]/K_f)* du[1:end-1000,1] +
    # dPHIdt[1:end-1000]

    # @timeit to "phi1000" @. du[end-1000+1:end,3] = -(alpha/(K+4*G/3))*dsigmadt + 
    # (1/c_m - ϕ[end-1000+1:end]/K_f)* du[end-1000+1:end,1]

    @inbounds for i in 1:20000
        du[i,3] = -(alpha/(K+4*G/3))*dsigmadt + 
        (1/c_m - ϕ[i]/K_f)* du[i,1] +
        dPHIdt[i]
    end

    @inbounds for i in 20001:N
        du[i,3] = -(alpha/(K+4*G/3))*dsigmadt + 
        (1/c_m - ϕ[i]/K_f)* du[i,1]
    end


    ## Temperature

    # conduction only
    #@timeit to "T" @. du[:,2] = (1/C_s) * (K_T * dTdzz - L* @view du[:, 4])
    @inbounds for i in 1:N
        du[i,2] = (1/C_s) * (K_T * dTdzz[i] - L* du[i, 4])
    end

    # Upper boundary
    du[1,2] = -(L/C_s) * du[1,4]

    # Lower boundary constant
    du[end,2] = dTempdt 

    #done
    # print_timer()
	nothing
    
end

### Parameters
m_i = 320         # Initial bound water
C_s = 2.5e6       # volumetric heat capacity  
k_i = 1e-20       # background permeability
K_f = 2.9e9       # fluid bulk modulus
K_T = 2.25        # thermal conductivity
n   = 3           # k-ϕ powerlaw exponent
v_p = 7000        # p-wave velocity
v_s = v_p / 2.35  # s-wave velocity
α   = 0.2         # Biot parameter
η   = 1e15        # bulk viscosity
μ   = 2e-4        # fluid viscosity
ν   = 0.3894      # Poisson's ratio
ϕ_i = 0.02        # initial porosity
ϕ_0 = 0.015       # reference porosity 
ρ_f = 1e3         # fluid density
ρ_s = 3e3         # solid density

# Gibbs Terms
a0 = 65945.7007021266
a1 = -415.847466841169
a2 = 0.654522687965514
dT = 58
a0 = a0 - a1*dT - 2*a2*dT + a2*dT^2    
a1 = a1-2*a2*dT                        
g0 = -643.932375104237

## Computed Parameters
ρ_b = (1-ϕ_i)*ρ_s + ϕ_i*ρ_f   # Bulk Density
        
G = ρ_b * v_s^2               # Shear modulus

K_u = (2*(1+ν)*G)/(3*(1-2*ν)) # Undrained bulk modulus

# Drained bulk modulus
K = (1/(2*ϕ_i))*((K_f+K_u)*ϕ_i - α*(1+ϕ_i)*K_f + 
(4*(α-1)*(ϕ_i-α)*ϕ_i*K_f*K_u + 
(ϕ_i*(K_f+K_u)-α*(1+ϕ_i)*K_f)^2)^(1/2)) 

# Elastic Parameter
c_m = ((K_u - K)*(K + 4*G/3))/(α^2 * (K_u + 4*G/3))

## Dehydration Parameters
A     = 201             # Specific surface area of RLM
c0    = 1.29e-20        # Ref. reaction constant
E_a   = 2e4             # Activation energy
L     = 500e3           # latent heat of Rxn.
M_H2O = 0.018           # H2O molar mass
n_r   = 3.64            # Rxn. order 
R     = 1.987           # gas constant
T0    = 633             # Reference Temperature
doubleSnake = 0.1       # Plastic compaction parameter

### Discritization
N = 10000 # dehydrating nodes
NE = Int(0.1 * N) # Non dehydrating nodes
dz = 1000/N  # distance between nodes in meters
N = 2*N + NE # Number of nodes in the column
depth = collect(dz:dz:dz*N)

### Set up problem
# Initial conditions
P_i, litho, hydro, T_i, dTdt, dsigmadt, dP_lithdz = initial(ϕ_i, N, depth, 28.345e3, ρ_s, ρ_f)
Phi_i = ϕ_i * ones(N)
M_di = zeros(N)

u0 = cat(P_i, T_i, Phi_i, M_di; dims = 2)
v = -1 # upwards flow
D1  = ddz(depth[1], depth[N], N, v)
DD2 = d2dz2(depth[1], depth[N], N)

dpdz = similar(P_i)
dpdzz = similar(P_i)
dϕdz = similar(Phi_i)
dTdzz = similar(T_i) 

p_eff = similar(P_i)
dPHIdt= similar(P_i)
# Parameters
p = (m_i, M_H2O, A, c0, hydro, n_r, E_a, R, 
    T0, litho, doubleSnake, ρ_f, η, k_i, c_m, 
    μ, n, dsigmadt, P_i[end], dz, dP_lithdz, α, K, 
    G, K_f, C_s, K_T, L, dTdt, ϕ_0, D1, DD2,
    dpdz, dpdzz, dϕdz, dTdzz, N, p_eff, dPHIdt,
    a0, a1, a2, g0);
# Time -- need to be able to run for 120000 years
tspan = (0.0, 31536000 * 12000) 
# Build sparsity pattern
jac_sparsity = sparsity_pattern(N)
f = ODEFunction(dudt! ; jac_prototype = float.(jac_sparsity))
# Make problem
dehydrate_prob = ODEProblem(f, u0, tspan, p)
sol = solve(dehydrate_prob, FBDF(autodiff = false), saveat = tspan[2]/2000, save_everystep = false, abstol = 1e-12, reltol = 1e-10, progress = true, maxiters = Int(1e6))

Here @btime for tspan (0,12000) years: 15.616 s (1318071 allocations: 5.25 GiB)

But, still once I up the tspan to 120000 years, the solution progress seems to slow to a stop. Actually, it seems to be worse now. Before the progress would slow to a stop around 40% complete, and now it slows to a stop around 20%.

I will keep looking at ways to speed it up, but I am not sure why the speed is dropping off more now during the solve.

Oscar_Smith · January 22, 2024, 11:42pm

Are you sure that you didn’t introduce a bug porting the model from matlab to Julia? The easiest way to check is to take random inputs and feed them to the matlab and Julia functions (without the integrator) and making sure the results are the same.

foobar_lv2 · January 23, 2024, 12:15am

Have you tried using a worse tolerance, just to see what happens?

Have you tried looking at the time-steps? Your observations smell like your solution / the timestepping method walk into a regime where time-steps are very small, and you can then wait forever.

So you will need to look into that: Maybe the time-step control fails? That would be an easy fix.

Or maybe your equation just becomes bad in these parts of phase-space? Then it would be good to get the state in the bad parts; and plot the state, as well as the error estimates that force large time-steps. Maybe you can use your domain-knowledge to visually figure out what goes wrong, and maybe fix it?

(oh damn, if these parts of the front meet, of course a shock will develop and gradients will explode with my tiny viscosity. Bad model / discretization, do something else or YOLO disable adaptive time-steps)

atarias1 · January 23, 2024, 9:42am

Yeah you’re correct. I quickly looked at the results again this morning and compared it with the MATLAB output for the same time series and found different behavior. The MATLAB output is smoother and only produces a single wave at 12,000 years.

I will work on it today, and check if I ported everything correctly and then look at the discretization and solver again.

isaacsas · January 23, 2024, 12:32pm

Once your dudt! function is giving equivalent results in Matlab and Julia, you can @btime just calling that function with random inputs to confirm it isn’t allocating any memory. (If it is then you should keep revising it until it doesn’t allocate.) Only at that point would I worry about the performance comparison between Matlab and Julia for the actual solve.

gabo-di · January 23, 2024, 1:30pm

I just checked the old wrong code, @btime dudt!(...) reports 192 bytes and 4 allocations. And Profile.Allocs.@profile reports that inside dudt! the allocations happens calling ijl_gc_pool_alloc.
So if the code wouldn’t be wrong, dudt! would be already well enough optimized?

atarias1 · January 23, 2024, 2:34pm

Got it… I realized I wasn’t discretizing the Laplacian properly.

Here in the d2dz2 function:

rdzs = 1 / (2*dz)

Should be

rdzs = 1/ (dz^2)

This explains the weird behavior and instability. Silly mistake.
Now the results are the same as MATLAB.

I will compare runtimes for both Julia and MATLAB over night.
Currently, it seems the Julia code runs in around 30 min compared to the ~hour for MATLAB.
I’ll mark it as solved. Thanks to everyone for the help, this has been a learning experience.
Enjoy this nice .gif of the result (tspan~60000 yrs to fit the upload size requirement)!
anim

ChrisRackauckas · January 23, 2024, 2:49pm

Have you done using MKL? That’s pretty important when using UMFPACK for performance.

Also, this is a case where you may want to try Jacobian-free Newton Krylov approaches. See Solving Large Stiff Equations · DifferentialEquations.jl for details.

Topic		Replies	Views
DifferentialEquations -- what goes wrong? General Usage	14	2432	November 13, 2019
Help me outperform Matlab in numerical solution of semidiscretized PDE system as much as possible Modelling & Simulations performance , pde , modelingtoolkit , differentialequation	26	3194	March 31, 2022
JuliaDiffEq Vern9() solver taking too long General Usage question	18	2045	August 27, 2019
Difficulties with DifferentialEquations package Modelling & Simulations	13	890	December 2, 2018
6 Months of DifferentialEquations.jl: Where We Are and Where We Are Going Community announcement , diffeq	17	3333	September 26, 2017

MATLAB -> Julia, Diffeq cannot solve large system over long time-span

Related topics