Performance of derivatives of Ensemble problem

Hi everyone,

I’m working on simulating trajectories of a Geometric Brownian Motion (GBM) and calculating sensitivities using automatic differentiation. I’m comparing two approaches:

  1. Custom GBM implementation where I directly simulate the solution.
  2. Using DifferentialEquations.jl to solve the Stochastic Differential Equation (SDE) form of GBM, leveraging its built-in methods.

While I expected the second approach to be slower, the performance difference is significant. I was wondering if there’s a way to optimize the performance when using DifferentialEquations.jl.

Here’s my code where I compare the two approaches and calculate gradients using ForwardDiff.jl:

using BenchmarkTools
using ForwardDiff
using DifferentialEquations

# Function to compute value and derivatives using ForwardDiff
function value_and_derivatives(f, x; second_order=false)
    if second_order
        # Compute the value, gradient, and Hessian
        hess_result = DiffResults.HessianResult(x)
        ForwardDiff.hessian!(hess_result, f, x)

        val  = DiffResults.value(hess_result)
        grad = DiffResults.gradient(hess_result)
        hess = DiffResults.hessian(hess_result)

        return val, grad, hess
    else
        # Compute the value and gradient only
        grad_result = DiffResults.GradientResult(x)
        ForwardDiff.gradient!(grad_result, f, x)

        val  = DiffResults.value(grad_result)
        grad = DiffResults.gradient(grad_result)

        return val, grad
    end
end

# Custom GBM simulation function
function gbm(S₀, K, μ, σ, N, dt, trajectories)
    drift = (μ - 0.5 * σ^2) * dt
    diffusion = σ * sqrt(dt)

    S = zeros(eltype(K), trajectories, N + 1)
    S[:, 1] .= S₀
    out = 0.0

    for i in 1:trajectories
        for j in 2:N+1
            z = randn()  # Generate a random standard normal variable
            S[i, j] = S[i, j - 1] * exp(drift + diffusion * z)
        end
        out += max(S[i, end] - K, 0.0)  # Calculate the payoff
    end

    return out / trajectories  # Return the average payoff
end

# Function to simulate GBM using DifferentialEquations.jl
function gbm_diffeq(x, prob, trajectories)
    _prob  = remake(prob, u0 = x[1], p = @view x[3:4])  # Update the problem with new parameters
    __prob = EnsembleProblem(_prob)

    S = Array(solve(__prob, trajectories=trajectories, saveat=1.0))  # Solve the SDE

    out = 0.0
    for i in 1:trajectories
        out += max(S[end, i] - x[2], 0.0)  # Calculate the payoff
    end

    return out / trajectories  # Return the average payoff
end

# SDE functions (drift and diffusion)
function f(u, p, t)
    return u * p[1]  # Drift: μ * S
end

function g(u, p, t)
    return u * p[2]  # Diffusion: σ * S
end

# Problem setup
u0 = 1.0  # Initial value
p  = [0.1, 0.2]  # Parameters [μ, σ]

prob = SDEProblem(f, g, u0, (0.0, 1.0), p)  # Define the SDE problem

# Parameters for the custom GBM simulation
S0 = 100.0  # Initial stock price
K  = 90.0   # Strike price
μ  = 0.05   # Drift (μ)
σ  = 0.2    # Volatility (σ)
T  = 1.0    # Time horizon
N  = 100    # Number of time steps
dt = T / N  # Time step size
trajectories = 100_000  # Number of trajectories

# Input vector for GBM simulations
x = [S0, K, μ, σ, T]

# Define the functions for benchmarking
out(x)    = gbm(x[1], x[2], x[3], x[4], N, dt, trajectories)
out_diffeq(x) = gbm_diffeq(x, prob, trajectories)

# Benchmark the performance
@btime $out($x)
@btime $out_diffeq($x)

# Compute values and gradients using ForwardDiff
@btime $value_and_derivatives($out, $x) # Takes 2.663 ms (11 allocations: 13.73 MiB)

@btime $value_and_derivatives($out_diffeq, $x) # Takes   16.159 s (7498529 allocations: 991.04 MiB)


I’m looking for any tips or optimizations that can help improve the performance of the DifferentialEquations.jl approach. Let me know if you have any suggestions!

Three things.

  1. Using an analytical solution is just a good idea if you have one.
  2. If you have that simple of an SDE, you can use SimpleDiffEq.jl’s SimpleEM.
  3. By default it’s going to use a high strong order method, which normally has a lot lower error than the EM method SDE Basic Work-Precision Diagrams · The SciML Benchmarks. That said, the adaptivity may be making the dt small in order to make the error sufficiently small.
  4. For such an ensemble, using DiffEqGPU for GPU accelerating the Monte Carlo simulation makes sense.
  5. Your implementation is slow because it’s allocating a lot. You don’t need to make the Array on S: that makes a big allocation that is unnecessary. EnsembleProblem(_prob; safetycopy=false).

So try a few of those things. And look at the flame graph and reduce the major allocations.

1 Like

Thank you very much, Chris!