Trouble using sparse autodifferentiation

Hi all, I am following the Advanced tutorial · DifferentiationInterface.jl to try out some of the modern AD features it provides. My test case is here: FunWithSplines/workspace/MLE/core-5_4_1.jl at main · slwu89/FunWithSplines · GitHub, where I need to compute a Hessian matrix which is very sparse. I can compute the dense Hessian no problem, but using the sparse backend results in the error pasted below (the last few lines of the example compute the Hessian). I believe it is due to a comparison between numeric types in the underlying mathematical model. Can someone give any advice about how to still use sparse AD in this case?

For the curious, the model is from Section 5.4.1 of Core Statistics: https://www.cambridge.org/us/universitypress/subjects/statistics-probability/statistical-theory-and-methods/core-statistics.
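For reference, the failing call is set up roughly like this (a minimal sketch reconstructed from the stack trace below; the backend configuration follows the DifferentiationInterface advanced tutorial, and the choice of AutoForwardDiff() as the inner dense backend is an assumption on my part):

using DifferentiationInterface
using SparseConnectivityTracer, SparseMatrixColorings
import ForwardDiff

# sparse backend in the style of the DifferentiationInterface advanced tutorial
backend = AutoSparse(
    AutoForwardDiff();  # inner dense backend (assumption)
    sparsity_detector=TracerSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)

# differentiate with respect to b; θ and urchin are held constant
H = hessian(nlyfb_urchin, backend, b, Constant(θ), Constant(urchin))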

ERROR: TypeError: non-boolean (SparseConnectivityTracer.HessianTracer{SparseConnectivityTracer.DictHessianPattern{Int64, BitSet, Dict{Int64, BitSet}, SparseConnectivityTracer.NotShared}}) used in boolean context
Stacktrace:
  [1] model_urchin_vol(ω::Float64, g::SparseConnectivityTracer.HessianTracer{…}, p::SparseConnectivityTracer.HessianTracer{…}, a::Float64)
    @ Main ~/Desktop/git/FunWithSplines/workspace/MLE/core-5_4_1.jl:41
  [2] nlyfb_urchin(b::Vector{SparseConnectivityTracer.HessianTracer{…}}, θ::Vector{Float64}, urchin::DataFrame)
    @ Main ~/Desktop/git/FunWithSplines/workspace/MLE/core-5_4_1.jl:59
  [3] (::DifferentiationInterface.FixTail{…})(args::Vector{…})
    @ DifferentiationInterface ~/.julia/packages/DifferentiationInterface/Yk2Kt/src/utils/context.jl:169
  [4] trace_function(::Type{…}, f::DifferentiationInterface.FixTail{…}, x::Vector{…})
    @ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/litqn/src/trace_functions.jl:48
  [5] _hessian_sparsity(f::DifferentiationInterface.FixTail{…}, x::Vector{…}, ::Type{…})
    @ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/litqn/src/trace_functions.jl:115
  [6] hessian_sparsity
    @ ~/.julia/packages/SparseConnectivityTracer/litqn/src/adtypes_interface.jl:64 [inlined]
  [7] hessian_sparsity_with_contexts
    @ ~/.julia/packages/DifferentiationInterface/Yk2Kt/ext/DifferentiationInterfaceSparseConnectivityTracerExt/DifferentiationInterfaceSparseConnectivityTracerExt.jl:62 [inlined]
  [8] prepare_hessian_nokwarg(::Val{…}, ::typeof(nlyfb_urchin), ::AutoSparse{…}, ::Vector{…}, ::Constant{…}, ::Constant{…})
    @ DifferentiationInterfaceSparseMatrixColoringsExt ~/.julia/packages/DifferentiationInterface/Yk2Kt/ext/DifferentiationInterfaceSparseMatrixColoringsExt/hessian.jl:29
  [9] hessian(::typeof(nlyfb_urchin), ::AutoSparse{…}, ::Vector{…}, ::Constant{…}, ::Constant{…})
    @ DifferentiationInterface ~/.julia/packages/DifferentiationInterface/Yk2Kt/src/fallbacks/no_prep.jl:103
 [10] top-level scope
    @ ~/Desktop/git/FunWithSplines/workspace/MLE/core-5_4_1.jl:104
Some type information was truncated. Use `show(err)` to see complete types.

The TracerSparsityDetector tracers don’t work with comparison operators; you might need to use TracerLocalSparsityDetector instead (see GitHub - adrhill/SparseConnectivityTracer.jl: Fast operator-overloading Jacobian & Hessian sparsity detection).
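For example, swapping the detector in the AutoSparse backend (a minimal sketch; the AutoForwardDiff() inner backend is again an assumption):

using DifferentiationInterface
using SparseConnectivityTracer, SparseMatrixColorings
import ForwardDiff

# TracerLocalSparsityDetector evaluates comparisons at the actual input values,
# so branches like `x < y` no longer error during tracing
backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector=TracerLocalSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)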

Thanks! That works and I get a beautiful sparse Hessian.

284×284 SparseArrays.SparseMatrixCSC{Float64, Int64} with 568 stored entries:
⎡⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠳⣄⠀⎥
⎢⢤⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠓⎥
⎢⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⠀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠑⢄⎦

Note that to leverage the benefits of sparse AD, you should use preparation ahead of time and then reuse the preparation results. I saw from your stack trace that you’re not doing that; perhaps you’re just at an earlier step of the tutorial.
However, the use of local sparsity tracers means that the resulting pattern theoretically depends on the input you provided. If you know that the pattern detected on the preparation input is actually input-independent, you can reuse it all you want, but the responsibility for that proof is yours. In any case, it is better to prepare with a random input than, say, an input full of zeros.
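In code, the pattern looks something like this (a sketch, reusing the argument names from the stack trace and a sparse backend like the one defined earlier):

b0 = randn(length(b))  # random preparation input, per the advice above

# pay the sparsity detection + coloring cost once
prep = prepare_hessian(nlyfb_urchin, backend, b0, Constant(θ), Constant(urchin))

# then reuse the preparation for every subsequent Hessian evaluation
H = hessian(nlyfb_urchin, prep, backend, b, Constant(θ), Constant(urchin))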

Hey, thanks for following up! Yes, in my full use case I am using the prepare_hessian method to set up those objects.

Thanks for the advice on the local sparsity detector! Indeed, after you pointed it out, I do see that note in the docs: Getting Started · SparseConnectivityTracer.jl. That’s a good point: for “reasonable” parameter values based on the underlying biological model it would have the same pattern, but I can’t guarantee that across all possible values of the parameters. Based on some quick tests, values around 0 should correspond to a less sparse pattern, and values farther from 0 should give a sparser pattern that is a strict subset of it (i.e. entries that were previously nonzero are set to 0). I’d need to do some work to definitively check.

There are other alternatives if you can’t guarantee that local tracing is enough:

  • you can use another sparsity detector, like Symbolics.SymbolicsSparsityDetector() (which will likely be slower but supports more language constructs)
  • you can (maybe) rewrite your function to replace comparison operators with ifelse calls (see the sketch after this list)
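For the second option, the idea is that ifelse evaluates both branches, so the tracer can propagate through each of them (a hypothetical sketch; the actual branch in model_urchin_vol is not shown here):

# branching on a traced comparison fails under TracerSparsityDetector:
f_branch(a, threshold) = a < threshold ? a^2 : sin(a)

# ifelse evaluates both branches, so global tracers can handle it:
f_ifelse(a, threshold) = ifelse(a < threshold, a^2, sin(a))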

Thanks! For the example, I was able to use Symbolics.SymbolicsSparsityDetector(), but only after replacing the < comparison between numeric variables with Base.ifelse. Thanks for the help; is the list of sparsity detectors here exhaustive? Advanced features · DifferentiationInterface.jl
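For anyone following along, the swap is just the sparsity_detector keyword in the backend (a sketch, reusing the setup from earlier):

import Symbolics

backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector=Symbolics.SymbolicsSparsityDetector(),
    coloring_algorithm=GreedyColoringAlgorithm(),
)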

With this modification, you might also be able to use SCT.TracerSparsityDetector() directly?

To the best of my knowledge, yes.

You can also use Enzyme from within a Reactant.jl @compile (not available within DI) to leverage automatic sparsity propagation.

Thanks. I’ll definitely check this out if I get into more serious applications of AD. For now I’m just poking around for fun to see what the ecosystem looks like for implementing statistical models of mid-level complexity (more than a call to fit a linear model, less than shape-constrained splines embedded in a mixed model embedded in…).

Is this what used to be called Spadina? Is it documented anywhere? Do you have example code?

No, Spadina is separate work enabled in Enzyme (though it has not been given syntactic sugar in Enzyme.jl).

Reactant.jl (and Enzyme-JAX) contain a variety of sparsity-propagation optimizations that apply to general code but are particularly useful for optimizing programs containing derivatives. Reactant docs are available online here: Reactant.jl. Essentially, just use @compile on the outermost function in your application (though see the docs for more info).
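A minimal sketch of that pattern (the function f and its input here are hypothetical placeholders):

using Reactant

f(x) = sum(abs2, x)            # hypothetical outermost function

x = randn(Float32, 100)
x_ra = Reactant.to_rarray(x)   # move the input into Reactant arrays

f_compiled = @compile f(x_ra)  # trace and compile, enabling Reactant's optimizations
f_compiled(x_ra)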
