KernelMatrices.jl - A software package for working with hierarchical matrices



Hello everyone,

I would like to announce a new software package for working with and compressing kernel matrices. The package KernelMatrices.jl provides a conveniently overloaded KernelMatrix struct, implementations of the adaptive cross approximation (ACA) and the Nyström approximation, and a complete interface for compressing kernel matrices in the HODLR format, including factorization, solves, log-determinants, and some other tools specific to Gaussian processes. For maximum likelihood estimation, it provides a method to approximate MLEs with quasilinear scaling that works with very little extra user code.

I announce this mainly to see whether other people would be interested in using this software, as that will to some degree dictate how much work I put into it. That work could go into improving the guts (cleaning up some of the overloading that got mangled when I updated the code to 1.0 somewhat hastily, thinking harder about how to use the type system, or making the code more parallelizable) and into implementing other factorizations and matrix compression structures (HSS, H^2, skeletonization, peeling assembly methods, and so on).

I would also love to have contributions and to work on this code with other people, so if anybody out there is interested in contributing to or extending the package, please feel free to reach out to me, submit a pull request, file an issue, or anything at all.

I also would be very interested to hear feedback and comments about the quality of the package’s source code in general. I am always looking to improve and write better code.

Thank you for reading,
Chris Geoga


That looks very nice! I’m not very familiar with the subject, so sorry if this is a stupid question: the package looks geared toward statistical applications, but does it also work for PDEs (if I want quick matvecs with a Green’s function), or are the techniques substantially different?


Hi Antoine—thank you very much! I am not very familiar with methods for numerical PDEs, so I am hesitant to confidently say that this would also work for applications in that field. But I know that many innovations in hierarchical matrices came out of that field, and I can confidently say that you could very easily compress your matrix to a prescribed accuracy and then perform a matvec with this package.


Looks interesting! Note there is also HierarchicalMatrices.jl, which is more focused on (integral reformulations of) PDEs, though it is not under active development at the moment.


Hello Sheehan—thank you for the reference! I have seen this package, and I think it is incredibly elegant. If one just wanted a fast matvec, that package does look like a very reasonable choice. To my knowledge it implements a weakly admissible structure, which is quite similar to the HODLR structure that I have implemented.

I think the primary use case for our software instead of this one would be when one wants direct solves or determinants (via a symmetric factorization). I don’t know how many people outside of statistics care about log-determinants, but I imagine (hope!) that there are plenty of other people who would be interested in direct solves instead of iterative ones.
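To make the weak-admissibility idea concrete, here is a toy one-level sketch (generic Julia, not the KernelMatrices.jl API): for a smooth kernel on sorted 1-D points, the off-diagonal blocks are numerically low rank, so a blocked matvec with truncated-SVD factors matches the dense matvec to high accuracy.

```julia
using LinearAlgebra, Random

Random.seed!(1)

# Toy one-level HODLR-style matvec (a generic sketch, not the package API).
# For a smooth kernel on sorted 1-D points, the off-diagonal blocks are
# numerically low rank, so we can compress them and still match the dense matvec.
n = 256
x = sort(rand(n))
K = [exp(-(xi - xj)^2) for xi in x, xj in x]

h   = div(n, 2)
A11 = K[1:h, 1:h]
A22 = K[h+1:end, h+1:end]
A12 = K[1:h, h+1:end]

# Compress the off-diagonal block with a rank-r truncated SVD.
r  = 8
F  = svd(A12)
U  = F.U[:, 1:r] * Diagonal(F.S[1:r])
Vt = F.Vt[1:r, :]

# Blocked matvec: dense diagonal blocks, low-rank off-diagonal blocks.
# K is symmetric, so the (2,1) block is approximately (A12)'.
v = randn(n)
y = vcat(A11 * v[1:h]     + U * (Vt * v[h+1:end]),
         A22 * v[h+1:end] + Vt' * (U' * v[1:h]))

relerr = norm(y - K * v) / norm(K * v)
```

In a real HODLR matrix the diagonal blocks are recursed on in the same way; one level is enough to show the idea.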

I also would love to implement some algorithms that assemble matrices using only matvecs (like the peeling algorithm of Lin et al. 2011), which would provide more novel functionality. I’m hoping to gauge how many people use Julia in applications where they have fast black-box matvecs but matrix entries can’t easily be computed. The package LowRankApprox.jl looks like it would provide great guts for that functionality, so a lot of the pieces are around already…
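For intuition, the core primitive that matvec-only assembly rests on is randomized range finding (in the style of Halko, Martinsson, and Tropp): recover a low-rank factorization of A using only black-box products with A and A'. A minimal sketch, where `apply` and `applyT` are hypothetical stand-ins for the black-box matvec routines:

```julia
using LinearAlgebra, Random

Random.seed!(1)

# Sketch of the matvec-only primitive behind peeling-type assembly:
# recover a low-rank factorization of A using only products with A and A'.
# `apply` and `applyT` are hypothetical stand-ins for black-box matvecs.
m, n, r = 200, 150, 5
A = randn(m, r) * randn(r, n)     # hidden matrix; used only through matvecs
apply(v)  = A * v
applyT(v) = A' * v

k = r + 5                         # target rank plus oversampling
Omega = randn(n, k)
Y = reduce(hcat, [apply(Omega[:, j]) for j in 1:k])   # k forward matvecs
Q = Matrix(qr(Y).Q)               # orthonormal basis for the range of A
B = reduce(hcat, [applyT(Q[:, j]) for j in 1:k])'     # Q'A via k adjoint matvecs

relerr = norm(Q * B - A) / norm(A)
```

Peeling then applies this block by block down the hierarchy, subtracting off already-recovered levels; the sketch above is only the innermost building block.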


You are right, it looks like @MikaelSlevinsky never got around to implementing solves. He did in the predecessor:


Oh wow. That is very nice code. Aside from a decent amount of functionality specific to maximum likelihood estimation in statistics, I suppose our package is just a different flavor than HierarchicalMatrices.jl then (rather than providing meaningfully different functionality). There are some small other goodies in it, like an implementation of the ACA, but nothing that requires serious R&D. I hope to also think about H^2/HSS/etc matrices, peeling, and some other matrix compression methods, which would be somewhat serious R&D I think. But none of those are presently implemented here.

The primary reason I wrote new software rather than contributing to existing software is that, for the statistical applications we’re working on, we need the first and second derivatives of the compressed matrices. So I needed a low-rank approximation that is continuous in the kernel parameters, and I needed to implement the first and second derivatives. After some looking around, and since weakly admissible matrices and factorizations are the relatively easy part to implement, I decided it would be easier to just write our own. But for numerical PDEs HierarchicalMatrices.jl may well be a better choice, as I am not aware of applications there where people need the derivatives of the compressed kernel matrix (or need the compressed matrix to be differentiable in the first place).
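To illustrate what "differentiable in the kernel parameters" means here (a generic toy, not the package's implementation): each entry of the kernel matrix is a smooth function of the parameter, so the derivative matrix can be assembled analytically and checked against finite differences.

```julia
using LinearAlgebra, Random

Random.seed!(1)

# Toy illustration of a kernel matrix that is differentiable in its
# parameter (a generic sketch, not the package's implementation). For
# k(x, y; θ) = exp(-θ (x - y)^2), each entry has an analytic θ-derivative,
# which we check against a central finite difference.
x = sort(rand(50))
θ = 1.3
Kmat(θ)  = [exp(-θ * (xi - xj)^2)                for xi in x, xj in x]
dKmat(θ) = [-(xi - xj)^2 * exp(-θ * (xi - xj)^2) for xi in x, xj in x]

h  = 1e-6
fd = (Kmat(θ + h) - Kmat(θ - h)) / (2h)   # central finite difference
relerr = norm(dKmat(θ) - fd) / norm(fd)
```

The hard part in the hierarchical setting is making the *compressed* representation inherit this smoothness, which is why a parameter-continuous low-rank scheme matters.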


Count a +1 from me :slight_smile: A typical use case would be to approximate the inverse of a differential operator (e.g. -Δ on some n-dimensional domain). Inverses can be very expensive to compute, but matvecs can be done.
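A minimal illustration of that use case (a standard 1-D example, nothing package-specific): the discretized operator is tridiagonal, so matvecs are O(n), but its inverse, the discrete Green's function, is completely dense.

```julia
using LinearAlgebra, Random

Random.seed!(1)

# Standard 1-D example of the use case above (nothing package-specific):
# the Dirichlet Laplacian -Δ on (0, 1) with n interior grid points.
n = 64
A = (n + 1)^2 * SymTridiagonal(2.0 * ones(n), -ones(n - 1))

# A is tridiagonal, so a matvec touches only O(n) entries...
u = randn(n)
b = A * u

# ...but the inverse (the discrete Green's function) is completely dense,
# which is exactly where hierarchical compression of the inverse pays off.
G = inv(Matrix(A))
dense_fraction = count(!iszero, G) / n^2   # every entry of G is nonzero
```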


The README examples don’t seem to work.

julia> HK_a = HODLR.KernelHODLR(K, tol, lvl, rnk, nystrom=false, plel=pll)
ERROR: MethodError: no method matching KernelMatrices.HODLR.KernelHODLR(::KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}, ::Float64, ::KernelMatrices.HODLR.LogLevel, ::Int64; nystrom=false, plel=false)
Closest candidates are:
  KernelMatrices.HODLR.KernelHODLR(::KernelMatrices.KernelMatrix{T<:Number,A} where A, ::Float64, ::Int64, ::KernelMatrices.HODLR.HierLevel; nystrom, plel) where T<:Number at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/HODLR/src/constructor.jl:9
 [1] top-level scope at REPL[39]:1

Reordering the arguments does work:

julia> HK_a = HODLR.KernelHODLR(K, tol, rnk, lvl, nystrom=false, plel=pll);

same problem with the next function:

julia> HK_n = HODLR.KernelHODLR(K, tol, lvl, rnk, nystrom=true, plel=pll);
ERROR: MethodError: no method matching KernelMatrices.HODLR.KernelHODLR(::KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}, ::Float64, ::KernelMatrices.HODLR.LogLevel, ::Int64; nystrom=true, plel=false)
Closest candidates are:
  KernelMatrices.HODLR.KernelHODLR(::KernelMatrices.KernelMatrix{T<:Number,A} where A, ::Float64, ::Int64, ::KernelMatrices.HODLR.HierLevel; nystrom, plel) where T<:Number at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/HODLR/src/constructor.jl:9
 [1] top-level scope at util.jl:156

But this time, reordering arguments gives me a different error:

julia> HK_n = HODLR.KernelHODLR(K, tol, rnk, lvl, nystrom=true, plel=pll);
ERROR: BoundsError: attempt to access 0-element Array{SArray{Tuple{2},Float64,1,2},1} at index [1]
 [1] getindex(::Array{SArray{Tuple{2},Float64,1,2},1}, ::Int64) at ./array.jl:728
 [2] nystrom_uvt(::KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}, ::KernelMatrices.NystromKernel{Float64}) at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/factorizations.jl:96
 [3] (::getfield(KernelMatrices.HODLR, Symbol("##18#24")){KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}})(::SArray{Tuple{4},Int64,1,4}) at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/HODLR/src/constructor.jl:31
 [4] _collect(::Array{SArray{Tuple{4},Int64,1,4},1}, ::Base.Generator{Array{SArray{Tuple{4},Int64,1,4},1},getfield(KernelMatrices.HODLR, Symbol("##18#24")){KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}}}, ::Base.EltypeUnknown, ::Base.HasShape{1}) at ./generator.jl:47
 [5] collect_similar at ./array.jl:548 [inlined]
 [6] map at ./abstractarray.jl:2028 [inlined]
 [7] mapf at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/HODLR/src/utils.jl:14 [inlined]
 [8] #KernelHODLR#15(::Bool, ::Bool, ::Type{KernelMatrices.HODLR.KernelHODLR}, ::KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}, ::Float64, ::Int64, ::KernelMatrices.HODLR.LogLevel) at /home/chris/.julia/packages/KernelMatrices/GuD8m/src/HODLR/src/constructor.jl:31
 [9] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:nystrom, :plel),Tuple{Bool,Bool}}, ::Type{KernelMatrices.HODLR.KernelHODLR}, ::KernelMatrices.KernelMatrix{Float64,SArray{Tuple{2},Float64,1,2}}, ::Float64, ::Int64, ::KernelMatrices.HODLR.LogLevel) at ./none:0
 [10] top-level scope at util.jl:156

but this does.

My dissertation is due in a couple months, and I’m considering discussing this as a possible alternative to something I’ve done.

From glancing at the examples, it looks like a few things are suboptimal for performance, e.g.:

julia> @benchmark SVector{2, Float64}(randn(2))
  memory estimate:  96 bytes
  allocs estimate:  1
  minimum time:     91.302 ns (0.00% GC)
  median time:      96.838 ns (0.00% GC)
  mean time:        117.473 ns (12.09% GC)
  maximum time:     4.313 μs (96.38% GC)
  samples:          10000
  evals/sample:     956

julia> @benchmark @SVector randn(2)
  memory estimate:  0 bytes
  allocs estimate:  0
  minimum time:     25.912 ns (0.00% GC)
  median time:      27.870 ns (0.00% GC)
  mean time:        28.929 ns (0.00% GC)
  maximum time:     68.825 ns (0.00% GC)
  samples:          10000
  evals/sample:     996

I haven’t looked at the actual library code itself, but I would guess there may be a lot of needless allocations.


Hello Chris Elrod—

Thank you so much for the note, and please accept my apologies for the broken README code. I stupidly free-handed those example code snippets and was not careful to check that they ran. I will fix that immediately.

With regard to the needless allocations, I never even thought to check whether the macro would be faster than the explicit type conversion. I am not super savvy with micro-optimizations in Julia (or any language), but I’m sure you’re right that the code could be tightened up in many places. I will take a look and think harder about benchmarking, profiling, and optimizing the code. If you have other suggestions, thoughts, or first-hand experience with tightening up code of this kind, I’m very interested to hear it.
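For anyone following along, a tiny generic example of the kind of check involved (Base tooling only; the SVector comparison above needs StaticArrays, and BenchmarkTools.jl's `@btime`/`@ballocated` give cleaner numbers in practice): `@allocated` exposes temporaries that a fused version avoids.

```julia
using Random

Random.seed!(1)

# Generic sketch of hunting needless allocations with Base tooling.
f_alloc(v)   = sum(v .^ 2)          # v .^ 2 materializes a temporary array
f_noalloc(v) = sum(x -> x^2, v)     # fused reduction, no temporary

v = randn(1000)
f_alloc(v); f_noalloc(v)            # compile both first

a1 = @allocated f_alloc(v)
a1 = @allocated f_alloc(v)          # repeat to exclude one-time costs
a2 = @allocated f_noalloc(v)
a2 = @allocated f_noalloc(v)
```

Running a profiler (or sprinkling `@allocated`/`@time` around hot loops) on the library internals would show whether temporaries like this dominate.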