2nd order optimization for Julia

Lately, I’ve seen second-order optimization suddenly become a trend in deep learning.
Is this

  1. An opportunity where Julia’s neural network ecosystem can do well?
  2. Something that would instead shift the balance toward PyTorch/etc.?
  3. Just yet another development that doesn’t shift the landscape too much?

Julia is said to support higher-order automatic differentiation. Now that second-order optimization is becoming more popular in neural networks, is this an opportunity for Julia?

Here’s arguably the paper that started the trend.

The landscape is changing fast.

Not really relevant to deep learning (because we’re still mostly CPU-focused), but the triptych DifferentiationInterface.jl + SparseConnectivityTracer.jl + SparseMatrixColorings.jl lets you compute sparse Hessians effortlessly, which is key for second-order methods on large-scale problems.
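To make the pipeline concrete, here is a minimal sketch of how those three packages compose: a sparsity detector finds the Hessian’s nonzero pattern, a coloring algorithm compresses the columns, and DifferentiationInterface.jl does the actual differentiation. The choice of inner/outer backends (ForwardDiff over Zygote here) and the objective function are illustrative assumptions, not prescriptions from the thread.

```julia
using DifferentiationInterface
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm
import ForwardDiff, Zygote

# Toy objective with a sparse (tridiagonal) Hessian.
f(x) = sum(abs2, diff(x))

# Forward-over-reverse second-order backend, wrapped so the
# Hessian is detected as sparse, colored, and compressed.
backend = AutoSparse(
    SecondOrder(AutoForwardDiff(), AutoZygote());
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)

x = rand(100)
prep = prepare_hessian(f, backend, x)  # pattern detection + coloring, done once
H = hessian(f, prep, backend, x)       # returns a SparseMatrixCSC
```

The preparation step is where the sparsity pays off: with the pattern and coloring cached, each Hessian evaluation needs only a handful of Hessian–vector products instead of one per input dimension.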
