[ANN] Yota v0.5 - now with ChainRules support

Any performance benchmarks vs Zygote and ReverseDiff on things with scalar iteration (i.e. not ML)?

No benchmarks for Yota v0.5 yet. I have a preliminary plan to add regular benchmarks in CI/CD for several repositories (at least Yota and NNlib), but want to stabilize things first.

In general, I expect Yota to be somewhat faster on high-dimensional ML problems and slower on scalar and low-dimensional problems. This is because Yota doesn’t optimize small constant factors such as type stability, which usually account for < 1% of overall run time in ML tasks. At the same time, Yota cares about things like array preallocation and kernel fusion, which you don’t need in scalar functions but which become a killer feature in high-dimensional tasks.

Also note that all explicit optimizations were removed during the refactoring and will be added back gradually in future versions. Benchmark suggestions (in both ML and non-ML domains) are highly welcome, as they will help track progress.

What’s the mutation support like?

No mutation support is intended. If you have a mutating operation, you can try wrapping it in an rrule() to “hide” it from the AD engine, but the engine itself will not attempt to handle it.
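
To make the rrule() idea concrete, here is a minimal sketch using ChainRulesCore. The function cumsum_loop is a hypothetical example made up for illustration, not something from Yota; it fills its output in place, and the hand-written rule gives the AD engine a pullback so it never needs to trace into the loop.

```julia
using ChainRulesCore

# Hypothetical function that mutates an array internally.
function cumsum_loop(x::AbstractVector)
    y = similar(x)
    acc = zero(eltype(x))
    for i in eachindex(x)
        acc += x[i]
        y[i] = acc          # in-place write the AD engine would not handle
    end
    return y
end

# Hand-written reverse rule: the mutation stays hidden inside the primal call.
function ChainRulesCore.rrule(::typeof(cumsum_loop), x::AbstractVector)
    y = cumsum_loop(x)
    function cumsum_loop_pullback(dy)
        # y[i] = sum(x[1:i]), so dx[j] = sum(dy[j:end]): a reversed cumulative sum.
        dx = reverse(cumsum(reverse(unthunk(dy))))
        return NoTangent(), dx
    end
    return y, cumsum_loop_pullback
end

# Calling the rule directly, without any AD engine:
y, pb = rrule(cumsum_loop, [1.0, 2.0, 3.0])
_, dx = pb(ones(3))         # dx == [3.0, 2.0, 1.0]
```

Any ChainRules-aware engine should be able to pick up a rule like this; the pullback itself is ordinary non-mutating array code.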

A lot has been said about mutation in Zygote, but I should highlight a couple of important reasons for not supporting it in Yota:

  1. Mutation often means slower code, not faster. For example, filling an array element by element like x[i] = y may be fast on CPU, but on GPU it is a disaster. Yota complains loudly when it sees something we don’t have a fast way to do. Of course, this rules out a number of use cases, but at the same time it helps to uncover performance bugs.
  2. Since Yota produces an easy-to-handle tape, in many cases it should be possible to optimize it by replacing possibly suboptimal immutable operations with their mutable counterparts.
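
As a small illustration of the first point, the two functions below compute the same array. Both names are hypothetical and the snippet is plain Julia, not Yota-specific: the first fills the result element by element, the second expresses it as one broadcast, which needs no mutation in user code and maps to a single fused operation.

```julia
# Element-by-element fill: fine on CPU, but every x[i] = ... is a scalar
# write, which is the pattern that becomes very slow on GPU arrays.
function squares_loop(n::Integer)
    x = zeros(Float64, n)
    for i in 1:n
        x[i] = i^2
    end
    return x
end

# Same result as one fused broadcast, with no explicit mutation.
squares_broadcast(n::Integer) = Float64.((1:n) .^ 2)

squares_loop(4) == squares_broadcast(4)   # true
```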