Any performance benchmarks vs Zygote and ReverseDiff on things with scalar iteration (i.e. not ML)?
There are no benchmarks for Yota v0.5 yet. I have a preliminary plan to add regular benchmarks to CI/CD for several repositories (at least Yota and NNlib), but I want to stabilize things first.
In general, I expect Yota to be somewhat faster on high-dimensional ML problems and slower on scalar and low-dimensional problems. This is because Yota doesn’t optimize small constant factors such as type stability, which usually account for < 1% of overall run time in ML tasks. At the same time, Yota cares about things like array preallocation, kernel fusion, etc., which you don’t need in scalar functions, but which become a killer feature in high-dimensional tasks.
Also note that all explicit optimizations were removed during the refactoring and will be slowly added back in future versions. Benchmark suggestions (in both ML and non-ML domains) are highly welcome, as they will help to track progress.
What’s the mutation support like?
No mutation support is intended. If you have a mutating operation, you can try to wrap it in an rrule() to “hide” it from the AD engine, but the engine itself will not attempt to handle it.
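As a rough illustration of the wrapping idea, here is a minimal sketch using ChainRulesCore. The function `squares` is a made-up example, not something from Yota; the only real API used is `ChainRulesCore.rrule`, `NoTangent` and `unthunk`. The mutation stays inside the primal function, and the hand-written pullback is all the AD engine needs.

```julia
using ChainRulesCore

# Hypothetical function that fills its output by mutation.
function squares(x::AbstractVector)
    y = similar(x)
    for i in eachindex(x)
        y[i] = x[i]^2   # element-wise write that the AD engine should not trace
    end
    return y
end

# Hand-written rule: the engine calls this instead of tracing `squares`,
# so it never sees the mutation above.
function ChainRulesCore.rrule(::typeof(squares), x::AbstractVector)
    y = squares(x)
    function squares_pullback(dy)
        dx = 2 .* x .* unthunk(dy)   # d(x^2)/dx = 2x, applied element-wise
        return NoTangent(), dx       # no tangent for the function itself
    end
    return y, squares_pullback
end

# Calling the rule directly, without any AD engine involved:
y, back = rrule(squares, [1.0, 2.0, 3.0])
_, dx = back(ones(3))
```

Here the rule is exercised by hand only to check it. The assumption is that an engine built on ChainRules picks up the same `rrule` method automatically when it encounters a call to `squares`.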
A lot has been said about mutation in Zygote, but I should highlight a couple of important reasons for not supporting it in Yota:
- Mutation often means slower code, not faster. For example, filling an array element by element like
x[i] = y
may be fast on CPU, but on GPU it is a disaster. Yota complains explicitly when it sees something it has no fast way to do. This certainly rules out a number of use cases, but at the same time it helps to uncover some performance bugs.
- Since Yota produces an easy-to-handle tape, in many cases it should be possible to optimize it by replacing possibly non-optimal immutable operations with their mutable counterparts.
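To make the first point concrete, here is a small sketch in plain Julia (no Yota involved; both function names are invented for the example). It computes the same result twice: once by writing elements one at a time, and once as a single fused broadcast, which is the form that maps naturally onto array-level (and GPU) execution.

```julia
# Hypothetical example: compute 2y + 1 for every element of y.

# Mutating version: writes the result one element at a time.
function affine_loop(y::AbstractVector)
    x = similar(y)
    for i in eachindex(y)
        x[i] = 2 * y[i] + 1
    end
    return x
end

# Non-mutating version: the dotted operations fuse into one broadcast,
# with no element-wise writes in user code.
affine_broadcast(y::AbstractVector) = 2 .* y .+ 1

y = [1.0, 2.0, 3.0]
@assert affine_loop(y) == affine_broadcast(y)
```

On a CPU `Vector` both versions are fine. The difference shows up with GPU arrays, where per-element indexing like `x[i] = ...` from host code is extremely slow, while the broadcast form runs as a single kernel.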