GP-LVM with Turing?

OK, I re-ran some models with different AD backends. I increased the sample size to 60 (still with three dimensions) and ran 1000 iterations of NUTS.

ReverseDiff (no rdcache): 3422 seconds (one run)
ReverseDiff (with rdcache): 730-760 seconds (two runs)
ForwardDiff: 339-348 seconds (two runs)
Zygote: 138-147 seconds (two runs)

So Zygote is by far the fastest, roughly 2.4x faster than ForwardDiff and about 5x faster than ReverseDiff with the cache, and standard ReverseDiff (with no memoization) is… not.
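
For anyone who wants to run a similar comparison, here is a minimal sketch of switching backends between runs. It assumes an older Turing release that still has the global `Turing.setadbackend` and `Turing.setrdcache` setters (newer releases select the backend through an `adtype` keyword on the sampler instead). The `toy` model and the `time_backend` helper are hypothetical stand-ins, not the GP-LVM timed above, so the numbers will not match.

```julia
using Turing
using LinearAlgebra: I
# The reverse-mode backends must be loaded before they can be selected;
# Memoization is needed for the ReverseDiff tape cache.
using Zygote, ReverseDiff, Memoization

# Hypothetical toy model, used only to have something to sample from.
@model function toy(y)
    μ ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1), 0, Inf)
    y ~ MvNormal(fill(μ, length(y)), σ^2 * I)
end

# Hypothetical helper: select a backend, then time one NUTS run.
function time_backend(model, backend::Symbol; rdcache::Bool=false, n::Int=1000)
    Turing.setadbackend(backend)
    Turing.setrdcache(rdcache)      # only affects :reversediff
    return @elapsed sample(model, NUTS(), n)
end

model = toy(randn(60))

for (backend, cache) in [(:forwarddiff, false), (:reversediff, false),
                         (:reversediff, true), (:zygote, false)]
    t = time_backend(model, backend; rdcache=cache)
    println(backend, cache ? " (rdcache)" : "", ": ", round(t; digits=1), " s")
end
```

The first call for each backend includes compilation time, so it is worth running each configuration at least twice and comparing the later runs.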
