What's the state of Automatic Differentiation in Julia January 2023?

Ahh interesting, I didn't know it only did the one direction there. That must be why it currently has the feel of "I think it mostly works, but a few things seem to get hit." Is there a reason Enzyme cannot just use mixed mode for that? ReverseDiff did this first: broadcast has a diagonal Jacobian sparsity pattern, so you can always switch that step to forward mode at no extra cost (and doing so can sometimes reduce cost).
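A minimal sketch of why that works, assuming an elementwise scalar function `f` (this is illustrative, not ReverseDiff's actual implementation): because `f.(x)` has a diagonal Jacobian, each diagonal entry is a cheap scalar forward-mode derivative, and the reverse-mode pullback reduces to an elementwise multiply.

```julia
using ForwardDiff

# Hypothetical sketch: forward mode inside a reverse pass for a broadcast.
# f.(x) has a diagonal Jacobian, so each entry is a scalar forward-mode
# derivative, and the pullback is just an elementwise multiply.
function broadcast_with_pullback(f, x::AbstractArray{<:Real})
    y  = f.(x)                          # primal values
    dy = ForwardDiff.derivative.(f, x)  # diagonal of the Jacobian via forward mode
    pullback(ȳ) = ȳ .* dy               # reverse-mode pullback for the broadcast
    return y, pullback
end

# Usage: y, pb = broadcast_with_pullback(sin, rand(3)); x̄ = pb(ones(3))
```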


I mean, it's probably useful to mixed-mode broadcast regardless.

However, what that PR does is say, generically, that autodiff of @cuda is @cuda of autodiff (which is presently needed by broadcasting, among other things).

It's definitely useful to also consider what higher-level utilities we want to add, but that PR is useful as a baseline so generic code doesn't need to pipe an inner autodiff call into every @cuda.

That's why I'd argue the linked PR makes broadcast work in reverse mode (among other things), though it could potentially be improved further with additional broadcast-specific tuning.
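For readers who haven't seen the pattern, here is a hedged sketch of the "autodiff of @cuda is @cuda of autodiff" idea, closely following the CUDA example in the Enzyme.jl documentation. Exact `autodiff_deferred` signatures vary across Enzyme.jl versions, and this is hand-written rather than what the linked PR generates.

```julia
using CUDA, Enzyme

# Original kernel: square each element in place.
function mul_kernel(A)
    i = threadIdx().x
    if i <= length(A)
        A[i] *= A[i]
    end
    return nothing
end

# The "autodiff" half: differentiate the kernel body on-device.
function grad_mul_kernel(A, dA)
    Enzyme.autodiff_deferred(Reverse, mul_kernel, Const, Duplicated(A, dA))
    return nothing
end

A  = CUDA.ones(64)
dA = CUDA.ones(64)   # seed the adjoint
# The "@cuda" half: the derivative is itself just another kernel launch.
@cuda threads=length(A) grad_mul_kernel(A, dA)
```

The point of the PR is that generic code (such as broadcasting) gets this translation automatically instead of requiring the user to write the inner `autodiff_deferred` call by hand.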
