Deep learning in Julia

gdalle · April 12, 2024, 7:57am

In my opinion, one of the key reasons why deep learning in Julia is light years behind PyTorch/JAX is the performance and convenience of Automatic Differentiation. There are so many AD packages, and each has its own tradeoff between speed and generality. I believe we should make it easy for users to pick the one that works best for them, which explains the creation of DifferentiationInterface.jl with @hill. That way, AD packages can coexist and even compete without causing confusion for downstream users. In addition, it reduces code duplication, because every ML ecosystem (Flux, Lux, Turing, SciML) has its own variant of an Enzyme/Zygote extension with gradient bindings, and we should just pool all of those.

I’ve been chatting with various power users of AD to see how they could leverage the interface. The conclusion is that it is much easier when what you want to differentiate with respect to is a vector (or ComponentVector), and not some arbitrarily complex (callable) struct like a Flux layer. For this reason, I think Lux.jl is better suited to easy AD integration and backend switching.
Of course it doesn’t get us all of the way there, but to me it seems like a very important step.
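
To make the "common interface over a flat vector" idea concrete, here is a toy Python sketch. It is not the DifferentiationInterface.jl API: the names `gradient`, `FiniteDiffBackend`, `ForwardDualBackend`, `Dual` and `loss` are all made up for illustration, and the two "backends" are deliberately simplistic. The point is only that when the parameters are a plain vector, the caller's code stays the same and the backend becomes a swappable argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteDiffBackend:
    """Hypothetical backend 1: central finite differences."""
    step: float = 1e-6


@dataclass(frozen=True)
class ForwardDualBackend:
    """Hypothetical backend 2: forward-mode AD with dual numbers."""


@dataclass(frozen=True)
class Dual:
    """Number carrying a value and a derivative (supports only + and *)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def gradient(f, backend, x):
    """Gradient of scalar-valued f at the flat vector x, using `backend`."""
    n = len(x)
    if isinstance(backend, FiniteDiffBackend):
        h = backend.step
        grad = []
        for i in range(n):
            up, down = list(x), list(x)
            up[i] += h
            down[i] -= h
            grad.append((f(up) - f(down)) / (2 * h))
        return grad
    if isinstance(backend, ForwardDualBackend):
        grad = []
        for i in range(n):
            # Seed a unit derivative on coordinate i, zero elsewhere.
            duals = [Dual(float(v), 1.0 if j == i else 0.0)
                     for j, v in enumerate(x)]
            grad.append(_lift(f(duals)).deriv)
        return grad
    raise TypeError(f"unsupported backend: {backend!r}")


def loss(w):
    """Toy objective on a flat parameter vector, using only + and *."""
    return w[0] * w[0] + 3.0 * w[0] * w[1] + w[1] * w[1]


if __name__ == "__main__":
    x = [1.0, 2.0]
    # Same call, different backend; the analytic gradient here is [8, 7].
    print(gradient(loss, ForwardDualBackend(), x))
    print(gradient(loss, FiniteDiffBackend(), x))
```

If the parameters lived inside a nested callable object instead of `x`, each backend would also need its own way to walk that object and return a gradient of matching shape, which is the part that is hard to standardize.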
