Machine Learning using Julia - Aim/Ideology of Flux.jl: simplicity over complexity for programmers

I had one question while learning the Flux.jl library: what is the reason for the minimal complexity/implementation (fewer things done behind the hood) in Flux.jl?

Why isn’t Flux aiming to be something like TensorFlow or PyTorch? Coming from Python, I found those two frameworks really incredible and easy for a beginner to pick up and create their own models; almost all my ML friends in college use one of them. This minimal implementation is extremely useful for a true researcher who wants to build and customize models, but it is really difficult for a beginner to get going.


There are plenty of reasons for the approach we have taken. It allows us to make use of packages in the Julia ecosystem and extend the capabilities of Flux far more quickly. Take MKL, for example. TF/PyTorch have a bunch of flags in their codebases to check for the availability of a faster BLAS, along with implementations catered specifically to each framework. On our end, we can use MKL by simply replacing the BLAS library at runtime, and that works well out of the box. This would have been difficult without multiple dispatch, so in some ways we are leveraging the tools available to us.
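As a rough sketch of what swapping BLAS at runtime can look like (this assumes Julia 1.7 or newer, where `BLAS.get_config()` is available, and that the separate MKL.jl package is installed; the matrix sizes are arbitrary):

```julia
using LinearAlgebra

# Report which BLAS backend is currently loaded (OpenBLAS on a default install).
println(BLAS.get_config())

A = rand(200, 200)
B = rand(200, 200)
C = A * B   # dense matrix multiply, forwarded to the loaded BLAS library

# Assumption: MKL.jl has been added to the environment (`] add MKL`).
# Loading it is meant to swap the backend; the `A * B` line stays unchanged.
# using MKL
# println(BLAS.get_config())
```

The point of the sketch is that user code such as `A * B` does not mention the backend at all, so nothing in a model definition should need to change.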

The big question to ask is whether this approach limits us. Well, we have a maturing set of ML models and implementations of non-standard deep learning, scientific machine learning, transformers, etc. We have also focussed on ensuring we don’t lose out on performance. We are able to scale training to multiple GPUs with the likes of DaggerFlux, FluxMPI, ResNetImageNet, etc. as well.

Another point is that it isn’t as if there is no complexity. With our goal of general differentiable programming, we need to support a wide variety of code patterns, including those not possible or not easily accessible in TF/PyTorch. Zygote/Diffractor, ChainRules, etc. already have a lot of effort invested in them.
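To illustrate the kind of code pattern meant here, a minimal sketch that differentiates an ordinary Julia function containing a loop. It assumes Zygote.jl is installed; the function `f` is a made-up example, not something from Flux:

```julia
using Zygote

# Plain Julia code with a loop; nothing here is written specially for AD.
function f(x)
    acc = zero(x)
    for i in 1:3
        acc += x^i      # accumulates x + x^2 + x^3
    end
    return acc
end

# `gradient` returns a tuple with one entry per argument of `f`.
df = gradient(f, 2.0)[1]
println(df)
```

Analytically the derivative is 1 + 2x + 3x^2, so the printed value can be checked by hand at x = 2.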

There are some things that differ, for sure. We don’t really have many utility functions in Flux; for most of them we encourage using Julia directly, or the packages available to us, for a better, more complete implementation.

The binding thought is that we don’t necessarily need all the complexity to be competitive (we do have it in places that require it), thanks in large part to Julia itself. Something as simple as built-in performant arrays means we don’t have to worry about a NumPy and how it interacts with the rest of the system. As I mentioned earlier, it’s not as if there is no complexity, but we focused that complexity in a different area, so that user code is simple, extensible, flexible and fast.

Hope this helps


I don’t see any fundamental reason why Flux should be more difficult for a beginner than PyTorch or TensorFlow. Maybe it is a matter of documentation? What did you find difficult to grasp? Do you have any suggestions on what to improve?


Thanks a lot. It was indeed very helpful and insightful to understand the thoughts and aims behind this package. I really love this package as an ML developer and would like to contribute to it, so I wanted to understand the aim and philosophy behind it (so that any ideas I have for this library are not out of place with its core philosophy), as it is unlike the deep learning frameworks in Python. Thanks again for the detailed answer, which helps me understand the deeper purpose of this library and its approach.
Will surely contribute in the coming months as I explore it more and train more models.


Have you heard of FastAI.jl? It has more “batteries” included and builds on top of Flux.


For beginners, I recommend using Knet.jl for deep learning over Flux.jl.
It is simpler, more robust, in most cases faster, and has no issues replicating things from PyTorch and TensorFlow.


Generally speaking, Flux operates at roughly the same abstraction level as PyTorch. In that sense, I don’t think there’s less being “done behind the hood”. One thing to remember is that the main Flux repo is not a monolith: core functionality is encapsulated in libraries such as NNlib.jl for better modularity and re-use across the community. Contrast this with most Python ML frameworks, which are effectively their own isolated islands.
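For a sense of the abstraction level, here is a small sketch of defining and calling a model. It assumes a recent Flux release that accepts the `in => out` form for `Dense`; the layer sizes and batch size are arbitrary:

```julia
using Flux

# A two-layer perceptron, comparable in spirit to a small torch.nn.Sequential.
model = Chain(
    Dense(4 => 8, relu),
    Dense(8 => 2),
)

# Flux expects features along the first dimension and the batch along the last.
x = rand(Float32, 4, 16)
y = model(x)
println(size(y))

# Where does `relu` live? Expected to be NNlib rather than Flux itself.
println(parentmodule(relu))
```

The last line is there only to show the modularity point: the activation function is re-exported by Flux but defined in a separate package.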

That makes sense to me, but is it because of the available functionality or other factors (abundance of help resources, overall popularity, etc.)? We literally have a tracking issue to compare against PyTorch, so if you see any glaring omissions please let us know.

Compared to something like (tf.)Keras, both Flux and PyTorch would be “low level”. However, here too there are good options. Want a sklearn-like interface? Check out MLJFlux.jl (FluxML/MLJFlux.jl on GitHub), which wraps deep learning models from Flux.jl for use in the MLJ.jl toolbox. Want a Lightning-like framework? Check out FastAI.jl (FluxML/FastAI.jl on GitHub), a repository of best practices for deep learning in Julia inspired by fastai, as others have mentioned.

It’s great to have new folks willing to engage with the project. I hope the points above have helped clarify some of your questions. I think the main takeaway is that while the API is not that far off what you’d expect in Python land, the philosophy of what code goes where is, and that comes from the broader Julia ecosystem. This is the key point when working with ML in Julia: instead of asking “is there X for Flux/Knet/MLJ/etc?”, ask “is the existing library for X compatible with AD/GPU/MLJModelInterface/etc?”.


I agree with the sentiment I read from this thread. Coming from PyTorch/TF to Flux can feel like a bit of a culture shock since it seems so minimal. A lot of functionality in Julia comes from the composability of packages. In Python, functionality is often well documented in the package’s own documentation. Take, for example, the torchvision.transforms page of the Torchvision 0.11.0 documentation.
This documentation describes the image transformations implemented in torchvision, gives basic usage examples, and refers to the standard PIL. Of course you can do exactly the same tasks in Julia, but since it’s not a core topic for Flux, the documentation is blank. If you search for “image” or “transform” in the docs you won’t find anything similar. This gives the impression that Flux is minimal.

Some suggestions to improve the situation:

  • Integrating the documentation of popular Julia packages, e.g. Flux and Images.jl, together with simple use-case scenarios in the documentation would go a long way.
  • Moving the tutorials to something like an interactive Binder (or Pluto notebook, if that exists) would be great for people to try them out on the fly.
  • A cheat sheet translating from Keras or PyTorch to Flux may also be helpful.

The idea was to have this documented as part of the Flux Ecosystem page. That page, the Flux Tutorials page and the model zoo are meant to provide examples of how to “do X in Y” while integrating with other Julia packages. Ideas and contributions for improving the aforementioned areas would be very welcome.


Is there a reason why the tutorials and the description of the ecosystem are not integrated into the documentation? When I’m using the Flux documentation to implement models, it would be helpful to have examples from the tutorials, and demonstrations of how to integrate other packages with Flux, close by. To do this now, you have to leave the Flux documentation and go back to the original website. That feels like a break in pace, since the two have different layouts and the docs always open in a new tab.

I’m looking at this from a user perspective. Demonstrating how Julia ‘includes everything but the kitchen sink’, through all-inclusive documentation, would show new programmers how to approach composability in Julia.

Going further, I would really like to hear from people who have taught courses using Julia. What were the conceptual difficulties when students approached Julia? And how can the language be made more approachable?


A lot of the creakiness of the documentation/tutorials likely comes from a lack of support/PRs for making the docs better. Making good documentation is really difficult, and I’d say most of the OSS community much prefers focusing on code contributions (myself included). I think all of the changes you propose would be amazing for the ecosystem, and I’m sure PRs would be more than welcome! Blogs, tutorials, and demonstrations of how to use the Julia ecosystem are always useful too.

Part of the issue is that Flux.jl is still not at v1.0, and some systems are ripe for radical changes. This may make maintaining any new documentation/tutorials a large undertaking, as interfaces can change rapidly until v1.0 arrives.

I have only taught one course in Julia (really just as a TA focusing on assignment creation), but I have mentored a bunch of people on using Julia for research (I use it as my daily driver, as my projects aren’t well suited to libraries like PyTorch/TensorFlow, and JAX didn’t exist when I was moving from C++). The main conceptual hurdle is always “object-oriented” vs “multiple dispatch” as core design principles. This is mostly overcome through examples, the Julia documentation, and practice designing APIs using multiple dispatch. I’ve found it is pretty easy for students to learn not to rely on a single package having everything they need once they realize that everything Flux and most Julia packages do is held up by the main language (or another package written mostly in Julia), and not by an obfuscated C layer they can’t see.
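A toy sketch of the kind of example that tends to help with this hurdle. All type and function names here (`Shape`, `Circle`, `Square`, `overlaps`) are made up for illustration and use only base Julia:

```julia
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Square <: Shape
    side::Float64
end

# The method is chosen from the types of *all* arguments,
# not from a single receiver object as in class-based OO.
overlaps(a::Circle, b::Circle) = "circle-circle check"
overlaps(a::Circle, b::Square) = "circle-square check"
overlaps(a::Square, b::Circle) = overlaps(b, a)   # reuse the method above
overlaps(a::Shape, b::Shape) = "generic fallback"

println(overlaps(Circle(1.0), Square(2.0)))
println(overlaps(Square(2.0), Circle(1.0)))
println(overlaps(Square(1.0), Square(2.0)))
```

In a class-based design the circle-square case has to live on one of the two classes; here it is simply one more method of `overlaps`, and another package could add methods for its own shape types without touching this code.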

None of these are really specific to Flux.jl though, and are just a part of learning a new language.


It should be easy to link the tutorials on the website from the documentation. Documenter supports linking to external sites through the right pane, iirc. As long as we make the tutorials easier to discover and can point users to the right place, we can improve the narrative around “where to find X”, IMO.
