Flux 0.3, now with 100% more Julia!

Flux v0.3 is a big release, with a brand new, pure-Julia backend for machine learning. This lets us immediately support all of Julia’s most powerful features, without having to compile them to a more limited backend like MXNet. It’s also a huge simplification, with a codebase that’s half the lines and much more straightforward.

Going forward, it will be much easier for us to apply our own optimisations and integrate with Julia’s rapidly growing GPU ecosystem. The housing data example shows how simple it is to drop CuArrays (or GPUArrays.jl) into a model for an easy speedup, with more integration coming shortly.

Of course, you don’t need a GPU to get started, and trying it out is as simple as


Check out the new docs and enjoy!


Is there a model for CNN?

Not just yet – CNN support is a work in progress, but should be pretty easy to add. I’ll finish it up in the next couple of weeks if no one beats me to it with a PR :slight_smile:


Compare & contrast Flux with Knet?

The main difference right now is that I’ve written Flux to avoid some issues with Knet’s current interface. This enables it to provide a higher-level, Keras-like API (stacking layers, reusing common layers like LSTM, etc.). Knet is looking to fix those things too, and Deniz, Jarrett and I have been working together to figure out the One True Interface, so things will converge in future.
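
To make "stacking of layers" concrete, here is a minimal plain-Julia sketch of the idea. It does not use Flux at all: `dense` and `chain` are hypothetical helpers written for this illustration, and the layer sizes are arbitrary.

```julia
# Hypothetical dense layer: a closure over its own weights and bias.
function dense(nin::Integer, nout::Integer, σ = identity)
    W = randn(nout, nin)
    b = zeros(nout)
    return x -> σ.(W * x .+ b)
end

# Hypothetical helper: apply the given layers one after another, left to right.
chain(layers...) = x -> foldl((acc, layer) -> layer(acc), layers; init = x)

# Stack two layers into one callable model and run it on dummy data.
model = chain(dense(10, 5, tanh), dense(5, 2))
y = model(rand(10))
```

The point of the sketch is only that a model is an ordinary function built from smaller ones, so common layers can be reused without rewriting the forward pass each time.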

Longer term, Flux will focus on traditionally “static” graphs and optimisations – e.g. pre-allocating memory, scheduling, parallelism and smart batching. The eventual hope is that you can mix and match these different approaches without making the traditional tradeoffs.

Knet is of course pretty robust, full-featured and performant, whereas Flux is still in alpha, so that’s another important consideration if you just want to get things done right away.


I just tagged 0.3.2 with new GPU support. Enjoy! Just don’t benchmark it just yet :wink:


I didn’t see anything about how to compile Julia such that one can use CuArrays, is there documentation on this somewhere? I’m also a bit confused about the implementation details. For instance, how do I tell it to use MXNet or TensorFlow as a back-end?

For using CuArrays you need a source build of Julia (see https://github.com/JuliaGPU/CUDAnative.jl#installation)

Oh, I figured Julia would need some special compile flags or something.

In this version I’ve removed the TF/MXNet backends as I want to focus on making native Julia support really good. I plan to bring them back in some form in future, likely as a separate package.


Thanks! I will be interested in testing your CNN model.

Are convolutional layers still in the works?

Currently we have implementations of convolutions for both CPU and GPU, but they are not merged yet.

For CPU, we have an efficient implementation in C++ for Linux, but not for Windows. We are looking for options for compiling it on Windows or implementing them in pure Julia. See FluxML/NNlib#2 for full discussion.

For GPU, there’s an implementation in CuConv.jl which I’m going to merge into CUDNN.jl after a few more important functions from NVIDIA’s API are implemented.

@dfdx: From reading those threads it looks like Flux and Knet are sort of converging - is that a correct interpretation?

Flux and Knet use a lot of common code - something most Python frameworks fail to provide - but their philosophy is still different.

@dfdx, Could you elaborate on that?

Flux is simpler and more layer-oriented. You have a set of predefined layers that you can chain together to obtain the architecture you want. Here’s an example of an RNN from the Flux docs:

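# assumes `rnn`, a recurrent cell function, is already defined
x = rand(10)  # dummy input
h = rand(5)   # initial hidden state

m = Flux.Recur(rnn, h)  # wrap the cell together with its state

y = m(x)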
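
For readers who want to run something without installing Flux, the following is a rough plain-Julia sketch of the same idea. The cell `rnn` and its parameters are assumptions made for illustration, and `MyRecur` is a hypothetical stand-in written here, not Flux's actual `Recur` implementation.

```julia
# Parameters of a simple recurrent cell: 10 inputs, 5 hidden units (assumed sizes).
Wxh = randn(5, 10)
Whh = randn(5, 5)
b   = randn(5)

# The cell maps (hidden state, input) to (new hidden state, output).
function rnn(h, x)
    h = tanh.(Wxh * x .+ Whh * h .+ b)
    return h, h
end

# Hypothetical stand-in for a recurrent wrapper: stores the cell and its current state.
mutable struct MyRecur{F,S}
    cell::F
    state::S
end

# Calling the wrapper advances the stored state and returns only the output.
function (m::MyRecur)(x)
    m.state, y = m.cell(m.state, x)
    return y
end

x = rand(10)  # dummy input
h = rand(5)   # initial hidden state

m  = MyRecur(rnn, h)
y1 = m(x)     # first step, starts from h
y2 = m(x)     # second step, starts from the state left by the first
```

Because the wrapper hides the state, the caller just feeds inputs one at a time, which is what makes the layer-oriented style feel simple.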

Knet is more low-level, but gives you more control. The analogous example from Knet:

function predict(w, s, x)
    x = x * w[end-2]
    for i = 1:2:length(s)
        (s[i],s[i+1]) = lstm(w[i],w[i+1],s[i],s[i+1],x)
        x = s[i]
    end
    return x * w[end-1] .+ w[end]
end
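
The snippet above relies on an `lstm` function and on a particular layout of `w` and `s` that are not shown. The sketch below fills those in with plain Julia so the function can be run on its own. The `lstm` cell, the gate ordering, and all sizes are assumptions chosen for illustration and do not come from Knet itself.

```julia
# Logistic sigmoid, applied elementwise below.
sigm(z) = 1 / (1 + exp(-z))

# Assumed LSTM cell in the row-major layout the example implies:
# `input` is batch × features, `hidden` and `cell` are batch × hsize,
# `weight` is (features + hsize) × 4hsize and `bias` is 1 × 4hsize.
function lstm(weight, bias, hidden, cell, input)
    gates   = hcat(input, hidden) * weight .+ bias
    hsize   = size(hidden, 2)
    forget  = sigm.(gates[:, 1:hsize])
    ingate  = sigm.(gates[:, hsize+1:2hsize])
    outgate = sigm.(gates[:, 2hsize+1:3hsize])
    change  = tanh.(gates[:, 3hsize+1:end])
    cell    = cell .* forget .+ ingate .* change
    hidden  = outgate .* tanh.(cell)
    return hidden, cell
end

function predict(w, s, x)
    x = x * w[end-2]                 # input projection
    for i = 1:2:length(s)            # one (hidden, cell) pair per LSTM layer
        (s[i], s[i+1]) = lstm(w[i], w[i+1], s[i], s[i+1], x)
        x = s[i]
    end
    return x * w[end-1] .+ w[end]    # output projection plus bias
end

# Hypothetical sizes: one LSTM layer, 8 input features, 4 projected features,
# 6 hidden units, batch of 3.
nfeat, nproj, nhid, nbatch = 8, 4, 6, 3
w = Any[randn(nproj + nhid, 4nhid), zeros(1, 4nhid),  # LSTM weight and bias
        randn(nfeat, nproj),                          # w[end-2]
        randn(nhid, nfeat), zeros(1, nfeat)]          # w[end-1] and w[end]
s = Any[zeros(nbatch, nhid), zeros(nbatch, nhid)]     # hidden and cell state
x = rand(nbatch, nfeat)

scores = predict(w, s, x)
```

Here every weight and state array is passed in explicitly, which is the extra control (and extra bookkeeping) that the lower-level style trades for.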

Of course, you can build higher-level functions in Knet or implement your own layers in Flux, but that’s not the normal workflow - you just choose whichever fits your mental model better. That’s what I call “different philosophy”.

There are also some technical differences. For example, Knet doesn’t work on Windows and (presumably) requires CUDA, which may not be available on your machine. I also believe these libraries use somewhat different approaches to automatic differentiation.

Note that I’m not (much) involved in the development of either of these libraries, so it’s better to consult their authors if you want more information. For me, they’re just two different libraries, and two is definitely better than zero :slight_smile:


Thank you so much! I love Flux’s straightforward, elegant manner without sacrificing flexibility. Looking forward to v0.3 and CNNs! Cheers :slight_smile:


Just released Flux 0.4 with Conv2D support. It works on the CPU now, and GPU support will follow shortly. Enjoy!