I’m pleased to announce the early alpha release of Flux, a Julia interface for machine learning.
Flux gives you the best of both worlds. Like Knet or PyTorch, Flux code is easy to reason about: it behaves like plain Julia, so you get control flow, good error messages, and stack traces, and you can even step through models with Gallium. Unlike those frameworks, Flux can still compile models to TensorFlow or MXNet in the background, meaning you don’t have to sacrifice state-of-the-art performance.
Those features – combined with intuitive mathematical syntax and first-class recurrent models to sweeten the deal – make us hopeful that Flux can become a great pedagogical tool, and perhaps even the best way to explore complex new architectures.
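To give a flavour of that syntax, here’s a rough sketch of defining a small model and handing it off to a backend. Treat the exact names (`Chain`, `Affine`, `mxnet`) as illustrative – the docs are the source of truth for what actually works in the alpha:

```julia
using Flux

# Illustrative sketch only -- names may not match the released alpha exactly.
# A simple two-layer model, written as ordinary Julia values:
m = Chain(
  Affine(10, 20), σ,
  Affine(20, 5), softmax)

# The same model can be handed to a backend for performance, e.g. MXNet:
model = mxnet(m)
y = model(rand(10))
```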
We have some way to go, but this is a solid start, and you can check out what works so far in the docs. Over the coming weeks we’ll have blog posts with more details and examples on what Flux can do. Enjoy!
Awesome, looks great! I can’t wait to try it out! I’m trying to learn sequence-to-sequence training with an RNN – is there any example code for getting started on that? Any plans to include an example in Flux?
@TravisA9 As it happens, RNNs are something I want Flux to be really good for, as they’re typically harder to do in existing frameworks. We have a char-rnn example here which you should be able to run out of the box, as well as a walkthrough of that example here. Over time we’ll have examples of more general RNN structures like encoder-decoders, so stay tuned.
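If it helps to see the shape of the computation in the meantime, here’s a hand-rolled RNN cell in plain Julia – nothing Flux-specific, purely illustrative; the char-rnn example wraps this kind of thing up properly:

```julia
# Purely illustrative: the core of an RNN step, written by hand in plain Julia.
struct RNNCell
  Wxh::Matrix{Float64}
  Whh::Matrix{Float64}
  b::Vector{Float64}
end

# One step: combine the current input with the previous hidden state.
(c::RNNCell)(h, x) = tanh.(c.Wxh * x .+ c.Whh * h .+ c.b)

# Unrolling over a sequence is just a loop over ordinary Julia code.
function run(c::RNNCell, h, xs)
  for x in xs
    h = c(h, x)
  end
  return h
end
```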
It’s fun to train char-rnn on the Base Julia corpus and use it to generate Julia code. There’s some example output here, and it comes up with some fun names. I can also send a trained version of that model if anyone wants it.
My goal is to make this all very easy to work with, so please let me know if you run into any issues and I’ll do my best to get you up and running!
@Tem_Pl So what I actually meant here was the kind of “high-level” control flow that, for example, TensorFlow gives you with tf.while_loop and tf.cond; that is, choosing which parts of the network to apply to data, looping over data sources, etc. That’s fairly easy to do on its own.
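Concretely, that kind of control flow is just ordinary Julia – here’s a tiny sketch with made-up stand-ins (`model`, `fallback`, and `data_source` are hypothetical placeholders, not Flux API):

```julia
# Stand-ins so the sketch runs; in practice these would be real models and data.
model(x)    = x ./ sum(x)
fallback(x) = zeros(length(x))
data_source = [rand(5) for _ in 1:3]

# Choosing which part of the network to apply is an ordinary if/else...
classify(x) = (y = model(x); maximum(y) > 0.5 ? y : fallback(x))

# ...and looping over a data source is an ordinary for loop -- no tf.while_loop needed.
for x in data_source
  println(classify(x))
end
```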
The other side is doing things like writing a custom kernel using devectorised Julia code and seamlessly plugging that into a model, which I think is more what you’re referring to. Julia is obviously uniquely well suited to doing this, although a few pieces need to fall into place first.
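For example, a “custom kernel” in that sense might just be a devectorised Julia loop like the one below – illustrative only; plugging something like this into a model seamlessly is the part that still needs work:

```julia
# Illustrative only: an elementwise op written as an explicit, devectorised loop
# rather than as whole-array operations.
function leakyrelu!(out::AbstractVector, x::AbstractVector, α = 0.01)
  @inbounds for i in eachindex(x)
    out[i] = x[i] > 0 ? x[i] : α * x[i]
  end
  return out
end

x = randn(10)
y = leakyrelu!(similar(x), x)
```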
There’s a lot of exciting work going on in this area; for example, CUDAnative.jl and GPUArrays.jl both give Julia access to GPUs and I believe the former can already be used to write custom kernels for MXNet. Alongside that, we have projects like ReverseDiff.jl for optimisation and autodiff, and Dagger.jl for distributed parallelism. Basically, I don’t think it will be too long before TensorFlow-like libraries can be assembled from modular pieces of the Julia ecosystem, with all the flexibility and power that brings.
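To make the autodiff point concrete, ReverseDiff can already take gradients of plain Julia functions, along these lines (check its README for the exact API):

```julia
using ReverseDiff

# Gradient of an ordinary Julia function with respect to both arguments.
f(W, x) = sum(W * x)
W, x = rand(3, 3), rand(3)
∇W, ∇x = ReverseDiff.gradient(f, (W, x))
```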