Do you actually use FluxML to get work done?


Hello. I am a student studying tensor methods for deep learning. I decided to try Julia with FluxML yesterday, and so far I have run into two problems of the type “FluxML tracked array doesn’t compile when used with this standard library function” and also a sneakier, nastier problem of the type “it compiles and runs, but the computed gradients are incorrect”.

This experience suggests that FluxML is very raw and not ready for people to do real work with. But that’s not the impression I got from reading this forum. So… if you actually do things using FluxML, and not just for the purpose of tinkering with FluxML, raise your hand and tell us what you use it for.


These are pretty vague error reports, so there’s very little we can do to help you. If you’re interested in solving whatever problem you’ve run into, you’ll be much more likely to get useful help if you can describe your problem in a scientific, reproducible way.

But, to your question, yes, I did all of the machine learning work for the final year of my PhD thesis in Flux.jl, so it can certainly be used for real work.
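
On the reproducibility point: for a report of the “gradients are incorrect” kind, one common way to make it concrete is to compare the gradient from the AD tool against a central finite-difference estimate on a tiny function. Below is a minimal, framework-independent sketch in plain Python. The helper names `numerical_grad` and `max_abs_diff` are made up for this illustration and are not part of Flux; `analytic_grad` stands in for whatever gradient your AD tool returns.

```python
def numerical_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of the gradient of f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two gradients."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


# Toy function with a known gradient, used here in place of a real model.
def f(x):
    return x[0] ** 2 * x[1] + 3 * x[1]


# Stand-in for the gradient reported by the AD tool (an assumption of this sketch).
def analytic_grad(x):
    return [2 * x[0] * x[1], x[0] ** 2 + 3]


x0 = [1.5, -2.0]
err = max_abs_diff(numerical_grad(f, x0), analytic_grad(x0))
print("max abs difference:", err)
```

A small difference suggests the two gradients agree at that point; a large one, together with the function and the input, is the kind of self-contained example that can be pasted straight into an issue.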


I’ve been using it to design a CNN that is constrained to anatomy in medical imaging. It’s fairly basic because most of my ML experience has been in traditional stats methods until now. However, its actual neuron-to-neuron connections are pretty unique, so it would have required a lot more work to create the custom layers in TensorFlow.


Which library function?

Did you do everything you could to make it run instead of actually fixing your types? That would obviously break the gradients, and it sounds like that’s what happened. Of course, this will happen in any machine learning framework where one bypasses the backward pass to fix an error.


I use Flux for AD (not ML, but Bayesian MCMC). Like all software (including software I write), it has issues, but when I encounter one, I make a reproducible example, and report the issue, as this is the best way to get it fixed.

Please don’t take this the wrong way, but perhaps you misunderstand some basic things about free software. Most of these projects are collaborative efforts, and are in constant… flux (sorry, I could not resist :sunglasses:).

There is no clear-cut division between “users” and “developers”, since to make the most of free software, you become a contributor to a certain extent: for example, reporting well-documented issues helps a lot.

Consequently, polling people about whether they find a particular package useful to “get work done” is not really informative, as all the users implicitly accept that they have to get their hands dirty from time to time. They do get work done, and part of that work is contributing to other packages.

If you need reverse-mode AD, I would say that Flux is currently one of the three best options in the Julia ecosystem. There are WIP projects that will probably end up being better (Capstan and Zygote), and packages that are quite robust but usually slower (ReverseDiff), but Flux is a very useful package.


PhillipB, I use it every day where I work. There are challenges with it, but on the whole it does what I need, in the language I prefer, and lets me be expedient.


To all the people who want me to describe the problems thoroughly: I reported them as issues on FluxML’s GitHub before starting this discussion.


Thank you for doing this! :slight_smile:

For myself, I’m using Flux for my own personal research (namely spiking neural networks and dynamic model generation), and haven’t had any serious, insurmountable issues so far. I also hope to use it at work at some point, since I’ve found it to be quite robust.


I use it for my research, no problem.

I find it easier to use than TensorFlow. I have not tried PyTorch, but I like Flux because it is bare-bones simple and I can train complex models easily.

I use it for fairly large-scale learning in intrusion detection with multi-instance learning. I had tried to do the same thing before in TF and it was painful. What killed my TF attempts was the poor performance of TF on sparse arrays (last tried two years ago). Thanks to the Julia community, we have added a custom sparse-dense multiplication, and it was about 5 times faster than TF.

To conclude, yes, I do use Flux in real work and I like it.


In my opinion it’s easier than any of the Torch implementations. Torch requires a good deal of converting, shuffling, and reshaping just to do some basic stuff. With Flux I can almost literally write down an equation and have it come to life. Pretty stellar. That being said, they need to fix their 1-D convolutions before I can use it again :smiley:


Well, I was working with Tracker last year. Honestly, I don’t like implementing tracked types by subtyping (sometimes they fall back to the wrong methods), but it works fine in general and gets my work done, though I ran into a few issues as well.

I’m working with Zygote now, which is much better (there will be a few glitches, of course; it’s a brand-new package). But if you have been using PyTorch with customised data structures (like complex-valued ones), as I have, I’m pretty sure Zygote is the best (not just better) solution you can find in the world!

Are you doing deep architectures with Zygote? I’ve run into some issues differentiating through some custom structs with closures. Which interface of Zygote are you using? I’m trying to integrate Zygote into my Bayesian deep learning package and run into issues now and then that I lack the AD domain knowledge to fix myself.


Would you mind opening corresponding issues on the Zygote repo? I’d like to switch soon for deep architectures and it would be helpful to start working these kinks out.


No, it’s about differentiating quantum circuits. But you should open issues in Zygote; it’s at a very early stage, and we need more people to open bug reports to stabilize it!


I’m indeed opening issues in the repo. I would like to be able to fix them too, but unfortunately 75% of the code in Zygote is black magic to me :joy:

Bayesian deep learning, seems promising!

Any ETA? I’m very interested in trying Bayesian neural nets on my problems with limited sample numbers, as they have been shown in the literature to generally perform better on small problems, but I just don’t have the time to implement one…