Flux vs Knet for research and production

I’m new to AI in Julia, and to AI in general, but I still want to make the best decision, or at least know the reason for the choice. What are the advantages and disadvantages of both packages for production and for research? I have heard, and seen on some old blogs and Reddit, that Flux is more Keras-like; does that limit it in any way?
Thank you for your advice.

I find Knet.jl to be much better than Flux.jl.
While the latter gets all the attention and hype, Knet.jl just works.
I find it simpler and more stable. It doesn’t try to solve a bigger problem than being a good DL framework.

As far as I know, Flux and Knet share a lot of code through CUDA.jl and NNlib.jl, so to me the biggest difference seems to be the AD engines, and which API fits your personal preference better.
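To make the AD-engine difference concrete, here is a minimal sketch (my own toy example, not taken from either package’s docs) of taking a gradient with Zygote.jl versus AutoGrad.jl on plain Julia arrays:

```julia
using Zygote      # the AD behind Flux
using AutoGrad    # the AD behind Knet

f(x) = sum(abs2, x)                          # toy scalar-valued function

# Zygote: call gradient directly on (almost) arbitrary Julia code
gz, = Zygote.gradient(f, [1.0, 2.0, 3.0])    # returns a tuple of gradients

# AutoGrad: wrap inputs in Param, record a tape with @diff, query with grad
x  = AutoGrad.Param([1.0, 2.0, 3.0])
y  = AutoGrad.@diff f(x)
ga = AutoGrad.grad(y, x)

gz ≈ ga ≈ [2.0, 4.0, 6.0]                    # both agree on this simple case
```

For simple functions like this they behave the same; the differences show up in how far you can stray from predefined primitives.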

For my research, Flux with Zygote.jl is ideal, since Zygote can differentiate pretty much any Julia code and works with ChainRules.jl (which makes it easy to implement custom surrogate gradients, e.g. for binary neural networks or feedback-alignment training). As far as I understand, Knet, which uses AutoGrad.jl, can only backprop through functions with a predefined gradient (see the section “Extending AutoGrad”: https://github.com/denizyuret/AutoGrad.jl), so it is a bit less flexible. Depending on your use case this may or may not matter :slight_smile:
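As a rough illustration of the surrogate-gradient point (a sketch under my own assumptions, not code from any particular paper), a straight-through estimator for a hard binarization can be expressed as a ChainRules.jl rule, which Zygote then picks up automatically:

```julia
using ChainRulesCore

binarize(x) = sign.(x)   # forward: hard sign, whose true gradient is zero almost everywhere

# Straight-through estimator: backprop as if the forward pass were a clipped identity
function ChainRulesCore.rrule(::typeof(binarize), x)
    y = binarize(x)
    binarize_pullback(ȳ) = (NoTangent(), ȳ .* (abs.(x) .<= 1))  # surrogate gradient
    return y, binarize_pullback
end
```

With this rule in place, `Zygote.gradient(x -> sum(binarize(x)), randn(5))` returns the surrogate gradient instead of zeros.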

In terms of performance, I think Knet used to be faster, but I am not sure if that is still the case now that they both rely on CUDA.jl.

Another, less talked-about option is Avalon.jl, and there is also BetaML.jl (more for learning/pedagogical purposes). See also IBM’s AutoMLPipeline.jl and Lale.jl…

If you’re new to AI, note that it’s broader than machine learning (ML)/deep learning/neural networks. For that, if you’re new, I hate to say this, but I might want to start learning with Python (I would likely do that, then switch to Julia; my knowledge is more theoretical; see my threads here, e.g. under Offtopic and the Machine Learning category, on what I know is being done, or not done, with Julia). People are doing research with Python (the heavy lifting, though, is in C++, which researchers may not need to see/know).

If, however, you want scientific ML (SciML), then Julia is the go-to language. But SciML is an exception/non-traditional ML, and the main person behind it, Chris Rackauckas, isn’t sure Julia has a killer advantage for regular ML speed-wise (and it’s probably still catching up feature- and documentation-wise, possibly fast).

ML is a broad area too; e.g. clustering is part of it, as are random forests and other decision-tree methods. Those may be as good in Julia as elsewhere. I’m not sure what’s clearly better in Julia currently, for production or research, except for SciML, and Pumas.AI, which is based on SciML and used in production.

Reinforcement learning (RL) is yet another part of AI. Again, most work happens elsewhere, though some is also done in Julia.

Peter Norvig, who literally wrote the book on AI (and is a director at Google), would want Julia as the main language for AI. But it isn’t yet. If I recall correctly, LeCun (at Facebook AI), another big name, has also noticed Julia.

DeepMind, which is behind much of the state-of-the-art (SOTA) AI/ML research, has no Julia repos on its GitHub (IBM and Microsoft have some, and the SLIM group has lots of interesting research), and neither does Salesforce (or Numenta, which I would watch), which I wouldn’t have thought of for AI, but they recently published something (transformer-based) that is SOTA:

https://github.com/salesforce/BLIP

Transformers seem to be the future for almost any NN, taking over from CNNs, though CNNs, or hybrids of the two, also seem to be coming back. There are also new dendritic networks to combat “catastrophic forgetting”, biologically plausible and based on pyramidal neurons; see recent work from Numenta. Spiking neural networks are not yet mainstream (even though the brain is of that type), but there is some such work done in Julia, likely research. I’ve not heard of any such networks used in production, but I might be wrong.

Can you explain what you mean by

As @Rasmus_Hoier has said,

Is that true?
Which one is more flexible?

That’s the exact point. Flux.jl + Zygote.jl tries to be very flexible and general.
In practice, it means more corner cases.

Knet.jl is more focused and does what it aims to do well.
If what’s implemented in Knet.jl is enough for you, go with it.
If you need ultra flexibility and are willing to pay for it with some corner cases, go with Flux.jl.
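To give one concrete (and admittedly toy) example of that trade-off, Zygote will differentiate straight through ordinary Julia control flow with no framework-specific constructs, which is exactly the kind of code that also tends to expose corner cases:

```julia
using Zygote

function unrolled(x, n)
    acc = zero(x)
    for i in 1:n                       # plain loop and branch, nothing special
        acc += isodd(i) ? sin(x) : x^2
    end
    return acc
end

g, = Zygote.gradient(x -> unrolled(x, 5), 0.3)
# analytically: 3cos(0.3) + 4*0.3, since three odd and two even terms contribute
```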

Thank you for answering; just one question. I know that in engineering, corner cases are generally malfunctions when multiple parameters reach extremes. How does that manifest here? Does it mean the code may not work when it has few resources, or that if too many neurons fire (give high values) it may break?
I think I’m misunderstanding that completely. Is it possible to do research with Knet.jl? It seems (at least to me) that it is more for production.

For me, Knet.jl is like early PyTorch. As long as you use the given building blocks, everything works. It is more limited, but better defined.
So when you say research: if you can work within the given building blocks, then it suits you. If you need total freedom, it is less suitable.
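For flavor, the “given building blocks” style looks roughly like this (adapted from the pattern in the Knet.jl tutorials; I haven’t checked it against the latest release, so treat it as a sketch):

```julia
using Knet   # param, param0, relu and nll are standard Knet exports

# A layer is just a callable struct holding Knet Param arrays
struct Dense; w; b; f; end
Dense(i::Int, o::Int, f=relu) = Dense(param(o, i), param0(o), f)
(d::Dense)(x) = d.f.(d.w * x .+ d.b)

# Compose layers with plain Julia; classification loss via Knet's nll
layers = (Dense(784, 64), Dense(64, 10, identity))
predict(x) = foldl((h, l) -> l(h), layers; init = x)
loss(x, y) = nll(predict(x), y)

# A training step then records a tape with @diff and reads gradients with grad,
# i.e. the AutoGrad.jl workflow mentioned earlier in the thread.
```

As long as you stay inside this kind of pattern, things are predictable; the constraint is mostly on how exotic your model and loss code can get.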

I think you can see it from the design goals and choices of each project.
By the way, if I remember correctly, Avalon.jl has the same approach as Knet.jl in that regard.

Thank you, that makes sense. I don’t need the freedom but I want it, as I don’t like being constrained. Plus it seems interesting to be able to see what doesn’t work.
Thanks again for the help.