I agree that if you really want to learn how things work, you should go with either Knet or Flux. TensorFlow and MXNet have really complicated code bases. One of the wonderful things about Julia is that the need for a big, complicated machine learning framework with a huge code base pretty much disappears. I’m not too familiar with how Knet works, but Flux is pretty much just normal Julia objects collected in a convenient place. It’s really a thing of beauty next to TensorFlow and MXNet, and I hope that the machine learning community comes to love its extreme simplicity in the years to come.
Despite all the hype, deep learning (and to some extent, machine learning more generally) is based on a quite simple and quite old idea: just take some function with a huge number of parameters and try to fit it to your conditional distribution. Naively, it’s somewhat surprising that this is a viable approach; the central “miracle” behind it is explained here. The rest of the subject involves finding the appropriate ansatz to fit (e.g. the multi-layer perceptron, the whole zoo of convolutional nets, recurrent nets). You may be disappointed to learn that the people coming up with all the clever little variations generally don’t seem to know why they work better or worse than any others (in many cases I would imagine this would require a deeper understanding of what generates the underlying distribution), so as a practical matter, working on deep learning involves a lot of experimentation and just doing “whatever works”. For this you’ll probably be much happier with something like Flux than with a behemoth like TensorFlow, for which you’ll constantly need to dig through reams of documentation.
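To make the “function with a bunch of parameters, fit to data” idea concrete, here is a minimal sketch in plain Python with no framework at all. Everything in it is my own assumption for illustration: the toy target (`sin(x)`), the one-hidden-layer tanh ansatz, the hidden size `H`, the learning rate and the step count. It fits the conditional mean with a squared-error loss and hand-written gradients, which is only the simplest instance of the idea and not how Flux or Knet do it internally.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

# Toy data: noiseless samples of sin(x) on [-2, 2] (an assumed stand-in target).
xs = [-2.0 + 4.0 * i / 39 for i in range(40)]
ys = [math.sin(x) for x in xs]

# Ansatz: f(x) = sum_j v[j] * tanh(w[j] * x + b[j]) + c
H = 8  # number of hidden units (arbitrary illustrative choice)
w = [random.gauss(0, 1) for _ in range(H)]
b = [random.gauss(0, 1) for _ in range(H)]
v = [random.gauss(0, 0.5) for _ in range(H)]
c = 0.0


def predict(x):
    """Evaluate the parametrised function at x."""
    return sum(v[j] * math.tanh(w[j] * x + b[j]) for j in range(H)) + c


def loss():
    """Mean squared error over the toy data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


initial_loss = loss()

lr = 0.02  # learning rate (assumed; small enough to keep this toy stable)
for step in range(3000):
    gw, gb, gv, gc = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
    for x, y in zip(xs, ys):
        h = [math.tanh(w[j] * x + b[j]) for j in range(H)]
        err = sum(v[j] * h[j] for j in range(H)) + c - y
        scale = 2.0 * err / len(xs)  # d(loss)/d(prediction) for this sample
        gc += scale
        for j in range(H):
            gv[j] += scale * h[j]
            dpre = scale * v[j] * (1.0 - h[j] ** 2)  # chain rule through tanh
            gw[j] += dpre * x
            gb[j] += dpre
    # Plain full-batch gradient descent step on every parameter.
    for j in range(H):
        w[j] -= lr * gw[j]
        b[j] -= lr * gb[j]
        v[j] -= lr * gv[j]
    c -= lr * gc

final_loss = loss()
print("loss before:", initial_loss, "after:", final_loss)
```

Swapping in a different ansatz only means changing `predict` and the matching gradient lines, which is the kind of experimentation I mean. An automatic differentiation library removes the need to write those gradients by hand.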
(Since Discourse pinged me to say that so many people had clicked on the IB paper link, I wanted to make sure the interesting rebuttal paper kindly linked by @dave.f.kleinschmidt was also visible in the same place. In my assessment, IB likely has a very significant role to play in explaining the success of deep learning, but it’s early days and it’s good to be aware of all the facts.)