State of machine learning in Julia

The first one was extremely simple, yet very hard to debug. Zygote couldn't provide a gradient for the sqrt function because I was applying it to a matrix with zeros in it. Here is a GitHub issue with more details. Technicalities aside, sqrt is such a common function that a bug in it will surely affect a large number of users.
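To give a rough idea of why zeros are a trouble spot for sqrt, here is a minimal sketch in plain Julia (no AD library involved). It is not taken from the GitHub issue and may not match the exact failure reported there. The helper names `dsqrt` and `sqrt_pullback` are hypothetical and only mimic what a reverse-mode rule does: multiply the upstream gradient by the local derivative 1 / (2 * sqrt(x)).

```julia
# Analytic derivative of sqrt: the local factor a reverse-mode rule multiplies by.
# (Hypothetical helper for illustration, not a Zygote function.)
dsqrt(x) = 1 / (2 * sqrt(x))

# Hypothetical pullback: upstream gradient times the local derivative.
sqrt_pullback(x, upstream) = upstream * dsqrt(x)

x = [0.0, 4.0]

# The local derivative at x = 0 is 1 / 0, which is infinite.
local_grads = dsqrt.(x)

# If the upstream gradient for that entry is zero, the product is 0 * Inf,
# which is NaN under IEEE floating point rules.
masked = sqrt_pullback.(x, [0.0, 1.0])

println(local_grads)
println(masked)
```

A single NaN or Inf entry like this can then propagate through the rest of the backward pass, which is part of what makes the problem hard to trace back to its source.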

The other (and current) one is more obscure and still under investigation. For some reason, Zygote is giving me gradients that are all zero (a case that should probably raise an error instead), while another AD library, ReverseDiff, works. The problem is that I need Zygote for my model to backpropagate at an acceptable speed; ReverseDiff is just too slow for my case.
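One generic way to confirm that an all-zero gradient is wrong, independent of any AD library, is to compare it against central finite differences. The sketch below is plain Julia and makes several assumptions: `fd_gradient` is a hypothetical helper, `loss` is a toy stand-in for the real model, and `suspect` is a hard-coded zero vector where, in practice, one would put the result of something like `Zygote.gradient(loss, x)[1]`.

```julia
# Central finite-difference gradient of a scalar function f at x.
# (Hypothetical helper for illustration; slow, but independent of any AD library.)
function fd_gradient(f, x::AbstractVector{<:Real}; h = 1e-6)
    g = zeros(Float64, length(x))
    for i in eachindex(x)
        step = zeros(Float64, length(x))
        step[i] = h
        g[i] = (f(x .+ step) - f(x .- step)) / (2h)
    end
    return g
end

# Toy loss standing in for the real model (an assumption for this sketch).
loss(x) = sum(abs2, x)

x = [1.0, -2.0, 3.0]
reference = fd_gradient(loss, x)

# A suspicious all-zero gradient, like the one described above.
suspect = zeros(length(x))

# false here means the AD result disagrees with the numerical reference.
agrees = isapprox(suspect, reference; atol = 1e-4)
println(agrees)
```

Finite differences need two function evaluations per parameter, so this is only practical on a small slice of the model, but it helps narrow down which part of the computation loses its gradient.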

These bugs are problematic for the average user because (1) Zygote is pretty hard to debug, and (2) fixing them requires very specific skills that only the library developers have. Luckily, everyone in the Julia community is extremely helpful and nice. But I really wish I could handle more things by myself, just by reading more complete documentation or by getting more meaningful errors.
