Is there any difference in the time taken by Julia code and Python code to complete 100 epochs?
For example, suppose I’m trying to do MNIST classification for 100 epochs. I wrote the code in Python using Keras and in Julia using Flux. Will there be any difference in the time the two programs take to complete these 100 epochs?
In general, this is a complicated question with a bunch of nuance, but here is a TLDR version of an answer.
There are three types of Deep Learning problems:
- Problems that require user code/control flow to solve
- Problems where efficient vectorized solutions are well known
- Problems where efficient vectorized solutions exist but are less well known
The first is where Julia shines. If you wanted to run something like AlphaZero, Python wouldn’t be a great language for it, since generating the training data (the bottleneck) would be an order of magnitude slower. This also applies to a lot of research in areas like neural PDEs, among others.
For the second case, you will likely not see much difference. Here both Julia and Keras end up calling libraries like cuBLAS, which already do the heavy computation about as well as it can be done, so everything will be roughly a wash.
The third case is interesting and complicated. A performant method may or may not exist in a Python framework. If it doesn’t, it will likely be easier to implement one in Julia. If it does, however, Julia may not be as fast, because ML in Julia is much newer and therefore not always as mature.
Overall, cases where Julia is slower than Python are generally treated about as seriously as bugs, and are always fixable (although fixes sometimes take a while; fixable doesn’t always mean easy). Over time, I would expect the number of cases where Julia is slower than Python to drop, and the number where it is faster to rise. This is because Julia at its core is fast, so ML in Julia doesn’t work around the language but embraces it.
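To make the gap between the first two cases a bit more concrete, here is a rough sketch in plain Python (standard library only). It does not compare Julia with Python; it only contrasts an explicit interpreted loop with the same arithmetic pushed into built-ins implemented in C, as a loose proxy for "user control flow" versus "a library does the work". The function names and input sizes are made up for illustration, and the timings will vary by machine.

```python
import operator
import time

def dot_loop(xs, ys):
    # Case 1 style: an explicit loop, executed step by step by the interpreter.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_builtin(xs, ys):
    # Case 2 style: the same arithmetic, with the loop run inside C-implemented built-ins.
    return sum(map(operator.mul, xs, ys))

def time_call(fn, *args, repeats=5):
    # Best-of-N wall-clock time in seconds.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Illustrative inputs; the sizes are arbitrary.
xs = [0.5 * i for i in range(100_000)]
ys = [0.25 * i for i in range(100_000)]
print(f"explicit loop: {time_call(dot_loop, xs, ys):.4f}s")
print(f"built-ins:     {time_call(dot_builtin, xs, ys):.4f}s")
```

The built-in version is usually the faster of the two, but the ratio depends on the interpreter version and hardware. When the hot loop cannot be expressed through library calls like this, the speed of the language itself starts to matter.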
To add to the previous answer, performance also depends on the exact library, the exact model, and even the stage (training / inference). Here are some of my recent benchmarks; note how different frameworks perform better in different settings.
Thank you for the link. Your deep learning work is very interesting and the benchmark is a good one; it is clearly better on CPU.
In response to @Eldho.Ashna: in my tests using images (so CNNs), Keras and PyTorch are currently slightly more competitive than Flux on GPU, but the difference is not large, even with PyTorch, which I think is the fastest. Personally, I prefer the simplicity of the Flux API (followed by PyTorch). Keras also has good documentation, but the PyTorch API seems better to me (more Pythonic).
I recommend trying it and seeing for yourself.
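If you do try it yourself, a small timing harness along these lines may help keep the comparison fair. This is only a sketch: `time_epochs` and `dummy_epoch` are hypothetical names, and `dummy_epoch` is a stand-in for one real pass over the training data. The warm-up epoch is excluded from the measurement because the first call tends to be unrepresentative (Julia compiles code on first use, and GPU frameworks often do one-time setup), so the same idea applies on the Julia side.

```python
import statistics
import time

def time_epochs(train_one_epoch, n_epochs=100, warmup=1):
    """Return per-epoch wall-clock times in seconds, excluding warm-up epochs."""
    for _ in range(warmup):
        train_one_epoch()  # not timed: absorbs one-time compilation/setup cost
    times = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        times.append(time.perf_counter() - start)
    return times

def dummy_epoch():
    # Hypothetical stand-in for a real epoch (e.g. one pass over MNIST).
    sum(i * i for i in range(10_000))

epoch_times = time_epochs(dummy_epoch, n_epochs=5, warmup=1)
print(f"total: {sum(epoch_times):.4f}s, "
      f"median per epoch: {statistics.median(epoch_times):.4f}s")
```

On a GPU, work is queued asynchronously, so the epoch function should wait for the device to finish before the timer stops (in PyTorch, for instance, by calling `torch.cuda.synchronize()`); otherwise the numbers can look better than they are.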