Flux has come a long way

I was following the Keras presentation at Google I/O 2021 by Martin Gorner and François Chollet (the creator of Keras) about a convolutional variational autoencoder (VAE). As a way to learn about VAEs, I re-implemented the notebook in Flux.jl.
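
This is not the code from the post (that is attached further down), just a minimal sketch of what the encoder half of such a convolutional VAE looks like in Flux.jl, assuming the same layout as the Keras example (two strided 3×3 convolutions, a 16-unit dense layer, and a 2-D latent space). The names (`encoder_features`, `encode`, ...) are illustrative:

```julia
using Flux

latent_dim = 2

encoder_features = Chain(
    Conv((3, 3), 1 => 32, relu; stride = 2, pad = 1),   # 28×28×1  -> 14×14×32
    Conv((3, 3), 32 => 64, relu; stride = 2, pad = 1),  # 14×14×32 -> 7×7×64
    Flux.flatten,
    Dense(7 * 7 * 64 => 16, relu),
)
encoder_μ      = Dense(16 => latent_dim)   # mean of q(z|x)
encoder_logvar = Dense(16 => latent_dim)   # log-variance of q(z|x)

# Reparameterisation trick: z = μ + exp(logvar / 2) ⊙ ε with ε ~ N(0, I),
# so sampling stays differentiable with respect to μ and logvar.
function encode(x)
    h      = encoder_features(x)
    μ      = encoder_μ(h)
    logvar = encoder_logvar(h)
    ε      = randn(Float32, size(μ))       # on the GPU one would draw ε with CUDA.randn
    z      = μ .+ exp.(logvar ./ 2) .* ε
    return z, μ, logvar
end

# KL divergence between q(z|x) and the standard normal prior, averaged over the batch.
kl(μ, logvar) = -0.5f0 * sum(1f0 .+ logvar .- μ .^ 2 .- exp.(logvar)) / size(μ, 2)

x = rand(Float32, 28, 28, 1, 16)           # stand-in mini-batch of 16 "images"
z, μ, logvar = encode(x)
size(z)                                    # (2, 16)
```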

I ran into a small issue with asymmetric padding (CuDNN: Support for asymmetric padding? · Issue #128 · JuliaGPU/CUDA.jl · GitHub).
My workaround was to do some manual cropping.
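
As a sketch of the idea (not necessarily the exact workaround in the code below): `pad = SamePad()` on these transposed convolutions would request the asymmetric padding that CuDNN rejects, so one can instead run them with no padding, which over-produces by one pixel in each spatial dimension, and then slice the output back to the size that "same" padding would have given:

```julia
using Flux

latent_dim = 2

decoder = Chain(
    Dense(latent_dim => 7 * 7 * 64, relu),
    x -> reshape(x, 7, 7, 64, :),
    ConvTranspose((3, 3), 64 => 64, relu; stride = 2),  # 7×7   -> 15×15 (no padding)
    x -> x[1:14, 1:14, :, :],                           # manual crop -> 14×14
    ConvTranspose((3, 3), 64 => 32, relu; stride = 2),  # 14×14 -> 29×29 (no padding)
    x -> x[1:28, 1:28, :, :],                           # manual crop -> 28×28
    Conv((3, 3), 32 => 1, sigmoid; pad = 1),            # stride 1: symmetric padding is fine
)

z = randn(Float32, latent_dim, 16)                      # a batch of 16 latent codes
size(decoder(z))                                        # (28, 28, 1, 16)
```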

In the end, I got results equivalent to the Keras implementation from François Chollet: the model has exactly the same number of parameters, and the results are very similar, though not identical, since the networks are initialised randomly.
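
For reference (this is not from the original post), one way to compare parameter counts is to sum the lengths of all trainable arrays in Flux; the Keras counterpart is `model.count_params()`. The `encoder_*` and `decoder` names refer to the sketches above:

```julia
using Flux

# Total number of trainable parameters in a Flux model.
n_params(m) = sum(length, Flux.params(m))

n_params(Chain(encoder_features, encoder_μ, encoder_logvar)) + n_params(decoder)
```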

But to my surprise, Flux.jl (2.6 seconds per epoch, excluding compilation) was about twice as fast as Keras with the TensorFlow back-end (5.05 seconds per epoch) on the same hardware (a GeForce RTX 3080). The Keras notebook shows two ways to implement a VAE, and both have the same performance. Admittedly, MNIST is a very small dataset, and it would be interesting to see how the two frameworks compare on larger datasets.
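
For what it's worth (this is not necessarily how the timings above were taken), GPU kernel launches in Julia are asynchronous, so a fair per-epoch timing should synchronise before reading the clock. A minimal stand-in example, with a hypothetical small model in place of the VAE:

```julia
using Flux, CUDA

model = gpu(Dense(784 => 10))            # hypothetical stand-in for the VAE
x     = gpu(rand(Float32, 784, 128))

t = @elapsed begin
    y = model(x)                         # work is queued asynchronously on the GPU
    CUDA.synchronize()                   # wait for all queued kernels to finish
end
@info "forward pass" seconds = t
```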

As I have posted before about issues with reaching similar performance between Flux.jl and TensorFlow, I thought this would be an interesting follow-up.

Here is my Flux.jl code:

Here is the Keras code:

Let me know if there is a problem with my Flux.jl code, but at least the Keras implementation should be good! :grinning:

Kudos to the Flux and CUDA developers!

Nice to hear. I also tried Flux recently and it was pretty decent. But then I was just fitting a GLM, and `glm` in R is fast enough for that.

I am wondering whether `glm` in R uses the GPU, or is a CPU fast enough for your case?

CPU. But the CPU version is fast enough, the same speed as my GPU implementation. Although for such a small dataset, a GPU is not necessary anyway.