An implementation of ResNet-18 uses a lot of GPU memory

I think Julia isn't using the GPU efficiently. It seems to spend more time moving data around than actually computing.
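To make that concrete, here is a minimal sketch of the kind of timing I have in mind (assuming CUDA.jl, Flux's `gpu`, and Metalhead's `ResNet(18)` constructor; this is not my actual training script):

```julia
using Flux, CUDA, Metalhead   # Metalhead assumed here, just to get a ResNet-18

model = ResNet(18) |> gpu                  # assumption: Metalhead's ResNet(18) constructor
x_cpu = rand(Float32, 224, 224, 3, 64)     # one dummy batch, WHCN layout, batch size 64

CUDA.@time x_gpu = gpu(x_cpu)              # time to copy the batch host -> device
CUDA.@time model(x_gpu)                    # time for the forward pass itself
```

If the first `CUDA.@time` dominates, the bottleneck is host-to-device copies rather than the model itself.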

[Screenshot: GPU usage while training with Julia/Flux]

Edit: I did a test run with TensorFlow and got this result:

Epoch 10/10 time=5.79 mins: step 7800 total loss=0.8476 loss=0.4077 reg loss=0.4400 accuracy=0.7989

The batch size was 64. That is much faster than my Flux run.
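For reference, the Flux side of the comparison is roughly a loop like the sketch below, simplified and with dummy data. The `ResNet(18; nclasses = 10)` constructor, `Flux.setup`, and the explicit-gradient API are my assumptions about a recent Flux/Metalhead setup, not a copy of my exact script:

```julia
using Flux, CUDA, Metalhead

model = ResNet(18; nclasses = 10) |> gpu                 # assumed Metalhead constructor
opt_state = Flux.setup(Adam(1f-3), model)                # explicit optimiser state (recent Flux)

train_x = rand(Float32, 224, 224, 3, 512)                # dummy images, WHCN layout
train_y = Flux.onehotbatch(rand(1:10, 512), 1:10)        # dummy one-hot labels
loader  = Flux.DataLoader((train_x, train_y); batchsize = 64, shuffle = true)

for epoch in 1:10
    for (x, y) in loader
        x, y = gpu(x), gpu(y)                            # move only the current batch to the GPU
        grads = Flux.gradient(m -> Flux.logitcrossentropy(m(x), y), model)
        Flux.update!(opt_state, model, grads[1])
    end
end
```

What I want to compare is whether batch-by-batch `gpu(...)` transfers in a loop like this explain both the memory use and the speed gap against TensorFlow.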