Knet.jl CNN Tutorial Speed

Knet’s tutorial.ipynb trains a CNN on MNIST data. I have no GPU, but my quad-core i7 has been pegged for a day, stuck on `cnnmodels = [ copy(train!(wcnn, dtrn)) for epoch=1:100 ];`.
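For what it’s worth, timing a single epoch first gives a rough estimate of what the full run will cost; something like this sketch, reusing the `wcnn` and `dtrn` already defined in the notebook:

```julia
using Printf
# Assumes wcnn and dtrn are defined as in tutorial.ipynb.
t = @elapsed train!(wcnn, dtrn)                       # wall time for one epoch
@printf("one epoch: %.0f s, ~%.1f h for 100 epochs\n", t, 100t/3600)
```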

Is this exceptionally long training time to be expected?

See https://github.com/ilkarman/DeepLearningFrameworks for some benchmarks. Knet’s CNN kernels were never optimized for CPUs. This is not a high priority because you can’t train anything serious on a CPU. Amazon has GPUs for 10-20¢ an hour at spot prices if you want to experiment.
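If you want to confirm whether Knet found a GPU at all, `gpu()` should tell you (as far as I remember, -1 means no usable device and everything falls back to the CPU kernels); a minimal check:

```julia
using Knet
# Knet.gpu() returns the active device id, or -1 when no usable GPU is found.
if Knet.gpu() < 0
    @warn "No GPU detected; Knet will use its (slow) CPU kernels."
end
```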

Well, I suppose “serious” can mean different things to different people. For a student just being introduced to ML, “serious” means being able to complete simple ML homework on his/her laptop.

I agree. I wrote the current CPU kernels for such students. All code that works on GPU should work on CPU in Knet, which usually takes a lot more effort on our part: replicating all the functionality of the cudnn kernels with barely any documentation (I just spent the last two weeks trying to replicate cudnn RNNs on the CPU). However, making this fast and efficient is a different matter. You can’t train a state-of-the-art image recognizer or machine translation model on a CPU in a reasonable amount of time, even with better kernels, so I’m not sure there is high motivation for efficient CPU kernels. That said, if somebody writes them I’d be happy to integrate them :wink:
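For reference, the same model code already picks its backend from the array type; a minimal sketch using the existing `KnetArray`/`conv4` API (the sizes here are made up for illustration):

```julia
using Knet
# Choose the array type by device; the same conv4 call then dispatches to the
# cudnn kernel (KnetArray) or the CPU kernel (Array).
atype = Knet.gpu() >= 0 ? KnetArray{Float32} : Array{Float32}
w = atype(randn(Float32, 3, 3, 1, 16))   # 3x3 filters, 1 input channel, 16 output channels
x = atype(randn(Float32, 28, 28, 1, 8))  # batch of 8 MNIST-sized images
y = conv4(w, x)                          # identical code on GPU and CPU
```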