I want to train a CNN model on the German Traffic Sign Dataset by adapting one of the model_zoo examples. I first tried training on a GTX 1070 GPU with 8 GB of memory, which failed with an out-of-memory error. I then tried training on the CPU (with 16 GB of RAM), which also ran out of memory. I tried batch sizes of 64 and 16 and hit the same problem both times.
Is this model too complex for my machine?
The model is:
using Flux

model = Chain(
    # First convolution block, operating upon a 32x32 single-channel image
    Conv((3, 3), 1=>32, pad=(1,1), relu),
    Conv((3, 3), 32=>32, pad=(1,1), relu),
    BatchNorm(32),
    x -> maxpool(x, (2,2)),
    # Second convolution block, operating upon a 16x16 image
    Conv((3, 3), 32=>64, pad=(1,1), relu),
    Conv((3, 3), 64=>64, pad=(1,1), relu),
    BatchNorm(64),
    x -> maxpool(x, (2,2)),
    # Third convolution block, operating upon an 8x8 image
    Conv((3, 3), 64=>128, pad=(1,1), relu),
    Conv((3, 3), 128=>128, pad=(1,1), relu),
    BatchNorm(128),
    x -> maxpool(x, (2,2)),
    # Flatten the 4x4x128 feature maps into 2048 features per sample
    x -> reshape(x, :, size(x, 4)),
    Dense(2048, 128),
    Dense(128, 128),
    Dense(128, 43),
    # Finally, softmax to get nice probabilities
    softmax,
)
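
As a sanity check on the "too complex" question, here is a rough parameter count for the layer sizes listed above, in plain Julia with no Flux dependency. The helper names (`conv_params`, `dense_params`, `bn_params`) are made up for this sketch, and it assumes Float32 weights, 32x32x1 inputs, and two trainable values per BatchNorm channel. It only estimates the weights, not the activations or whatever the data loading code allocates.

```julia
# Rough size estimate for the layer shapes listed above (no Flux needed).
# Assumptions: Float32 storage, 32x32x1 inputs,
# BatchNorm counted as 2 trainable values (scale, shift) per channel.

conv_params(k, cin, cout) = k * k * cin * cout + cout   # kernel weights + biases
dense_params(nin, nout)   = nin * nout + nout           # weight matrix + biases
bn_params(ch)             = 2 * ch                      # scale and shift

total_params =
    conv_params(3, 1, 32)   + conv_params(3, 32, 32)   + bn_params(32) +
    conv_params(3, 32, 64)  + conv_params(3, 64, 64)   + bn_params(64) +
    conv_params(3, 64, 128) + conv_params(3, 128, 128) + bn_params(128) +
    dense_params(2048, 128) + dense_params(128, 128)   + dense_params(128, 43)

# Spatial size after three 2x2 poolings: 32 -> 16 -> 8 -> 4,
# so the flattened feature vector has 4 * 4 * 128 entries.
flat_features = 4 * 4 * 128

# Memory taken by the weights alone, in MiB
param_mib = total_params * sizeof(Float32) / 2^20

println("parameters:             ", total_params)
println("flattened features:     ", flat_features)
println("parameter memory (MiB): ", round(param_mib, digits=2))
```

If this arithmetic is right, the weights come to roughly 571 thousand parameters, only a few MiB, and the flattened size matches the `Dense(2048, 128)` input. That would suggest the weights themselves are not what exhausts memory, though the estimate says nothing about how the dataset is loaded or batched.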