An implementation of ResNet-18 uses a lot of GPU memory

OTapio · March 30, 2020, 3:31pm

I ran a couple of experiments, and ConvTranspose seems to be the culprit here. Just compare the performance of these two models:

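using Flux

# Model 1: starts with a ConvTranspose layer.
# `n` (kernel size and stride of the ConvTranspose) must be defined beforehand.
m = Chain(
  ConvTranspose((n, n), 3 => 3, stride = n),
  Conv((7,7), 3=>64, pad = (3,3), stride = (2,2)),
  MeanPool((7,7)),
  x -> reshape(x, :, size(x,4)),  # flatten everything except the batch dimension
  Dense(512*32, 10),
  softmax,
) |> gpu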

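# Model 2: no ConvTranspose; an extra Dense layer maps to the same 512*32 features.
m = Chain(
  Conv((7,7), 3=>64, pad = (3,3), stride = (2,2)),
  MeanPool((7,7)),
  x -> reshape(x, :, size(x,4)),  # flatten everything except the batch dimension
  Dense(256, 512*32),
  Dense(512*32, 10),
  softmax,
) |> gpu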
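
For anyone trying to reproduce the comparison: the input size and `n` are not given above, but the `Dense` input sizes (512*32 and 256) are consistent with 32×32×3 images and `n = 7`. That is an inference from the layer sizes, not something stated in the post. Here is a rough sketch in plain Julia (no Flux needed) that checks the arithmetic under those assumptions and prints how large the `Conv` output is per sample in each model. The helper names `conv_out` and `convtranspose_out` are made up for this illustration.

```julia
# Output length along one spatial dimension for a strided convolution
# or pooling window (standard formula, dilation 1).
conv_out(len, k; pad = 0, stride = 1) = fld(len + 2 * pad - k, stride) + 1

# Output length along one spatial dimension for a transposed convolution
# (dilation 1).
convtranspose_out(len, k; pad = 0, stride = 1) = (len - 1) * stride + k - 2 * pad

# ASSUMPTIONS (inferred, not stated in the post): 32×32 inputs and n = 7.
input_len = 32
n = 7
channels = 64   # output channels of the Conv layer in both models

# Model 1: ConvTranspose -> Conv -> MeanPool (MeanPool stride defaults to the window size)
up    = convtranspose_out(input_len, n; stride = n)
conv1 = conv_out(up, 7; pad = 3, stride = 2)
pool1 = conv_out(conv1, 7; stride = 7)
flat1 = pool1^2 * channels          # should match Dense(512*32, 10)

# Model 2: Conv -> MeanPool
conv2 = conv_out(input_len, 7; pad = 3, stride = 2)
pool2 = conv_out(conv2, 7; stride = 7)
flat2 = pool2^2 * channels          # should match Dense(256, 512*32)

# Number of values in the Conv output per sample, for each model
conv_act1 = conv1^2 * channels
conv_act2 = conv2^2 * channels

println("model 1: flattened size = ", flat1, ", Conv output values per sample = ", conv_act1)
println("model 2: flattened size = ", flat2, ", Conv output values per sample = ", conv_act2)
```

If those assumptions hold, the flattened sizes match the `Dense` layers in both models, so the two chains do fit the same input. Note that the per-sample `Conv` output also differs between the two models, which is worth keeping in mind when reading the memory numbers.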
