An implementation of ResNet-18 uses a lot of GPU memory

I’m using an RTX 2070. It took me 6 min in total. My full TensorFlow output can be seen here: https://pastebin.com/qa1Zgft3

If I use the same GPU metrics as you have above, I get the following results:

When I use TensorFlow, my GPU metrics are as follows:

I was given a hint that this might help me, though I haven’t had time to try it properly: https://juliagpu.gitlab.io/CUDA.jl/development/profiling/#Application-profiling-1