Right, sorry, it’s been a minute since I last worked with Grad-CAM. Since you only need the gradient with respect to a given layer’s activations, the easiest approach is to compute those activations first, then compute the loss from them through the rest of the model, e.g.:
```julia
# Run the model up to the target layer, then differentiate the
# rest of the model with respect to those activations.
acts = m_upto_somelayer(x)
grads = gradient(a -> loss(m_after_somelayer(a), y), acts)[1]
```
Since you’re only taking the gradient with respect to a single array here, passing it in explicitly (see Basics · Flux) is easier than using `params`.
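To make the split concrete, here is a minimal hedged sketch. It assumes the model is a Flux `Chain` that can be sliced by index; all names (`m`, `x`, `y`, the layer sizes) are illustrative, not from the original question.

```julia
using Flux

# Illustrative model: one conv layer whose activations we want
# Grad-CAM gradients for, followed by a classifier head.
m = Chain(
    Conv((3, 3), 3 => 8, relu),          # target layer for Grad-CAM
    Flux.flatten,
    Dense(8 * 30 * 30, 10),
)

# Chains support slicing, so we can split at the target layer.
m_upto_somelayer  = m[1:1]
m_after_somelayer = m[2:end]

x = rand(Float32, 32, 32, 3, 1)          # one 32×32 RGB image (WHCN)
y = Flux.onehotbatch([3], 1:10)          # dummy target class

loss(ŷ, y) = Flux.logitcrossentropy(ŷ, y)

# Forward to the target layer, then differentiate the remainder
# of the model with respect to those activations.
acts  = m_upto_somelayer(x)
grads = gradient(a -> loss(m_after_somelayer(a), y), acts)[1]

# `grads` has the same shape as `acts`; Grad-CAM would then average
# it over the spatial dimensions to get per-channel weights.
```

Note that this only works cleanly when the target layer sits at a top-level position in the `Chain`; for nested models you would need to restructure or use a different hook.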