Hello and welcome!
There are lots of reverse-mode AD libraries in Julia. You could try your hand with Zygote.jl, which is the AD used in the DL library Flux.jl:
using Zygote
Zygote.gradient((W1, W2) -> f(W1, W2, x), W1, W2)
This creates a closure (anonymous function) (W1, W2) -> f(W1, W2, x) that closes over the variable x, so that you only compute the gradient w.r.t. W1 and W2.
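For something you can paste into the REPL, here is a minimal sketch. The definition of f and the array sizes are made up for illustration (your own f, W1, W2 and x would go in their place); only the Zygote.gradient call is taken from the snippet above.

```julia
using Zygote

# Hypothetical two-layer model, assumed only for this example.
f(W1, W2, x) = sum(W2 * tanh.(W1 * x))

# Arbitrary sizes chosen so that the matrix products line up.
W1 = randn(4, 3)
W2 = randn(2, 4)
x  = randn(3)

# The closure captures x, so only W1 and W2 are differentiated.
# Zygote.gradient returns a tuple with one entry per argument.
gW1, gW2 = Zygote.gradient((W1, W2) -> f(W1, W2, x), W1, W2)

println(size(gW1))
println(size(gW2))
```

Each gradient should have the same shape as the array it was taken with respect to, and no gradient is returned for x, since it is not an argument of the closure.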