The function gradient must know what you would like to take the gradient with respect to. Typically, you’d like to take the gradient wrt the parameters of the model, which Zygote lets you do by simply passing the entire model.

You could also take the gradient wrt, for instance, the input x if you would like.

Restating: your reply clarifies that giving ‘model’ as the explicit argument of the function tells Zygote to take the gradient with respect to the parameters in model.

But what, then, is the purpose and meaning of the ‘model’ parameter that is passed to gradient in this call?
g = gradient(model -> mse(model.(x),y), model)

A different example would probably explain this better. Suppose the function has two arguments, h(x,y) = x+y.
How do you get the gradient with respect to y?
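For reference, here is one way this could look with Zygote. It is only a sketch: the evaluation point (2.0, 3.0) and the names x0, g_both and g_y are made up for illustration.

```julia
using Zygote

h(x, y) = x + y

# Passing both arguments returns one gradient entry per argument.
g_both = gradient(h, 2.0, 3.0)

# To differentiate with respect to y only, wrap h in an anonymous function
# whose single argument is y; x0 is a fixed value captured from outside.
x0 = 2.0   # hypothetical value
g_y = gradient(y -> h(x0, y), 3.0)
```

The anonymous function decides which variable is being differentiated, and the value after it is the point at which the gradient is evaluated.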

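What you wrote means the following:
g = gradient(z -> mse(z.(x),y), model)
You take the gradient of the anonymous function with respect to its input, evaluated at the point model. The name of the argument inside the anonymous function (model or z) is arbitrary; only the second argument to gradient refers to your actual model.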
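A small sketch to check this yourself. The Line struct, the hand-written mse and the data are stand-ins I made up so the snippet runs without Flux; they are assumptions, not the model from this thread.

```julia
using Zygote

# Hypothetical stand-in for the model: a callable struct with two parameters.
struct Line
    w::Float64
    b::Float64
end
(m::Line)(x) = m.w * x + m.b

# Hand-written mean squared error (assumed to behave like the thread's mse).
mse(yhat, y) = sum(abs2, yhat .- y) / length(y)

x = [1.0, 2.0, 3.0]   # made-up inputs
y = [2.0, 4.0, 6.0]   # made-up targets
model = Line(1.0, 0.0)

# Inside the anonymous function the argument name is just a placeholder,
# so both calls differentiate the same function at the same point.
g_model = gradient(model -> mse(model.(x), y), model)
g_z = gradient(z -> mse(z.(x), y), model)
```

Both results should be a one-element tuple holding the gradient with respect to the fields w and b of the model.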