I apologise if this is not the right area to ask Flux questions. If not, please direct me to the correct place.
I have been going through the Flux.jl documentation and came across this code.
julia> x = [2, 1];
julia> y = [2, 0];
julia> gs = gradient(params(x, y)) do
I understand how the do end block works, but I am struggling to understand the use and function of params. I cannot find good documentation on it. If I missed it, please point me towards it!
Thank you for your time.
I actually want to turn this question around and ask you what you found to be deficient with the existing explanation in the paragraph above that code block. This feedback could be useful for improving the docs themselves.
Hi @ToucheSir, thank you for your reply! I accept your turnaround.
One would be able to figure out how params() works given the context of that specific example, but I would just like to see some documentation on the function itself. Unless I am missing something and being unreasonable?
I’m not an authority on this, but my guess is that params returns a Params struct, and this is used for dispatch to disambiguate it from the “normal” use of gradient, which assumes that any arguments after the first are to be used as inputs to the first argument (which is always a function).
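To make the two call styles concrete, here is a minimal sketch. It assumes a Flux version that still provides the implicit-parameter API (Flux.params), and f is a stand-in function chosen for illustration, not necessarily the one used in the docs.

```julia
using Flux  # assumption: a Flux release that still provides Flux.params

# Hypothetical example function; any scalar-valued function of x and y would do.
f(x, y) = sum((x .- y) .^ 2)

x = [2, 1]
y = [2, 0]

# "Normal" (explicit) use: every argument after the function is an input
# to that function, and one gradient is returned per input.
gx, gy = gradient(f, x, y)

# Implicit use: params wraps the arrays in a Params object. gradient
# dispatches on that type and calls the zero-argument closure instead.
ps = Flux.params(x, y)
gs = gradient(ps) do
    f(x, y)
end

# The result is indexed by the arrays themselves.
same_x = gs[x] == gx
same_y = gs[y] == gy
```

In both cases the gradients are the same; the only difference is whether the arrays are passed as arguments or captured by the closure and looked up afterwards.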
One possible reason why this is not very well described is that it is considered a temporary and somewhat ugly patch to bridge how Flux used gradients to update parameters in the pre-Zygote days. I think it has survived a fair bit longer than anyone intended or hoped it would.
As for the function params, it basically just traverses the model structure and searches for arrays of numbers using Functors.jl and puts them in a
@DrChainsaw hit the nail on the head here. That said, we should add some documentation for this function, even if it just points to the tutorial. Do you mind filing an issue?
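For anyone landing here before the docs are updated, the traversal described above can be seen on a small model. This is only a sketch, assuming a Flux version that still provides Flux.params and supports the `Dense(in => out)` constructor; the model itself is made up for illustration.

```julia
using Flux  # assumption: a Flux release that still provides Flux.params

# Hypothetical two-layer model, used only to show what params collects.
m = Chain(Dense(2 => 3, relu), Dense(3 => 1))

# params walks the nested layers and gathers their numeric arrays.
ps = Flux.params(m)

# Two layers, each with a weight matrix and a bias vector.
n_arrays = length(ps)

# The collected entries are the model's own arrays, not copies.
has_first_weight = any(p -> p === m[1].weight, ps)
```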