I am just getting started with Flux.
Documentation exists, but I am having a hard time finding some information in it.
In particular, these are some questions that came up:
Should I use Zygote or the Tracker module? Which one does the latest Flux use, and has Flux already transitioned to Zygote?
Are there examples of writing my own “layer”? More precisely, I would like to use an SVD together with a derivative for it that I found in a paper.
A somewhat more detailed question: I have an operation Diagonal(w) * A, where w is an Array (a vector) and A is an Array (a matrix). This works fine without Flux, but if w contains tracked values, Julia complains that it has no implementation for the multiplication operator (*).
I’m sure I’m doing something wrong that would be obvious to a Flux expert.
… I know how to get this done in PyTorch, but I would love to try out Flux.
The current stable release of Flux uses Tracker.
The master branch of Flux in development uses Zygote.
I would not recommend using that right now if you are just starting out.
(It’s just kinda fiddly to set up.)
Thanks for answering!
I saw the basics/building layers documentation; what is unclear to me is how I can supply my own implementation of a derivative. Is that somehow linked to the forward() function? Sorry if this is obvious to some; I’m really new to Flux and Julia.
As for an example, I’ll try to extract the necessary code from my (messy) experiments.
OK, it looks like I figured out the part with the Diagonal.
I was actually using a function somewhere that was explicitly creating an Array{Float32,2} and trying to use tracked float values to fill it. There was no conversion function from tracked to untracked, which I assume makes sense.
I fixed my function.
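For anyone hitting the same thing, here is a minimal sketch of one way to avoid filling a preallocated Array{Float32,2} by hand, assuming the goal is only to scale the rows of A by w. The values of w and A are made up for illustration, and only plain arrays are used; that the same broadcast also goes through the AD system with tracked values is an expectation, not something shown here.

```julia
using LinearAlgebra

# Hypothetical example values; w scales the rows of A.
w = Float32[1, 2, 3]
A = Float32[1 2; 3 4; 5 6]

# Diagonal(w) * A multiplies row i of A by w[i] ...
scaled_matmul = Diagonal(w) * A

# ... and so does broadcasting the column vector w across the columns of A,
# without allocating and filling an Array{Float32,2} element by element.
scaled_bcast = w .* A
```

Both expressions give the same matrix, so the broadcast form can stand in for the Diagonal product wherever a hand-filled array was the problem.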
Now the actual problem seems to be that I want to write my own layer, where the function itself returns the SVD (or a value computed from it), and the gradient is something that I found in the literature. To put it simply, I need to implement a function and an accompanying derivative so that the AD system (Tracker?) can make use of it when computing gradients.
How does that work in Tracker, and where would I look to find information about that?
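A minimal sketch of what this can look like with Tracker's custom-gradient mechanism (`track` plus the `@grad` macro). Assumptions: the standalone Tracker package is loaded (with a Flux release that bundles it, the same names should be reachable under Flux.Tracker), and the function `nucnorm` is a hypothetical stand-in for "a value computed from the SVD", not the function from the paper. The derivative used is the standard one for the sum of singular values of a full-rank matrix.

```julia
using LinearAlgebra
using Tracker
using Tracker: TrackedArray, track, @grad, data

# Hypothetical example: nuclear norm (sum of singular values) of a matrix.
nucnorm(A) = sum(svdvals(A))

# Send tracked inputs through Tracker so that the rule below is used
# instead of trying to differentiate through svdvals itself.
nucnorm(A::TrackedArray) = track(nucnorm, A)

# Hand-written rule: return the value plus a closure that maps the incoming
# sensitivity Δ to a tuple holding one gradient per argument.
# For a full-rank A = U * Diagonal(S) * V', the gradient of sum(S) is U * V'.
@grad function nucnorm(A)
    F = svd(data(A))                      # work on the plain, untracked array
    return sum(F.S), Δ -> (Δ * (F.U * F.Vt),)
end

# Usage: gradient of nucnorm at a small example matrix.
A = [2.0 0.0; 0.0 3.0]
g = Tracker.gradient(nucnorm, A)[1]
```

The pattern is the same for any other function: one method that does the plain computation, one method on tracked inputs that calls `track`, and one `@grad` definition that returns the value together with the pullback closure.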
I’m still having some difficulty getting it to work, though.
I figured a few things out, but I can’t find out how to concatenate tracked Float numbers into a tracked array.
Is that possible?
I have a function that computes scalars which are tracked floats, but I want to fill a tracked array with those numbers.
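A minimal sketch of one way to do this, assuming the standalone Tracker package (under a Flux release that bundles it, the function should be reachable as Flux.Tracker.collect). The parameter `w` and the three scalar expressions are made up for illustration.

```julia
using Tracker

w = param([1.0, 2.0, 3.0])

# Scalar indexing into a tracked array yields tracked scalars, so this is
# an ordinary Vector whose elements are individually tracked.
scalars = [w[1] * w[2], w[2] + w[3], 2 * w[3]]

# Tracker.collect turns an array of tracked scalars into one tracked array
# that still remembers how each element was computed.
v = Tracker.collect(scalars)

# Gradients flow back through the collected array to w.
Tracker.back!(sum(v))
dw = Tracker.grad(w)
```

Note that this is Tracker's own `collect`, not `Base.collect`, so the call has to be qualified with the module name.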
Another thing I was wondering about: is it possible in Flux to concatenate tracked arrays and regular arrays into a new array that would then be partly tracked and partly untracked?
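A minimal sketch of how this behaves, again assuming the standalone Tracker package and made-up values. As far as I can tell, the result is not "partly tracked": concatenating a tracked array with a plain one gives a single tracked array, and the plain part simply acts as a constant that receives no gradient.

```julia
using Tracker

w = param([1.0, 2.0])   # tracked
c = [10.0, 20.0]        # plain, untracked

# The concatenation is a tracked array of length 4.
z = vcat(w, c)

# Weight each entry differently so the gradient shows which slots belong to w.
loss = sum(z .* [1.0, 2.0, 3.0, 4.0])
Tracker.back!(loss)

# Only w accumulates a gradient; c is treated as a constant.
dw = Tracker.grad(w)
```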