Need help getting started with FluxML

Hi all,

I am just getting started with Flux.
Documentation exists, but I have a hard time finding some information.
In particular, these are some questions that came up:

  • Am I to use Zygote or the Tracker module? Which one does the latest Flux use? Has Flux already transitioned to Zygote?
  • Are there examples for writing my own “layer”? More precisely, I would like to use an SVD and a derivative that I found in some paper.
  • Somewhat more detailed question: I have an operation Diagonal(w) * A, where w is an Array and A is an Array (a matrix). This works fine without Flux, but if w contains tracked values, Julia complains that it does not have an implementation for the multiplication operator (*).
    I’m sure I’m doing something wrong that is obvious for the Flux expert.

… I know how to get this done in PyTorch, but I would love to try out Flux.

The current stable release of Flux uses Tracker.
The master branch of Flux in development uses Zygote.
I would not recommend trying to use that right now, for someone just starting out.
(It's just kinda fiddly to set up.)

Docs on that are here:

Can you write up a minimal verifiable example,
and include that and the full error message in your post?

Thanks for answering!
I saw the basics/building layers documentation; what is unclear to me is how I can supply my own implementation of a derivative. Is that somehow linked to the forward() function? Sorry if this is obvious to some, I’m really new to Flux and Julia.

As for an example, I’ll try to de-solder the necessary code from my (messy) experiments.

OK, looks like I figured out the part with the Diagonal.
I was actually using a function somewhere that explicitly created an Array{Float32,2} and tried to fill it with tracked float values. There is no conversion from tracked to untracked values, which I assume makes sense.
I fixed my function.
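
For anyone hitting the same thing, here is a small sketch of the operation in plain Julia, assuming w is a vector and A is a matrix with length(w) rows (the values are made up). Diagonal(w) * A scales row i of A by w[i], so the broadcast w .* A gives the same result and never needs a pre-allocated array with a fixed element type:

```julia
using LinearAlgebra

# Made-up example data: w has one entry per row of A.
w = [1.0, 2.0, 3.0]
A = [1.0 2.0; 3.0 4.0; 5.0 6.0]

# Row scaling written with a Diagonal matrix ...
B1 = Diagonal(w) * A

# ... and the same row scaling written as a broadcast.
B2 = w .* A
```

The broadcast form lets Julia choose the element type of the result from the inputs, instead of forcing values into an Array{Float32,2}.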

Now the actual problem seems to be that I want to write my own layer, where the function itself returns the SVD (or a value computed from it), and the gradient is something that I found in the literature. To put it simply, I need to implement a function and an accompanying derivative so that the AD system (Tracker?) can make use of it when computing gradients.
How does that work in Tracker, and where would I look to find information about that?

Once again, thanks a lot for your help!

It's here:
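
Roughly, the pattern looks like the sketch below. It assumes the standalone Tracker package (in older Flux releases the same names live under Flux.Tracker). The function name nucnorm is hypothetical, and the sum of singular values is only a stand-in for whatever SVD-based quantity and derivative your paper gives; its gradient U * Vt is the standard result for a full-rank matrix.

```julia
using LinearAlgebra
using Tracker
using Tracker: TrackedArray, track, @grad, data

# Hypothetical example function: sum of the singular values of A.
nucnorm(A) = sum(svdvals(A))

# Tracked inputs are routed through Tracker, so the rule below gets used.
nucnorm(A::TrackedArray) = track(nucnorm, A)

# The rule returns the forward value and a closure that maps the
# sensitivity Δ of the output to a tuple with one gradient per argument.
@grad function nucnorm(A)
    F = svd(data(A))                      # work on the untracked data
    return sum(F.S), Δ -> (Δ * (F.U * F.Vt),)
end

A = param([2.0 0.0; 0.0 3.0])             # made-up input
y = nucnorm(A)
Tracker.back!(y)                          # backpropagate from the scalar y
g = Tracker.grad(A)                       # gradient accumulated on A
```

The important part is that the closure returns a tuple, one entry per argument of the function, and that the forward pass is computed on data(A) so nothing inside the rule gets tracked twice.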

Aside: I am super hyped about the general idea of putting SVD in NNs.


Cool thanks; that helped a lot already.

I’m still having some difficulties getting it to work though.
I figured a few things out, but I can’t find out how
to concatenate tracked Float numbers into a tracked array.
Is that possible?
I have a function that computes scalars which are tracked floats, but I want to fill a tracked array with those numbers.

Another thing I was wondering about is this: is it possible in Flux to concatenate tracked arrays and regular arrays into a new array
that would then have to be partly tracked and partly untracked?
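
To make both questions concrete, here is a sketch of what I would expect to work. It assumes the standalone Tracker package, that Tracker.collect gathers tracked scalars into one tracked array, and that vcat accepts a mix of tracked and plain arrays; the values are made up. As far as I can tell, the result of such a concatenation is tracked as a whole, and the plain part simply receives no gradient.

```julia
using Tracker

w = param([1.0, 2.0, 3.0])            # made-up tracked parameters

# Indexing a tracked array gives tracked scalars.
s1 = w[1] * w[2]
s2 = w[3] * w[3]

# Gather the tracked scalars into a single tracked array.
v = Tracker.collect([s1, s2])

# Concatenate the tracked array with a plain array.
u = vcat(v, [10.0, 20.0])

# Backpropagate through the sum to check that gradients reach w.
Tracker.back!(sum(u))
g = Tracker.grad(w)
```

If g comes out as [w[2], w[1], 2 * w[3]], the tracked scalars survived both the collect and the concatenation.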