Flux & Float64 + model deconstruction?

I’m trying to run my Flux networks from an earlier version, and have a few questions.

First, I figured out that with model m and data X, I now find predictions by m(X) instead of the old Tracker.data(m(X)) — great, I like that.
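For concreteness, a minimal sketch of the kind of setup I mean (model and sizes are just placeholders):

using Flux

m = Chain(Dense(2, 5, relu), Dense(5, 1))   # a small feedforward network
X = rand(Float32, 2, 10)                    # 10 samples with 2 features each
ŷ = m(X)                                    # predictions are now a plain array; no Tracker.data needed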

Now to my three questions.

  1. How can I tell the network (or an individual layer) that I want the parameters (weights W, biases b) to be Float64? (I’m running the code on the CPU.)

  2. How can I “deconstruct” the network and pick out the weights and biases of each layer?

  3. How can I initialize each layer with my choice of parameters? [e.g., those from another, “deconstructed” model…]

I think this exact problem is substantially easier to solve with Flux 0.10, so I’d upgrade to handle this.


I should have been more precise: I do have Flux v0.10 installed. I figured out how to deconstruct a dense/FNN model with the previous version, but had problems selecting parameter types then.

So I’m curious how to deconstruct an FNN (feedforward, i.e. Dense) network under v0.10, specify the parameter type, and populate the parameters with values of my own choosing.

Is this documented somewhere?

OK… I see that I can deconstruct the model m by taking out each layer as m[1], m[2], etc.

Next, I can find the parameters as m[1].W and m[1].b.

However, I cannot change the parameters by, say, m[1].W = rand(Float64,5,2).

QUESTION: How can I change the values of the parameters?

You can do m[1].W .= rand(Float64,5,2). With =, you’re trying to rebind the W field of m[1] to a different array, which doesn’t work because m[1] is an immutable struct. .= instead writes the values into the already existing matrix W, which does work because arrays are always mutable.
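A small sketch of the difference (layer sizes made up just for illustration):

using Flux

m = Chain(Dense(2, 5, relu), Dense(5, 1))

m[1].W .= rand(5, 2)    # OK: broadcasts the new values into the existing weight matrix
m[1].b .= zeros(5)      # same idea for the bias vector
# m[1].W = rand(5, 2)   # errors: Dense is an immutable struct, so its fields cannot be rebound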


Ah. I should have guessed that. I assume I can also change the type: if the original type is Float32, can I change it to Float64?

I don’t have a Julia installation at work (so I can’t test this at the moment), but with Flux 0.10.0 you should be able to do #1 on your list this way:

model = fmap(f64, model)

Flux’s fmap function lets you operate on all the model’s parameters at once, and the f64 function converts them to Float64. (In Flux version 0.9.0 and earlier, you can use mapleaves instead of fmap for this purpose.)

You can look at the constructors for the particular layers you’re interested in, too: a lot of them, at least, let you control how the parameters are initialized. For example, Dense has initW and initb keyword arguments that you can use here; similarly, Conv accepts the weight and bias arrays as positional arguments.

These aren’t really obvious from the documentation, so that could perhaps use a little improvement. In the meantime, short of reading the source code, you can run methods(Dense) at the REPL, for example, to see what constructors are available to you.
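For instance, something along these lines should work in 0.10 (a sketch I haven’t been able to test here; the initW/initb functions are called with the layer’s dimensions, so these closures just ignore them and return the prepared arrays):

using Flux

# parameters of your own choosing, e.g. taken from another, “deconstructed” model
myW = rand(Float64, 5, 2)
myb = zeros(Float64, 5)

# via the keyword arguments (Float64 inputs also give you a Float64 layer from the start)
layer1 = Dense(2, 5, relu; initW = (dims...) -> myW, initb = (dims...) -> myb)

# or by calling the Dense constructor on the arrays directly
layer2 = Dense(myW, myb, relu)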

I hope this helps! :slight_smile:


Nope, doesn’t work.

…and why not map(Float64,mod)?


So it seems like one cannot change the type after the network has been created…
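At least not in place: broadcasting Float64 values into the existing Float32 arrays just converts them on assignment, for example:

W = rand(Float32, 5, 2)
W .= rand(Float64, 5, 2)   # the Float64 values are converted to Float32 on assignment
eltype(W)                  # still Float32; an array's element type cannot be changed in place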

Then, how can I define the type upon creation of the network?

I believe that fmap returns a copy of the model with the updated parameters, instead of doing it in-place. If you do mod = fmap(f64, mod), are all of mod’s parameters still Float32?

Edit: this seems to work for me, on Flux version 0.10.0:

julia> using Flux
[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)

julia> m = Chain(Dense(4, 4, relu), Dense(4, 4), softmax)
Chain(Dense(4, 4, relu), Dense(4, 4), softmax)

julia> m[1].W
4×4 Array{Float32,2}:
  0.823812    -0.593816   -0.799553  -0.570861
 -0.242723    -0.529529    0.012944   0.740021
  0.00234743   0.542825   -0.627849  -0.746003
  0.640687     0.0562518   0.183272   0.300056

julia> m = fmap(f64, m)
Chain(Dense(4, 4, relu), Dense(4, 4), softmax)

julia> m[1].W
4×4 Array{Float64,2}:
  0.823812    -0.593816   -0.799553  -0.570861
 -0.242723    -0.529529    0.012944   0.740021
  0.00234743   0.542825   -0.627849  -0.746003
  0.640687     0.0562518   0.183272   0.300056

And, in fact, I’ve just learned that you can skip the fmap altogether and just call f64 on the model directly. :slight_smile:

julia> m = Chain(Dense(4, 4, relu), softmax) |> f64
Chain(Dense(4, 4, relu), softmax)

julia> m[1].W
4×4 Array{Float64,2}:
 -0.378957    0.74928    0.776147   0.804017
 -0.298902   -0.494922  -0.519403  -0.668515
  0.0772961  -0.219601   0.776379  -0.067537
  0.64462    -0.259356  -0.697054  -0.218149

Thanks, the first one worked for me too (m = fmap(f64,m)). I assume the other also works, i.e. when instantiating the network. I’ll check when I get my laptop up and running.

Additional question… the fmap function is introduced in the documentation as fmap(cu,m), where m is the model, and I understand cu is meant to move the computations to a GPU?

Can this be used to run Flux on the NVIDIA 1050 GPU in my laptop, just to experiment with things?

Pretty much, yep! I don’t have a whole lot of experience with Flux on the GPU, but I think it should work as long as you have CUDA/CUDNN installed. The gpu function is for moving models/data to the GPU: under the hood it really just calls cu if a GPU is available. And, like with f64, you should be able to skip the fmap if you want to:

julia> model = Chain(Dense(10, 3), softmax) |> gpu

You’ve probably seen it, but there’s some documentation in Flux’s GPU support docs.
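A sketch of what that looks like end to end (untested here, and the sizes are arbitrary; gpu and cpu fall back to no-ops if no GPU is found):

using Flux

m = Chain(Dense(10, 3), softmax) |> gpu   # move the model's parameters to the GPU
X = rand(Float32, 10, 8) |> gpu           # move the input data as well
ŷ = m(X)                                  # runs on the GPU when one is available
ŷ_cpu = ŷ |> cpu                          # bring the results back to the CPU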
