[ANN] Lux.jl: Explicitly Parameterized Neural Networks in Julia

Would it not be sufficient to do

ps, st = Lux.setup(rng, model)
ps = Float64.(ps)

?

That would work if ps were a flat vector, but Lux uses nested named tuples for its parameters.

It seems that the state st is even trickier, as it contains leaves of different types, e.g., training = Val{true} alongside Vector{Float32} parameters. Thus, ComponentVector(st) will not work. On the other hand, as soon as you run the model a single time via y, st = Lux.apply(model, x, ps, st), the state will be promoted to the type of x and ps anyway, so you can use that to convert instead.
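
For anyone searching later, here is a rough sketch of that workflow. This is my own illustration (not an official Lux recipe), using Functors.fmap to walk the nested NamedTuple of parameters:

using Lux, Random, Functors

model = Chain(Dense(2 => 3, tanh), BatchNorm(3))
ps, st = Lux.setup(Xoshiro(0), model)

# Walk the nested NamedTuple of parameters and convert every array leaf to Float64
ps64 = fmap(x -> x isa AbstractArray ? Float64.(x) : x, ps)

# As described above, running the model once with Float64 inputs and parameters
# also promotes the array-valued entries of the state
x = rand(Float64, 2, 4)
y, st64 = Lux.apply(model, x, ps64, st)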

The latest stable release has these functions available: Utilities | LuxDL Docs
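
If I am reading that page correctly, it turns the conversion above into a one-liner. A sketch, continuing from the ps in the previous snippet and assuming the helper is exported as f64:

# Assumed helper from the linked Utilities page; it should convert every
# floating-point leaf of the nested parameter NamedTuple to Float64
ps64 = Lux.f64(ps)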

1 Like

I love the style and theme of the website! Getting Started | LuxDL Docs could be more descriptive, though. Given that it's an introduction, focusing on the coding aspect is fine, but I think more useful and descriptive comments could be added.

1 Like

I started looking at Lux based on this thread, and I must say that I am truly appreciative of the documentation. Thanks to everyone who put in the time to work on the package and the website!

In the spirit of giving constructive feedback, there is just one suggestion I would like to make. The font used for code blocks makes very little distinction between a period and a comma. One example, from the section on “(Im)mutability” in the very first tutorial, “Julia & Lux for the Uninitiated”:

[screenshot of the code block as rendered in the docs, where the period and comma are nearly indistinguishable]

If you know any Julia, this won’t be a problem, but this tutorial is specifically aimed at people who don’t, which may result in a little (or a lot of) head-scratching.

For comparison, when I copy and paste the code to my REPL:

[screenshot of the same code pasted into the REPL, where the period and comma are clearly distinct]

In any event, I shall be directing my team members interested in learning Julia and deep-learning to the Lux docs. It does an infinitely better job at explaining things than I have been doing.

11 Likes

Some New Major Updates (up to v0.5.33)

  1. Lux now has built-in distributed training support via MPI – Distributed Data Parallel Training | Lux.jl Documentation. It is effectively a rewrite of my older (now archived) package GitHub - avik-pal/FluxMPI.jl: Distributed Data Parallel Training of Deep Neural Networks, but it also allows NVIDIA GPU communication via NCCL!
  2. SimpleChains is available as a backend (Switching between Deep Learning Frameworks | Lux.jl Documentation) for small neural networks, which means you can write everything in Lux and still use SimpleChains with just 1 additional line of code (MNIST Classification with SimpleChains | Lux.jl Documentation) – see the sketch below this list.
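
For item 2, here is a rough sketch of what that one extra line looks like, based on my reading of the linked tutorial. The adaptor name ToSimpleChainsAdaptor and the static input dimensions are taken from the docs, but the exact requirements may differ across Lux versions:

using Lux, SimpleChains, Static, Random

# An ordinary Lux model
lux_model = Chain(Dense(784 => 32, relu), Dense(32 => 10))

# The one extra line: adapt the Lux model to the SimpleChains backend
sc_model = ToSimpleChainsAdaptor((static(784),))(lux_model)

# The converted model is still used through the usual Lux API
ps, st = Lux.setup(Xoshiro(0), sc_model)
x = rand(Float32, 784, 16)
y, st = Lux.apply(sc_model, x, ps, st)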
28 Likes

That’s awesome! IIRC, one of SimpleChains.jl’s main selling points is that it can be allocation-free: how does this pair with Lux’s purely functional style?

2 Likes

It still maintains purity in the sense of no side effects and same inputs => same outputs. But yes, you can't get the full performance of SimpleChains while staying in the Lux API, or for that matter using the ChainRules API. FWIW, most SimpleChains users coming from the SciML side would use the ChainRules API (see Faster Neural Ordinary Differential Equations with SimpleChains · SciMLSensitivity.jl) rather than the fully non-allocating train_batched! API.

The Lux API is simpler to use and more people are familiar with it (since we mimicked the Flux API for layers). SimpleChains is extremely fast, but its API is somewhat non-traditional. So now users can write the model in Lux and (if possible) convert it to SimpleChains (getting the performance boost) without ever having to learn the details of writing SimpleChains code.

1 Like

Perhaps SimpleChains would be a good candidate for a DifferentiationInterface binding, which could then be accessed from Lux.

1 Like

New Additions (v0.5.36)

# `nn` (a Lux layer) and `p_true` (a vector of known parameters) are assumed to be
# defined beforehand; Tsit5, ODEProblem and solve come from the SciML stack.
using Lux, OrdinaryDiffEq

ude = @compact(; nn, p_true, solver=Tsit5(), tspan=(0.0, 1.0),
        kwargs...) do x, ps
    # Just an arbitrary UDE
    dudt(u, p, t) = nn(x, p.nn) .+ sum(p.p_true)
    prob = ODEProblem{false}(dudt, x, tspan, ps)
    return solve(prob, solver; kwargs...)
end
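
If it helps anyone reading along: the resulting ude object is an ordinary Lux layer, so my understanding is that it follows the usual setup/apply pattern. A sketch, assuming nn and p_true were defined before building ude and that u0 matches nn's input size:

using Random

u0 = rand(Float32, 2)                  # example initial condition; must match nn's input size
ps, st = Lux.setup(Xoshiro(0), ude)    # ps holds both the nn weights and p_true
sol, st_new = ude(u0, ps, st)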
12 Likes

20 posts were split to a new topic: Nested AD with Lux etc