Batching in Flux 0.11.1

Hello,
because of a dependency issue, I have to use Flux version 0.11.1.

I cannot understand how to use batches:

julia> using Flux

julia> d = Dense(5, 2)
Dense(5, 2)

julia> d(rand(Float32, 5, 64)) |> size
(2, 64)

julia> d(rand(Float32, 5, 1, 64)) |> size
ERROR: MethodError: no method matching *(::Array{Float32,2}, ::Array{Float32,3})
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at operators.jl:529
  *(::ChainRulesCore.NotImplemented, ::Any) at /home/sapo/.julia/packages/ChainRulesCore/qbmEe/src/differential_arithmetic.jl:27
  *(::ChainRulesCore.ZeroTangent, ::Any) at /home/sapo/.julia/packages/ChainRulesCore/qbmEe/src/differential_arithmetic.jl:120
  ...
Stacktrace:
 [1] (::Dense{typeof(identity),Array{Float32,2},Array{Float32,1}})(::Array{Float32,3}) at /home/sapo/.julia/packages/Flux/05b38/src/layers/basic.jl:123
 [2] (::Dense{typeof(identity),Array{Float32,2},Array{Float32,1}})(::Array{Float32,3}) at /home/sapo/.julia/packages/Flux/05b38/src/layers/basic.jl:134
 [3] top-level scope at REPL[27]:1
 [4] eval(::Module, ::Any) at ./boot.jl:330
 [5] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:86
 [6] run_backend(::REPL.REPLBackend) at /home/sapo/.julia/packages/Revise/1boD5/src/packagedef.jl:1221
 [7] top-level scope at REPL[1]:0

I’m not sure I see the connection to batching. What were you intending rand(Float32, 5, 1, 64) to signify? 64 samples with a batch size of 1?

1 batch with 5 samples and 64 features

From the slack channel: this feature was added in Flux 0.12 with this PR.

It’s not compatible with Julia 1.3.1, though. Any idea how to solve this issue? I need to stay on Julia 1.3.1 for now because of Cxx.

(nmf) pkg> add Flux@0.12.4
 Resolving package versions...
ERROR: Unsatisfiable requirements detected for package CUDA [052768ef]:
 CUDA [052768ef] log:
 ├─possible versions are: [0.1.0, 1.0.0-1.0.2, 1.1.0, 1.2.0-1.2.1, 1.3.0-1.3.3, 2.0.0-2.0.2, 2.1.0, 2.2.0-2.2.1, 2.3.0, 2.4.0-2.4.3, 2.5.0, 2.6.0-2.6.3, 3.0.0-3.0.3, 3.1.0, 3.2.0-3.2.1, 3.3.0] or uninstalled
 ├─restricted by compatibility requirements with Flux [587475ba] to versions: [3.0.0-3.0.3, 3.1.0, 3.2.0-3.2.1, 3.3.0]
 │ └─Flux [587475ba] log:
 │   ├─possible versions are: [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4, 0.11.0-0.11.6, 0.12.0-0.12.4] or uninstalled
 │   └─restricted to versions 0.12.4 by an explicit requirement, leaving only versions 0.12.4
 └─restricted by julia compatibility requirements to versions: [0.1.0, 1.0.0-1.0.2, 1.3.0-1.3.3] or uninstalled — no versions left

In Flux, the batch dimension is last. So, for image data, it follows the WHCN format. Also, you only call the model on one batch at a time, so d(x) assumes that x is a batch of data. If you want one batch of five samples with 64 features each, then you need

d = Dense(64, 2)
# apply to a single batch of size 5 w/ 64 features
d(rand(Float32, 64, 5))

# create a vector of 10 batches
# each batch is size 5 w/ 64 features
xs = [rand(Float32, 64, 5) for _ in 1:10]

# loop over the batches and apply the model
ys = []
for x in xs
  y = d(x)
  push!(ys, y)
end

# use map instead of a loop
ys = map(d, xs)

# use broadcasting
ys = d.(xs)
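
If the data is stored samples-first (one row per sample), it has to be transposed so that the features come first and the batch dimension is last. A minimal sketch, using a plain weight matrix W and bias b as hypothetical stand-ins for Dense(64, 2) so that it runs without Flux:

```julia
# Plain-array stand-in for Dense(64, 2): hypothetical weights and bias.
W = rand(Float32, 2, 64)
b = zeros(Float32, 2)

# Data stored samples-first: 5 samples (rows) with 64 features (columns).
X_rows = rand(Float32, 5, 64)

# Move the features to the first dimension so the batch dimension is last.
X = permutedims(X_rows)    # size (64, 5)

# Same computation a Dense layer with identity activation performs.
Y = W * X .+ b
size(Y)                    # (2, 5): one output column per sample
```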

Also, maybe you already have data that you need to split into batches. You can use the following:

using Base.Iterators: partition

# create 100 samples w/ 64 features each
X = rand(Float32, 64, 100)

# split into batches of size 5
xs = [X[:, i] for i in partition(1:100, 5)]
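
When the number of samples is not a multiple of the batch size, partition simply makes the last batch smaller. A small sketch of that case, assuming 103 samples (an arbitrary number picked for illustration) and using views so each batch is not copied:

```julia
using Base.Iterators: partition

# 103 samples w/ 64 features each (103 is not a multiple of 5)
X = rand(Float32, 64, 103)

# column views avoid copying the data for each batch
xs = [view(X, :, idx) for idx in partition(1:size(X, 2), 5)]

length(xs)         # 21 batches
size(first(xs))    # (64, 5)
size(last(xs))     # (64, 3): the leftover samples
```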

Lastly, the methods above use basic Julia functionality, which is usually sufficient for simple arrays of data like X. But for complex datasets, you might want to use data loaders with built-in batching like Flux.DataLoader or DataLoaders.DataLoader.

Yeah, if you need to stay on Julia 1.3.1, then you won’t be able to use Flux 0.12. There’s not much you can do.

Basically, what I need is to overload this function with this new version (taken from version 0.12):

function (a::Dense)(x::AbstractArray)
    W, b, σ = a.W, a.b, a.σ
    # reshape to handle dims > 1 as batch dimensions
    sz = size(x)
    x = reshape(x, sz[1], :)
    x = σ.(W*x .+ b)
    return reshape(x, :, sz[2:end]...)
end

How can I do it? In my tests, Julia still calls the Flux method, not mine.

You need to either add import Flux: Dense at the start of your code or write (a::Flux.Dense)(x::AbstractArray) in your definition, so that you override/extend the function in the Flux namespace.

Also, adding that override would make your code run, but I don’t think it will do what you want it to. Maybe I just misunderstood your explanation from before though.
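
A minimal sketch of the qualified-name form and of what the reshaping override does with a 5×1×64 input. FakeFlux is a hypothetical stand-in module so that the snippet runs without Flux installed. With Flux 0.11.1 the same definition would presumably be written against Flux.Dense, but that is untested here:

```julia
# Hypothetical stand-in for the package module (not the real Flux).
module FakeFlux
struct Dense{M,V,F}
    W::M
    b::V
    σ::F
end
# Original method: only works for vectors and matrices.
(a::Dense)(x::AbstractArray) = a.σ.(a.W * x .+ a.b)
end

# Override from outside the module by qualifying the type name.
function (a::FakeFlux.Dense)(x::AbstractArray)
    W, b, σ = a.W, a.b, a.σ
    # flatten all trailing dims into one batch dimension, then restore them
    sz = size(x)
    y = σ.(W * reshape(x, sz[1], :) .+ b)
    return reshape(y, :, sz[2:end]...)
end

d = FakeFlux.Dense(rand(Float32, 2, 5), zeros(Float32, 2), identity)
x = rand(Float32, 5, 1, 64)

y = d(x)
size(y)                                  # (2, 1, 64)

# every trailing index is treated as its own sample with 5 features
y[:, 1, 7] ≈ d.W * x[:, 1, 7] .+ d.b     # true
```

Here the first dimension (5) is the feature dimension and the trailing 1×64 are batch dimensions. For one batch of five samples with 64 features, the layer would have to be Dense(64, 2) and the input 64×5 (or 64×5×1).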