For example, I have an input which is a length-10 vector. I want the first neural network to output a length-3 vector based on the first 5 elements of the input, and the second neural network to output a length-4 vector based on the last 5 elements of the input. The overall output should be a length-7 vector which combines the outputs of the two neural networks. What’s the easiest way to do this? This may sound strange, but the external package I use only accepts a “single” neural network in its API, so I cannot just use the two networks separately.
One way is to use Parallel, applying vcat after the two paths, and some function to split the input before:
julia> using Flux
julia> mysplit(x::AbstractVector) = (x[1:5], x[6:end]);
julia> model1 = Dense(5 => 3);
julia> model2 = Dense(5 => 4, x->x+100);
julia> model = Chain(mysplit, Parallel(vcat, model1, model2))
Chain(
  mysplit,
  Parallel(
    vcat,
    Dense(5 => 3),                      # 18 parameters
    Dense(5 => 4, #5),                  # 24 parameters
  ),
)                   # Total: 4 arrays, 42 parameters, 376 bytes.
julia> model(ones32(10))
7-element Vector{Float32}:
-1.6361262
1.4683602
-0.70261115
100.29568
100.38945
100.91372
100.26898
julia> mysplit(x::AbstractMatrix) = (x[1:5, :], x[6:end, :]);
julia> model(ones32(10, 3)) # now accepts a batch of inputs
7×3 Matrix{Float32}:
  -1.63613    -1.63613    -1.63613
   1.46836     1.46836     1.46836
  -0.702611   -0.702611   -0.702611
 100.296     100.296     100.296
 100.389     100.389     100.389
 100.914     100.914     100.914
 100.269     100.269     100.269
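To make explicit what Chain(mysplit, Parallel(vcat, model1, model2)) computes, here is a minimal sketch in plain Julia, without Flux. The matrices `W1` and `W2` and the functions `branch1`, `branch2` and `combined` are hypothetical stand-ins for the two Dense layers, not part of Flux's API; biases and activations are left out to keep it short.

```julia
# Hypothetical stand-ins for the two networks: plain weight matrices, no bias.
W1 = ones(Float32, 3, 5)        # maps 5 inputs to 3 outputs
W2 = 2 .* ones(Float32, 4, 5)   # maps 5 inputs to 4 outputs

branch1(x) = W1 * x
branch2(x) = W2 * x

# Split the input into its first 5 and remaining elements, as in the example above.
mysplit(x::AbstractVector) = (x[1:5], x[6:end])

# Split, run each branch on its own half, then concatenate the two results.
function combined(x)
    a, b = mysplit(x)
    return vcat(branch1(a), branch2(b))
end

y = combined(ones(Float32, 10))   # length-7 vector: 3 entries from branch1, 4 from branch2
```

Parallel does the same thing when it receives a tuple: each layer gets one element of the tuple, and the connection function (here vcat) combines the results.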
The only concern would be the amount of memory allocation involved. Can the input vector be split into two allocation-free views to be fed into the two NNs? Can I pre-allocate a single output vector and ask the two NNs to each write to a specific part of the output?
Such optimisations are not impossible, but usually any nontrivial model1 & model2 will allocate far more than these simple steps anyway.
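On the input side at least, the split itself can avoid copies by returning views instead of slices. The sketch below uses only Base Julia; `mysplit_view` is a hypothetical name, and it has not been benchmarked or tested inside a Flux model here, so treat it as an illustration of the idea rather than a recommendation.

```julia
# Hypothetical view-based variant of mysplit: the input is not copied.
mysplit_view(x::AbstractVector) = (view(x, 1:5), view(x, 6:length(x)))
mysplit_view(x::AbstractMatrix) = (view(x, 1:5, :), view(x, 6:size(x, 1), :))

x = collect(Float32, 1:10)
a, b = mysplit_view(x)

# Both halves share memory with x, so a later change to x is visible through the views.
x[1] = 100f0
```

Something like this could take the place of mysplit in the Chain. Pre-allocating the output and having each network write into part of it is a different matter, since in-place writes generally do not mix well with automatic differentiation.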
Not in Flux, but in the NN module of BetaML you can use ReplicatorLayer and GroupedLayer to obtain a multi-branch neural network.
Perhaps you can use the same approach in Flux…