Hello!
I am trying to build a neural network that is mostly standard, except that its output layer should apply a sigmoid activation to the first of its four outputs and the identity to the other three.
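Concretely, for a single 4-element output column y, the mapping I want is (a toy sketch only; desired is just a name for this post):

desired(y) = vcat(NNlib.sigmoid(y[1]), y[2:end])

so that, for example, desired([0f0, 1f0, 2f0, 3f0]) returns [0.5f0, 1f0, 2f0, 3f0].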
I made a custom activation function following this thread:
custom_final_activation(x::AbstractArray{Float32,2}) = vcat(
    NNlib.sigmoid(x[1:1,:]), NNlib.identity(x[2:end,:]) # PROBLEMATIC?
)
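(In case it matters: I believe NNlib's activations are scalar functions that need to be broadcast over arrays, so perhaps the broadcast form is what I should have written. Purely for illustration, with custom_final_activation_bcast as a throwaway name:

custom_final_activation_bcast(x::AbstractArray{Float32,2}) = vcat(
    NNlib.sigmoid.(x[1:1,:]), x[2:end,:]
)

I am not sure whether that is the real problem, though.)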
I then passed this function as the activation of the final Dense layer of my model (the full (not-)working minimal example is at the end of this post):
# append the final layer to the running list of layers
push!(layers, Dense(n_intermediate, n_OUT, custom_final_activation))
# construct model
model = Chain(
    layers...
) |> gpu;
However, when I try to evaluate this model, I get the following error:
ERROR: LoadError: GPU broadcast resulted in non-concrete element type Union{}.
This probably means that the function you are broadcasting contains an error or type instability.
I suspect that I have defined my custom_final_activation function incorrectly, but I am not sure where the type instability is coming from…
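One thing I wondered: does Dense apply its activation elementwise, something like σ.(W*x .+ b)? If so, custom_final_activation would be receiving Float32 scalars rather than a matrix inside the broadcast, and the x[1:1,:] indexing would fail there. Would the right approach then be to leave the final Dense with its default identity activation and add the function to the Chain as a layer of its own? A rough, untested sketch of what I mean:

# final Dense keeps the default identity activation
push!(layers, Dense(n_intermediate, n_OUT))
# custom activation as its own layer, applied to the whole output matrix
push!(layers, x -> vcat(NNlib.sigmoid.(x[1:1, :]), x[2:end, :]))

Is that the idiomatic way to do this in Flux?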
I would appreciate any help!
Here is the full (not-)working minimal example:
using Flux
using CUDA
n_IN = 10
n_OUT = 4
n_intermediate = 128
intermediate_layers = 3
# example data
xs = rand(Float32, n_IN, 100)
ys = rand(Float32, n_OUT, 100)
# build model
layers = Any[Dense(n_IN, n_intermediate, NNlib.relu)]
for i = 1:intermediate_layers
    push!(layers, Dense(n_intermediate, n_intermediate, NNlib.relu))
end
# final layer with the custom activation
custom_final_activation(x::AbstractArray{Float32,2}) = vcat(
    NNlib.sigmoid(x[1:1,:]), NNlib.identity(x[2:end,:]) # PROBLEMATIC?
)
push!(layers, Dense(n_intermediate, n_OUT, custom_final_activation))
# chain layers together
model = Chain(
    layers...
) |> gpu;
# pull out one batch of data
const bs = 20
train_loader = Flux.DataLoader((xs, ys), batchsize=bs, shuffle=true);
batch = gpu(first(train_loader))
# evaluate the model and compute the gradient of the loss
val, grads = Flux.withgradient(model) do m
    result = m(batch[1])
    Flux.Losses.mse(result, batch[2])
end