How do I implement the Parametric ReLU (PReLU) function in Flux v0.11.1?

I want to implement the Parametric ReLU (PReLU) activation in Flux v0.11.1.
https://arxiv.org/abs/1502.01852

I’m trying to implement it in Flux based on the following post, but the version of Flux used in that post is old and much of its code no longer works in v0.11.1.

The following code was created based on the official Flux documentation and forum posts.

struct PReLU_Dense
    W
    b
    α
end

Flux.trainable(a::PReLU_Dense) = (a.W,a.b,a.α)
PReLU_Dense(in::Integer, out::Integer, α) = PReLU_Dense(randn(out, in), randn(out), α)
prelu(x, α) = x > 0 ? x : α*x 
function (m::PReLU_Dense)(x)
    prelu.(m.W * x .+ m.b, m.α)
end

Flux.@functor PReLU_Dense

m = Chain(PReLU_Dense(2, 4, 0.1),
          Dense(4,1))

This code runs, but training does not go well, so I know it is incomplete.

I am still new to the Julia language and don’t know how to fix this.
Let me know if you have any ideas.

version:
Julia 1.4.1
Flux v0.11.1

Thanks.

It should be sufficient to create a regular Dense layer that has your prelu as its activation. Creating a new layer type on your own isn’t really necessary.

PReLU_Dense(n, m, α) = Dense(randn(m, n), randn(m), x->prelu(x, α))

Now everything else should already just work.
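For reference, a minimal sketch of how this suggestion could be used end to end (assuming the prelu from your post; the data, loss, and optimiser here are made up just for illustration, and α is baked into the activation closure):

using Flux

# prelu as in the original post; α is captured in the closure
prelu(x, α) = x > 0 ? x : α * x
PReLU_Dense(n, m, α) = Dense(randn(m, n), randn(m), x -> prelu(x, α))

model = Chain(PReLU_Dense(2, 4, 0.1),
              Dense(4, 1))

x = rand(2, 8)                          # dummy batch of 8 samples
y = rand(1, 8)
loss(x, y) = sum((model(x) .- y) .^ 2)  # simple squared-error loss

Flux.train!(loss, Flux.params(model), [(x, y)], Descent(0.01))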


Thanks.
I have confirmed that there is no need to create a new layer.
I’ll paste my code below for reference.
Please let me know if you have any concerns.

struct PReLU_Dense
    W
    b
    α
end
Flux.trainable(a::PReLU_Dense) = (a.α,)
prelu(x, α) = x > 0 ? x : α*x 
PReLU_Dense(n, m, α) = Dense(randn(m, n), randn(m), x-> prelu(x, α))
Flux.@functor PReLU_Dense

Is there any way to check the value of α as a model parameter?
I know how to save the weights and other information with the code below, but I don’t know how to save the α value.

julia> using Flux

julia> model = Chain(Dense(10,5,relu),Dense(5,2),softmax)
Chain(Dense(10, 5, NNlib.relu), Dense(5, 2), NNlib.softmax)

julia> weights = params(model);

julia> using BSON: @save

julia> @save "mymodel.bson" weights
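Not sure if this is exactly what you need, but one approach that might work (a sketch, untested): save the entire model object with BSON instead of just the params, so every field of every layer, including a custom α field, gets stored. This assumes your layer keeps α as an accessible field (as in your original struct) rather than hidden inside an activation closure.

using Flux
using BSON: @save, @load

@save "mymodel.bson" model      # stores the whole Chain, fields and all

@load "mymodel.bson" model      # later: reload and inspect
model[1].α                      # α of the first layer, if stored as a field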

Sorry, @RiN, I’m afraid I completely misunderstood initially and led you the wrong way. If you want α to be a learnable parameter, you can’t just use Dense like I suggested. Your initial approach is correct.

The learning problem seems to be that Flux.params only collects mutable (array) fields, not plain scalars. See:

julia> struct PReLU_Dense
           W; b; α
       end

julia> Flux.@functor PReLU_Dense

julia> Flux.trainable(PReLU_Dense(1,2,3))
(W = 1, b = 2, α = 3)

julia> Flux.params(PReLU_Dense(1,2,3))
Params([])

When the fields are arrays instead:

julia> Flux.params(PReLU_Dense([1],[2],[3]))
Params([[1], [2], [3]])

Therefore, I think the quickest fix in your case is to make α a length-1 vector so its value is mutable (rather than making the entire layer a mutable struct to accommodate it). You will then need to modify the layer’s application function, e.g.:

function (m::PReLU_Dense)(x)
    prelu.(m.W * x .+ m.b, m.α[1])
end
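For completeness, a sketch of the whole layer put together this way (untested, but it should be close for v0.11.1):

using Flux

struct PReLU_Dense
    W
    b
    α
end

# store α as a one-element vector so that Flux.params collects it
PReLU_Dense(in::Integer, out::Integer, α) = PReLU_Dense(randn(out, in), randn(out), [α])

prelu(x, α) = x > 0 ? x : α * x

function (m::PReLU_Dense)(x)
    prelu.(m.W * x .+ m.b, m.α[1])
end

Flux.@functor PReLU_Dense

model = Chain(PReLU_Dense(2, 4, 0.1), Dense(4, 1))

Flux.params(model)   # now includes W, b and the length-1 α of the PReLU layer

Note that no Flux.trainable override should be needed here: @functor already exposes all three array fields as trainable.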

Thanks.
By storing α as a vector as you advised, I was able to confirm that it changes during training.
My deepest gratitude to you!