Using Flux I would like to apply an activation function to all but the first output unit. More precisely, I would like to constrain all but the first unit to be non-negative. How can I achieve that? Thanks.
Since in Flux, all layers and activation functions are just functions, this is straightforward to implement.
Let’s say we’re working with a Dense layer with a relu activation. Usually you would construct your layer with Dense(nin, nout, relu) so that relu is applied to every output of the layer.
We can write a custom activation layer that applies a regular activation function to all but the first output as follows:
struct PartialActivation
    activationfn
end

(pa::PartialActivation)(xs) = map((i, x) -> i == 1 ? x : pa.activationfn(x), eachindex(xs), xs)
This will apply the activationfn to all but the first element.
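As a quick sanity check, here is a small sketch of that behaviour on a plain vector. To keep it runnable without Flux installed, it uses relu_standin, a made-up stand-in for Flux's relu (an assumption for illustration only); in a real model you would pass relu itself.

```julia
# Same definition as above, repeated so the snippet runs on its own.
struct PartialActivation
    activationfn
end

(pa::PartialActivation)(xs) = map((i, x) -> i == 1 ? x : pa.activationfn(x), eachindex(xs), xs)

# Hypothetical stand-in for Flux's relu, so no packages are needed here.
relu_standin(x) = max(zero(x), x)

pa = PartialActivation(relu_standin)

# The first entry passes through untouched; the rest are clamped at zero.
out = pa([-1.0, -2.0, 3.0])
```

Note that this only covers one-dimensional input, i.e. a single sample rather than a batch.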
To use it in a model, you will have to switch from something that probably looks like
Chain(..., Dense(10, 10, relu), ...)
to
Chain(..., Dense(10, 10), PartialActivation(relu), ...)
Hope this helps and feel free to ask questions!
Thanks, but it does not seem to work for arrays of dimension larger than one. In my case Flux expects 2-dimensional arrays (number of outputs x batch size). That is, the function should be applied to all but the first row.
I am also considering circumventing the problem by adding a sufficiently large constant to the targets/responses, so that the transformation can be applied to all output variables.
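For the batched case, one possible approach is to slice off the first row and apply the activation to the remaining rows, then stack the two pieces back together. The following is only a sketch under a few assumptions: the input is laid out as (number of outputs x batch size), relu_standin is a made-up stand-in for Flux's relu so the snippet runs without any packages, and the slicing plus vcat is used because it avoids mutating arrays, which I would expect to play more nicely with automatic differentiation. I have not benchmarked it.

```julia
struct PartialActivation
    activationfn
end

# Matrix input (outputs x batch): keep row 1 as is, apply the activation to rows 2:end.
function (pa::PartialActivation)(xs::AbstractMatrix)
    vcat(xs[1:1, :], pa.activationfn.(xs[2:end, :]))
end

# Vector input (a single sample): same idea on elements instead of rows.
function (pa::PartialActivation)(xs::AbstractVector)
    vcat(xs[1:1], pa.activationfn.(xs[2:end]))
end

# Hypothetical stand-in for Flux's relu; pass relu itself in a real model.
relu_standin(x) = max(zero(x), x)

layer = PartialActivation(relu_standin)

# 3 outputs x 2 samples; only rows 2 and 3 should become non-negative.
x = [-1.0 -2.0;
     -3.0  4.0;
      5.0 -6.0]
y = layer(x)
```

It should slot into a model the same way as before, i.e. Chain(..., Dense(10, 10), PartialActivation(relu), ...).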