Flux `activations()` recomputes the forward pass

It looks like the activations() function (in basic.jl) runs the forward pass again
to obtain the activations, which would mean the forward pass is computed twice:
once for the prediction and once more for the activations.

I guess a fix for this would be to implement a custom alternative to Chain()
that propagates both the prediction and a list of the intermediate activations
(or, alternatively, the regularization term involving the activations)
in a single pass.
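A minimal sketch of what such a Chain() alternative could look like. The name `ActivationChain` is hypothetical and not part of Flux's API; plain functions stand in for layers so the example is self-contained.

```julia
# Hypothetical Chain-like container that returns both the final output
# and every intermediate activation from a single forward pass.
struct ActivationChain{T<:Tuple}
    layers::T
end
ActivationChain(layers...) = ActivationChain(layers)

function (c::ActivationChain)(x)
    acts = Any[]              # collect each layer's output as we go
    for layer in c.layers
        x = layer(x)
        push!(acts, x)
    end
    return x, acts            # prediction plus the list of activations
end

# Usage with stand-in "layers":
c = ActivationChain(x -> 2x, x -> x + 1)
y, acts = c(3)                # y == 7, acts == Any[6, 7]
```

The regularization term could then be computed from `acts` directly, avoiding the second forward pass.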

@jaynick could you open an issue on Flux?