Speed Comparison Python v Julia for custom layers

@Tomas_Pevny I tried your function, but the type assertions following Dense were giving me an error. I then realised there isn’t a fast_softmax in NNlib. Also, I don’t really see how this would drop into my code.

@ToucheSir I’ve read this blog post on SimpleChains, “Doing small network scientific machine learning in Julia 5x faster than PyTorch”, but realised that it processes the layers one after the other. To implement my layer with SimpleChains, I would either need multiple different chains, which seems like a complicated way of doing this, or I would have to implement my own layer in SimpleChains, in which case I would basically just be answering my own question. The optimisations discussed in the blog post seem to be in the direction I want, but I don’t know where to start.