Function speed up, reduce allocations

I just noticed that M is not an input of layer. It probably should be.
