Fast LogSumExp over 4th dimension

Ah, that is surprising. It’s basically my second time writing a Julia function (I’m familiar with numpy.einsum in Python, though, so that helps).

Hmm… I think then I’ll have to stick with my R version. Unless there is more improvement to be found here, the overhead of moving data from R to Julia, which isn’t captured in my R benchmarks above, erases small benefits (Fast 4D argmax - #23 by Non-Contradiction).