Apply reduction along specific axes

wow! amazing! Now I have
import StatsFuns
using BenchmarkTools
function logsumexp1(u, axes)
m = maximum(u, axes)
return m .+ log.(sum(exp.(u .- m), axes))
end
u = rand(10,100000);
v = rand(100000,10);
@btime logsumexp1($u, 2);
@btime logsumexp1($u, 1);
@btime logsumexp1($v, 2);
@btime logsumexp1($v, 1);
15.428 ms (15 allocations: 7.63 MiB)
22.213 ms (8 allocations: 9.92 MiB)
17.118 ms (18 allocations: 9.92 MiB)
15.849 ms (5 allocations: 7.63 MiB)

For comparison with scipy, I get:
%timeit logsumexp(u,0)
22.2 ms ± 98.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [103]: %timeit logsumexp(u,1)
17.7 ms ± 68.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [107]: %timeit logsumexp(v,0)
20.7 ms ± 209 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [108]: %timeit logsumexp(v,1)
24.9 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)