Flux vs PyTorch CPU performance

No. The most optimized functions are probably those provided by GLIBC on Linux, which includes some basics like exp but not tanh.
It’s open source (although GPLed) so we could translate it to Julia or LLVM bitcode.
The functions are written in x86 assembly (different versions for different instruction sets) and make use of lookup tables referenced by address, so it wouldn't be easy to translate.
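For comparison, on glibc Linux you can already call glibc's exp directly and time it against Julia's. This is only a rough sketch; the glibc_exp helper name, the "libm.so.6" library name (glibc-specific), and the array size are my own choices:

using BenchmarkTools

# glibc's hand-optimized exp, called via ccall (glibc Linux only)
glibc_exp(x::Float64) = ccall((:exp, "libm.so.6"), Cdouble, (Cdouble,), x)

x = randn(10_000)
@btime exp.($x)        # Julia's native exp
@btime glibc_exp.($x)  # glibc's assembly exp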

The biggest advantage of that would be that Windows and Mac users could benefit too, as well as people who haven't upgraded their Linux distro in years.

I thought the expm1 definition just used double-doubles (or double-singles) for a more accurate exp - 1. Maybe it isn’t dispatching to the right exp.
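For context, the reason expm1 needs that extra care is the usual cancellation problem; a quick, standard illustration (values are approximate):

x = 1e-10
exp(x) - 1   # ≈ 1.000000082740371e-10, digits lost to cancellation
expm1(x)     # ≈ 1.00000000005e-10, close to the true value x + x^2/2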

@mcabbott Are those benchmarks with Julia 1.4? You can try Julia 1.6 (or 1.5) for the fast @avx $y .= tanh.($x).
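For reference, that comparison can be run roughly like this (the array length here is arbitrary):

using LoopVectorization, BenchmarkTools

x = randn(Float32, 1024)
y = similar(x)

@btime $y .= tanh.($x)        # plain broadcast, scalar tanh per element
@btime @avx $y .= tanh.($x)   # SIMD tanh via LoopVectorization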

DhairyaLGandhi, Chris R., and I had a meeting today where we discussed vectorizing activation functions. The basic plan is a whitelist:

# For evaluating dense activation function `f` on CPU
if canavx(f)
    @avx f.(WX .+ b)   # whitelisted: SIMD-vectorized broadcast
else
    f.(WX .+ b)        # otherwise: ordinary broadcast
end

# elsewhere
canavx(::typeof(tanh)) = true
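
To make that concrete, here is a self-contained sketch of the same idea; the dense_activation helper, the array sizes, and the abs fallback are illustrative choices, not Flux's actual code:

using LoopVectorization

canavx(f) = false              # default: not on the whitelist
canavx(::typeof(tanh)) = true  # tanh is known to work well under @avx

# Apply activation `f` to the pre-activation `WX .+ b`,
# using @avx only for whitelisted functions.
function dense_activation(f, WX, b)
    if canavx(f)
        @avx f.(WX .+ b)
    else
        f.(WX .+ b)
    end
end

W, x, b = randn(Float32, 64, 32), randn(Float32, 32), randn(Float32, 64)
dense_activation(tanh, W * x, b)  # takes the @avx path
dense_activation(abs, W * x, b)   # falls back to plain broadcasting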