Lux.jl Chain of Dense layers doesn't see a speedup going from 32-bit -> 16-bit

Julia currently doesn't use native Float16 operations even when the hardware supports them; Float16 arithmetic is generally performed by converting to Float32, computing, and converting back, which adds overhead. If you were memory-bound you might see some improvement, but I doubt it. In addition, I believe very few x86 chips have native Float16 arithmetic, so your mileage may vary. See Hardware Float16 on A64fx · Issue #40216 · JuliaLang/julia · GitHub
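You can check this yourself with a quick (rough) timing sketch. The matrix size and the exact fallback path are assumptions here; the point is only that on typical x86 hardware the Float16 multiply does not beat the Float32 one, because it doesn't hit native half-precision units:

```julia
using LinearAlgebra

# Two moderately sized matrices in each precision.
A32 = rand(Float32, 512, 512); B32 = rand(Float32, 512, 512)
A16 = Float16.(A32);           B16 = Float16.(B32)

# Warm up so we don't time compilation.
A32 * B32; A16 * B16

t32 = @elapsed A32 * B32   # dispatches to optimized Float32 BLAS (sgemm)
t16 = @elapsed A16 * B16   # no native-FP16 BLAS path on most x86 CPUs,
                           # so this is typically no faster (often slower)
println("Float32: $(t32)s, Float16: $(t16)s")
```

On a machine without hardware Float16 you should see the Float16 time at best comparable to the Float32 time, never the ~2x win you might hope for from the halved element size.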
