Julia currently doesn’t use native Float16 operations even when the hardware supports them; Float16 values are converted to Float32 for arithmetic and back afterwards, which adds conversion overhead. If you’re memory bound you might see some improvement from the smaller footprint, but I doubt it. In addition, I believe very few x86 chips have native Float16 arithmetic, so your mileage may vary. See [Hardware Float16 on A64fx · Issue #40216 · JuliaLang/julia · GitHub](https://github.com/JuliaLang/julia/issues/40216).
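You can see the demotion for yourself by inspecting the IR Julia generates for a trivial Float16 operation (nothing here is specific to your code):

```julia
# A trivial Float16 operation.
f(x, y) = x + y

# On hardware without native Float16 support (most x86 chips today),
# the printed IR contains `fpext ... half ... to float` and
# `fptrunc ... to half` pairs around the actual `fadd`, i.e. the
# arithmetic is really done in Float32.
@code_llvm debuginfo=:none f(Float16(1), Float16(2))
```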
If your hardware does have native Float16 operations, you could try building Julia from source with the line `PM->add(createDemoteFloat16Pass());` in `src/aotcompile.cpp` commented out, to check whether skipping the demotion pass helps. If it does, there might be up to a 2x improvement.
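To check whether the rebuild actually helped, a rough before/after benchmark along these lines should do (a minimal sketch using BenchmarkTools; the array size and operation are arbitrary):

```julia
using BenchmarkTools

# Compare elementwise arithmetic on Float16 vs Float32 arrays.
# With the demotion pass in place, Float16 tends to be slower than
# Float32 because every op round-trips through Float32; after the
# patch (on hardware with native Float16) the gap should shrink.
n = 10_000
a16, b16 = rand(Float16, n), rand(Float16, n)
a32, b32 = rand(Float32, n), rand(Float32, n)

@btime $a16 .* $b16 .+ $a16;
@btime $a32 .* $b32 .+ $a32;
```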