Slowdown due to subnormal floats, coming from neural net training

I was not planning to compare anything, except to check that this is effectively the default behavior in another language I develop in (please let us not give fuel to yet another pointless “fairness” benchmark discussion). Of course we can tune that with compiler flags.
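For context, a minimal illustration (just a sketch, not the original kernel) of where the subnormals come from: values that decay below floatmin(Float32), roughly 1.2e-38, as can happen with vanishing activations or weights during training, become subnormal, and arithmetic on them is much slower on typical x86 hardware.

x = floatmin(Float32)    # smallest normal Float32, ≈ 1.1754944f-38
y = x / 1000             # below the normal range: subnormal, but still nonzero
issubnormal(y)           # true
2.0f0 * y                # operations touching subnormals can be ~100x slower on x86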

Edit, for the record:

julia> @btime subnormal_fortran!($C,$A2,$B,$b); # no opt flags
  116.997 μs (0 allocations: 0 bytes)

julia> @btime subnormal_fortranmarchnative!($C,$A2,$B,$b); # -march=native
  117.876 μs (0 allocations: 0 bytes)

julia> @btime subnormal_fortranO3!($C,$A2,$B,$b); # -O3
  26.371 μs (0 allocations: 0 bytes)

julia> @btime subnormal_fortranO3marchnative!($C,$A2,$B,$b); # -O3 -march=native
  13.739 μs (0 allocations: 0 bytes)

julia> @btime subnormal_fortranOfast!($C,$A2,$B,$b); # -Ofast
  753.076 ns (0 allocations: 0 bytes)

julia> @btime subnormal_fortranOfastmarchnative!($C,$A2,$B,$b); # -Ofast -march=native
  341.280 ns (0 allocations: 0 bytes)
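As an aside, the same effect can be had from the Julia side without recompiling: set_zero_subnormals(true) sets the FTZ/DAZ flags for the calling thread, which is essentially what -Ofast / fast-math arranges, and it also applies to Fortran code called from that thread. A sketch, reusing the arrays and the unoptimised wrapper from above:

using BenchmarkTools
set_zero_subnormals(true)                     # returns true if the CPU supports FTZ/DAZ
@btime subnormal_fortran!($C, $A2, $B, $b)    # should now take the fast path
set_zero_subnormals(false)                    # restore strict IEEE subnormal handling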