I’ve recently tagged v0.3 of Quadmath.jl: this provides a Float128 type which implements IEEE 128-bit floating-point numbers, and runs on Windows, Linux and macOS. These offer higher precision than the hardware-supported floats (Float32/Float64), but should be faster than arbitrary-precision BigFloats.
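For anyone new to the package, a minimal usage sketch (assuming Quadmath.jl is installed; the values here are just illustrative):

```julia
using Quadmath

x = Float128(2)           # convert from an Int (or a Float64)
y = sqrt(x)               # the usual math functions are defined for Float128
println(eps(Float128))    # much smaller than eps(Float64) ≈ 2.2e-16
```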
17 Likes
Could you give a comparison to DoubleFloats?
1 Like
It should have marginally better precision and a much greater exponent range than DoubleFloats. I suspect the performance would be worse, but I haven’t compared them.
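Both claims are easy to check at the REPL (a quick sketch, assuming both packages are installed):

```julia
using Quadmath, DoubleFloats

# Precision: Float128 has 113 significand bits; Double64 carries ~106 (2×53)
eps(Float128)       # ≈ 1.9e-34
eps(Double64)       # ≈ 4.9e-32

# Exponent range: Double64 inherits Float64's range; Float128 goes far beyond it
floatmax(Float128)  # ≈ 1.2e4932
floatmax(Double64)  # ≈ 1.8e308
```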
2 Likes
The second image in the LLLplus.jl README shows a speed comparison of LLL decomposition on 128x128 matrices of Float128 and Double64 (from DoubleFloats.jl). In this case they have similar speed.
2 Likes
For +, *, exp, and sin, using @btime +(Ref($x)[], Ref($y)[]) etc., I get DoubleFloats faster by ratios of 3.8x, 6.8x, 1.7x, and 1.6x [on just one machine]. Of course, Quadmath offers roughly 32 more bits of precision and a much larger exponent range.
For multiplying 128x128 matrices, the ratio is 3.7x (@btime Float128 / @btime Double64).
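For reference, the Ref-interpolation pattern above spelled out in full (a sketch; the literal operand values are arbitrary):

```julia
using BenchmarkTools, Quadmath, DoubleFloats

x128, y128 = Float128(1.1), Float128(2.2)
xd, yd = Double64(1.1), Double64(2.2)

# Ref($x)[] keeps the compiler from constant-folding the operands away,
# so @btime measures the operation itself rather than a precomputed result
@btime +(Ref($x128)[], Ref($y128)[])
@btime +(Ref($xd)[], Ref($yd)[])
```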
2 Likes
With single-threaded SIMD, one can get about a 13x Float128/Double64 ratio for 128x128 GEMM. (You should be confident of avoiding overflow before using this for real work.) Proof of concept at DoubleBLAS.jl, with multithreaded versions approaching a 30x Double64/Float64 ratio.
3 Likes
It looks to me like DoubleFloats.jl has been updated and is much faster now on many linear algebra tasks like qr:
using LinearAlgebra
using DoubleFloats
using Quadmath
using BenchmarkTools
N = 50;
F64 = randn(N,N);
F128 = Float128.(F64);
D64 = Double64.(F64);
Big = BigFloat.(F64);
@belapsed qr($F128) # 0.00342
@belapsed qr($D64) # 0.000920
@belapsed qr($Big) # 0.0103
2 Likes