[ANN] Quadmath.jl

I’ve recently tagged v0.3 of Quadmath.jl: it provides a Float128 type implementing IEEE 128-bit floating-point numbers, and it runs on Windows, Linux, and macOS.

These offer higher precision than the hardware-supported float types (Float32/Float64), but should be faster than arbitrary-precision BigFloat.
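
For a quick taste, here is a minimal sketch of usage (the values in the comments are the standard IEEE binary128 characteristics):

using Quadmath

x = Float128(1) / 3      # quad-precision arithmetic
eps(Float128)            # ~1.93e-34 (2^-112); compare eps(Float64) ~2.2e-16
floatmax(Float128)       # ~1.19e4932, far beyond Float64's ~1.8e308
sqrt(Float128(2))        # math functions are forwarded to libquadmath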


Could you give a comparison to DoubleFloats?


It should have marginally better precision (113 significand bits vs. ~106 for Double64) and a much greater exponent range than DoubleFloats, since Double64 is limited to Float64’s exponent. I suspect the performance would be worse, but I haven’t compared them.
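
Concretely, assuming both packages implement Base.precision and floatmax (I believe they do):

using Quadmath, DoubleFloats

precision(Float128)   # 113 significand bits (IEEE binary128)
precision(Double64)   # 106 bits (two stacked Float64 significands)
floatmax(Float128)    # ~1.19e4932
floatmax(Double64)    # ~1.8e308; Double64 inherits Float64's exponent range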


The second image in the LLLplus.jl README shows a speed comparison of LLL lattice reduction on 128x128 matrices of Float128 and Double64 (from DoubleFloats.jl). In that case they have similar speed.
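
If you want to try that comparison yourself, something like the following should work, assuming lll is the reduction entry point exported by LLLplus.jl:

using LLLplus, Quadmath, DoubleFloats

H = randn(128, 128)
@time lll(Float128.(H))   # LLL lattice reduction in quad precision
@time lll(Double64.(H))   # same reduction in double-double precision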


For +, *, exp, and sin, using @btime +(Ref($x)[], Ref($y)[]) and the like, I get DoubleFloats faster by ratios of 3.8x, 6.8x, 1.7x, and 1.6x respectively (on just one machine). Of course, Quadmath offers 32-ish bits more precision and a much wider exponent range.
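
(The Ref($x)[] wrapper keeps the compiler from constant-folding the operation away, so @btime measures the operation itself. A sketch of the setup, with arbitrary sample values:)

using BenchmarkTools, Quadmath, DoubleFloats

x128, y128 = Float128(1.25), Float128(9.75)
xd, yd = Double64(1.25), Double64(9.75)

@btime +(Ref($x128)[], Ref($y128)[])   # Float128 addition
@btime +(Ref($xd)[], Ref($yd)[])       # Double64 addition
@btime sin(Ref($x128)[])
@btime sin(Ref($xd)[])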

For multiplying 128x128 matrices, the ratio is 3.7x (@btime for Float128 / @btime for Double64).
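
The matrix-multiply comparison is easy to reproduce; a sketch:

using BenchmarkTools, Quadmath, DoubleFloats

A = randn(128, 128); B = randn(128, 128)
A128, B128 = Float128.(A), Float128.(B)
Ad, Bd = Double64.(A), Double64.(B)

@btime $A128 * $B128   # generic matmul in quad precision
@btime $Ad * $Bd       # generic matmul in double-double precision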


With single-threaded SIMD, one can get the Float128/Double64 ratio to about 13x for 128x128 GEMM. (You should be confident of avoiding overflow before using this for real work.) There is a proof of concept at DoubleBLAS.jl, with multithreaded versions approaching a 30x Double64/Float64 ratio.


It looks to me like DoubleFloats.jl has been updated and is now much faster on many linear algebra tasks like qr:

using LinearAlgebra
using DoubleFloats
using Quadmath
using BenchmarkTools
N = 50;
F64 = randn(N, N);        # Float64 source matrix
F128 = Float128.(F64);    # quad-precision copy
D64 = Double64.(F64);     # double-double copy
Big = BigFloat.(F64);     # arbitrary-precision copy
@belapsed qr($F128)  # 0.00342 s
@belapsed qr($D64)   # 0.000920 s
@belapsed qr($Big)   # 0.0103 s