I’ve recently tagged v0.3 of Quadmath.jl: this provides a `Float128` type which implements IEEE 128-bit floating-point numbers, and runs on Windows, Linux, and macOS.

These offer higher precision than the hardware-supported floats (`Float32`/`Float64`), but should be faster than arbitrary-precision `BigFloat`s.
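For a quick feel of what the type offers, a minimal sketch (assuming Quadmath.jl is installed):

```julia
using Quadmath

# Construct from an integer to avoid rounding through Float64 first
x = Float128(1) / Float128(3)

# IEEE binary128: 113-bit significand, vs 53 bits for Float64
println(precision(Float128))

# Much wider exponent range: floatmax(Float64) is only ≈ 1.8e308
println(floatmax(Float128))
```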

17 Likes

Could you give a comparison to `DoubleFloats`?

1 Like

It should have marginally better precision and a much greater exponent range than `DoubleFloats`. I suspect the performance would be worse, but I haven’t compared them.
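The precision and range difference can be checked directly; a sketch comparing the two types (assuming both packages are installed):

```julia
using Quadmath, DoubleFloats

# Float128 carries a 113-bit significand; Double64 has roughly 106 bits,
# so Float128's machine epsilon is smaller.
println(eps(Float128))
println(eps(Double64))

# Double64 is a pair of Float64s, so it inherits Float64's exponent range
# (floatmax ≈ 1.8e308); Float128 has a 15-bit exponent and reaches ≈ 1e4932.
println(floatmax(Float128))
println(floatmax(Double64))
```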

2 Likes

The second image in the LLLplus.jl README shows a speed comparison of LLL decomposition of 128×128 matrices of `Float128` and `Double64` (from DoubleFloats.jl). In this case they have similar speed.

2 Likes

For `+`, `*`, `exp`, and `sin`, using `@btime +(Ref($x)[], Ref($y)[])` etc., I get DoubleFloats faster by ratios of 3.8×, 6.8×, 1.7×, and 1.6× (on just one machine). Of course, Quadmath offers roughly 32 more bits of precision and a much larger exponent range.

For multiplying 128×128 matrices, the ratio is 3.7× (`@btime` Float128 / `@btime` Double64).
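A sketch of the kind of comparison described above (timings will vary by machine; the `Ref($x)[]` pattern defeats constant propagation so that the operation itself is what gets timed):

```julia
using BenchmarkTools, Quadmath, DoubleFloats

x128, y128 = Float128(1.1), Float128(2.3)
xd, yd = Double64(1.1), Double64(2.3)

# Scalar addition, shielded from constant folding via Ref
@btime +(Ref($x128)[], Ref($y128)[])
@btime +(Ref($xd)[], Ref($yd)[])

# 128×128 matrix multiplication, both converted from the same Float64 data
A = randn(128, 128)
A128 = Float128.(A)
Ad = Double64.(A)
@btime $A128 * $A128
@btime $Ad * $Ad
```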

2 Likes

With single-threaded SIMD, one can get a ratio of about 13× (Float128/Double64) for 128×128 GEMM. (You should be confident of avoiding overflow before using this for real work.) There is a proof of concept at DoubleBLAS.jl, with multithreaded versions approaching 30× (Double64/Float64).

3 Likes

It looks to me like DoubleFloats.jl has been updated and is now much faster on many linear algebra tasks like `qr`:

```julia
using LinearAlgebra
using DoubleFloats
using Quadmath
using BenchmarkTools

N = 50;
F64  = randn(N, N);
F128 = Float128.(F64);
D64  = Double64.(F64);
Big  = BigFloat.(F64);

@belapsed qr($F128) # 0.00342
@belapsed qr($D64)  # 0.000920
@belapsed qr($Big)  # 0.0103
```

2 Likes