UInt128 to Float64 conversion

I came across an interesting blog post where the author explains how they implemented their own unsigned 128-bit integer to 64-bit float conversion in Rust because they didn't think one existed, and when they realised it did exist, their implementation turned out to be almost twice as fast.

Curious as one is, I had a look at the Julia implementation and saw that it has some branches and other things that suggested room for improvement. So here are Julia's default version, the branchy version from the blog, and the optimized version from the blog (pretty much, with some minor differences).
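For context, the behaviour being reimplemented is Julia's built-in `Float64(::UInt128)`: a `UInt128` can carry up to 128 significant bits, but a `Float64` mantissa holds only 53, so large values must be rounded to nearest, with ties going to even. A quick sketch of what that means (values chosen by me for illustration):

```julia
x = UInt128(1) << 64                 # 2^64 is exactly representable as a Float64
Float64(x) == 2.0^64                 # exact conversion

# 2^64 + 1 is not representable; the ulp at 2^64 is 2^12,
# so it rounds to the nearest Float64, which is 2^64 itself.
Float64(x + 1) == 2.0^64

# 2^64 + 2^11 is exactly halfway between 2^64 and 2^64 + 2^12;
# ties go to the neighbour with an even mantissa, here 2^64.
Float64(x + (UInt128(1) << 11)) == 2.0^64
```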

Code
function default(x::UInt128) # Default julia version
    x == 0 && return 0.0
    n = 128-leading_zeros(x) # ndigits0z(x,2)
    if n <= 53
        y = ((x % UInt64) << (53-n)) & 0x000f_ffff_ffff_ffff
    else
        y = ((x >> (n-54)) % UInt64) & 0x001f_ffff_ffff_ffff # keep 1 extra bit
        y = (y+1)>>1 # round, ties up (extra leading bit in case of next exponent)
        y &= ~UInt64(trailing_zeros(x) == (n-54)) # fix last bit to round to even
    end
    d = ((n+1022) % UInt64) << 52
    reinterpret(Float64, d + y)
end

function simple(x::UInt128) # Branchy version from blog
    if x == 0
        return 0
    end
    n = leading_zeros(x)
    y = x << n
    exponent = (127 - n) + 1023
    mantissa = y << 1 >> 76
    dropped_bits = y << 53
    if dropped_bits > 1 << 127 || (dropped_bits == 1 << 127 && mantissa & 1 == 1)
        mantissa += 1
    end
    reinterpret(Float64, UInt64(exponent) << 52 + UInt64(mantissa))
end

function fast(x::UInt128) # Branchless version from blog (the remaining branch is optimized away)
    n = leading_zeros(x)
    y = x << n  # Don't need bitrotate as in the Rust version
    mantissa = (y >> 75) % UInt64
    dropped_bits = (y >> 11 | y & 0xffff_ffff) % UInt64
    mantissa += (dropped_bits - (dropped_bits >> 63 & (1 - (mantissa & 1)))) >> 63
    exponent = x == 0 ? zero(UInt64) : (1149 - n) % UInt64
    reinterpret(Float64, exponent << 52 + mantissa)
end
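For anyone puzzling over the branchless line in `fast`: `dropped_bits` packs the guard bit (the first bit below the mantissa) into bit 63 and ORs the remaining discarded bits in below it as sticky bits, and the increment expression then does round-to-nearest, ties-to-even without a branch. A sketch of how I read that trick, isolated for clarity (names are mine, not from the blog):

```julia
# `d` has the guard bit in bit 63; any other dropped bit is ORed in below it.
function round_increment(mantissa::UInt64, d::UInt64)
    # If the mantissa is even, subtract the guard bit from d first: an exact
    # tie (d == 0x8000_0000_0000_0000) becomes 0x7fff_..., whose top bit is 0,
    # so no increment (the tie stays on the even mantissa). If the mantissa is
    # odd, d is left alone and a set guard bit (tie or above) rounds it up to
    # the even neighbour.
    (d - (d >> 63 & (1 - (mantissa & 1)))) >> 63
end

round_increment(UInt64(4), UInt64(1) << 63)        # exact tie, even mantissa -> 0
round_increment(UInt64(5), UInt64(1) << 63)        # exact tie, odd mantissa  -> 1
round_increment(UInt64(4), (UInt64(1) << 63) + 1)  # above halfway            -> 1
round_increment(UInt64(4), UInt64(1) << 62)        # below halfway            -> 0
```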
Benchmarks
julia> @benchmark default(q) setup=(q=rand(UInt128))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  5.030 ns … 38.663 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     5.516 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   5.835 ns Β±  0.929 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

      β–ˆ                                    ▁
  β–†β–ƒβ–β–‚β–ˆβ–„β–‚β–ƒβ–„β–„β–β–„β–ƒβ–„β–‚β–‚β–‚β–‚β–„β–‚β–‚β–‚β–‚β–ƒβ–‚β–‚β–ƒβ–‚β–ƒβ–ƒβ–β–‚β–‚β–β–ƒβ–ƒβ–‚β–‚β–‚β–β–ƒβ–ˆβ–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚β–ƒβ–ƒβ–‚β–‚β–‚β–‚β–‚β–‚ β–ƒ
  5.03 ns        Histogram: frequency by time        7.43 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark simple(q) setup=(q=rand(UInt128))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.587 ns … 37.554 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     3.727 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.075 ns Β±  0.779 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ˆ β–†  β–‚ β–‚β–„ ▁▅  β–ƒ  β–„β–‚  β–‡β–‚                                    ▁
  β–ˆβ–ƒβ–ˆβ–‡β–‚β–ˆβ–‚β–ˆβ–ˆβ–‚β–ˆβ–ˆβ–„β–…β–ˆβ–…β–„β–ˆβ–ˆβ–…β–…β–ˆβ–ˆβ–…β–…β–„β–…β–†β–…β–…β–…β–…β–…β–†β–…β–…β–…β–…β–…β–„β–…β–…β–…β–…β–…β–†β–„β–…β–…β–†β–…β–…β–…β–„β–„β–…β–…β–… β–ˆ
  3.59 ns      Histogram: log(frequency) by time     6.46 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark fast(q) setup=(q=rand(UInt128))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  3.793 ns … 37.724 ns  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     4.094 ns              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   4.376 ns Β±  0.782 ns  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

        β–ˆ
  β–‚β–„β–‚β–β–‚β–†β–ˆβ–‚β–‚β–‚β–ƒβ–…β–‚β–β–‚β–‚β–ƒβ–„β–β–‚β–‚β–‚β–ƒβ–„β–‚β–‚β–‚β–‚β–ƒβ–„β–‚β–‚β–‚β–β–‚β–‚β–ƒβ–‚β–β–β–β–‚β–‚β–„β–ƒβ–β–‚β–β–‚β–‚β–ƒβ–‡β–„β–β–‚β–‚β–β–‚ β–ƒ
  3.79 ns        Histogram: frequency by time        5.27 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

So both versions are faster, though for some reason the simple one is faster than the optimized one. And the simple version also seems to be incorrect (or at least gives different answers from the others; I assume the current Julia one is correct). Maybe that is connected to why it is fast, though I can't see what is wrong.

julia> xs = rand(UInt128, 100000);

julia> all(Float64.(xs) .== simple.(xs))
false

julia> all(Float64.(xs) .== fast.(xs))
true
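Not a full diagnosis, but one likely culprit in `simple` (my guess; I haven't compared against the blog's Rust): the literal `1` in `1 << 127` is a 64-bit `Int` in Julia, and Julia defines overshifts to return zero rather than wrap, so both comparisons in the rounding branch collapse and it effectively rounds up whenever `dropped_bits > 0` instead of doing ties-to-even. (The `return 0` also returns an `Int` rather than `0.0`.) A quick check of the overshift behaviour:

```julia
# Shifting a 64-bit Int by >= 64 bits yields zero in Julia (no UB as in C),
# so the literal `1 << 127` used in `simple` is 0, not 2^127:
1 << 127              # 0 (Int64 overshift)
UInt128(1) << 127     # 0x80000000000000000000000000000000, i.e. 2^127

# The guard in `simple` therefore reads as `dropped_bits > 0`, i.e. round up
# on any dropped bits, instead of round-to-nearest with ties-to-even.
```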

I was wondering: is this something that would be interesting for Julia core, or should it rather start in a package?

I think there is more than the UInt128-to-Float64 conversion that could be interesting to look at, too; it seems this and a few other int-to-float conversions of different sizes made it into Rust in a PR by the same person:
https://github.com/rust-lang/compiler-builtins/pull/464/files


This looks like a totally reasonable implementation to me. I would not object to a PR.
That said, we should probably go with the version presented here https://twitter.com/m_ou_se/status/1524912035653468161 as it is apparently about 2x faster.


Okay, I will give it a go then, and I’ll check out the other one you found.
