Is there a way to reinterpret a Float64 to 2 Int32?

I’m manually vectorizing some code (for computing exp), and I’m looking for a way to view the memory of a Float64 as two Int32s (if it helps, I only need the lower one). Is there a way to do this? I want Int32 rather than Int64 because I want the integer operations to work with AVX2.

julia> x = 3.0

julia> reinterpret(Int32, [x])
2-element reinterpret(Int32, ::Array{Float64,1}):
julia> x = rand()

julia> bitstring(x)

julia> reinterpret(Int, x) % Int32

julia> bitstring(ans)

Similarly, you can:

julia> using VectorizationBase, SIMDPirates

julia> sx = SVec(ntuple(VectorizationBase.pick_vector_width_val(Float64)) do _ Core.VecElement(10randn(Float64)) end)
SVec{8,Float64}<11.591414148467251, -16.471476122659084, -5.030081855765106, -8.60206169897962, -17.824076876082163, -12.363521352234141, 1.0686657131481578, 7.031664824493625>

julia> reinterpret(SVec{length(sx),Int}, sx) % Int32
SVec{8,Int32}<-707518984, -1463834008, -953492670, 1851168085, -1279252055, 2021555350, 939867877, -1100576536>

AVX2 doesn’t have SIMD instructions for:

  • Int64 multiplication
  • Int64 → Float64 conversion
  • Float64 → Int64 conversion

So you could use Int32 if it helps you avoid these.
But reinterpreting between SIMD vectors of Int64 and Float64 is perfectly fine, because it doesn’t require any operations at all.
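Since the question is about exp, here is a minimal sketch of the kind of Int64 work that stays cheap: building a power of two by writing the biased exponent straight into the Float64 bit pattern. It only needs a 64-bit integer add and shift (which, as far as I know, AVX2 does provide) plus a reinterpret. The name `pow2` is hypothetical, and the sketch assumes `-1022 <= k <= 1023` so the result is a normal number.

```julia
# Hypothetical helper: 2.0^k for -1022 <= k <= 1023 (normal range only, no checks).
# The Float64 exponent field sits in bits 52-62 and is biased by 1023.
pow2(k::Int64) = reinterpret(Float64, (k + 1023) << 52)

@assert pow2(0) === 1.0
@assert pow2(3) === 8.0
@assert pow2(-1) === 0.5
```

None of this needs Int64 multiplication or Int64 ↔ Float64 conversion, so staying in Int64 should be fine for this step.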


This is one way, but I doubt it’s the best way, and it’s possibly buggy. I extended Elrod’s code as follows (his SIMD code doesn’t have the shift issue, but it currently ties you to x86; my code should work everywhere):

other_half(x) = (reinterpret(Int, x) >> 32) % Int32

julia> @code_native other_half(x) # OK, I doubt the shift can be avoided.
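A quick sanity check of the two halves, as a sketch: split a Float64 into its upper and lower Int32 and reassemble it. The names `lower_half`, `upper_half` and `join_halves` are made up for this illustration.

```julia
# Hypothetical helper names, used only for this illustration.
lower_half(x::Float64) = reinterpret(Int64, x) % Int32          # truncation keeps bits 0-31
upper_half(x::Float64) = (reinterpret(Int64, x) >> 32) % Int32  # shift down, then keep bits 32-63

# Reassemble the Float64 from the two halves.
function join_halves(hi::Int32, lo::Int32)
    bits = (UInt64(reinterpret(UInt32, hi)) << 32) | UInt64(reinterpret(UInt32, lo))
    return reinterpret(Float64, bits)
end

x = -2.5
@assert join_halves(upper_half(x), lower_half(x)) === x
# The halves line up with the bit pattern of x, upper half first.
@assert bitstring(upper_half(x)) * bitstring(lower_half(x)) == bitstring(x)
```

The arithmetic shift on the signed Int64 is harmless here, because `% Int32` throws away the sign-extended bits again.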

This might be alternative code to consider:
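julia> f(x) = ((reinterpret(UInt64, x) & (UInt64(typemax(UInt32)) << 32)) >> 32) % Int32 # the mask must be widened to UInt64 and parenthesized (<< binds tighter than &), and without the final shift % Int32 would always give 0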

I have two concerns with your code. a) Does [x] assume x is in memory? I guess the compiler is careful either way, but if your value is in a register, wouldn’t this force it to be stored in memory? b) It seems to depend on endianness: the order you get is for little-endian, and it could be reversed on some theoretical (not yet supported) big-endian platform.
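To make concern b) concrete, here is a small sketch comparing the array-based reinterpret with the shift-based halves. It assumes `Base.ENDIAN_BOM` is `0x04030201` on little-endian hosts, which is how I understand it to be documented; the variable names are just for illustration.

```julia
x = 3.0

# Array-based view: the element order follows the host's memory layout.
halves = reinterpret(Int32, [x])

# Integer-based halves: defined by arithmetic, so independent of endianness.
lo = reinterpret(Int64, x) % Int32
hi = (reinterpret(Int64, x) >> 32) % Int32

if Base.ENDIAN_BOM == 0x04030201      # little-endian host
    @assert halves[1] == lo && halves[2] == hi
else                                   # big-endian host (order swapped)
    @assert halves[1] == hi && halves[2] == lo
end
```

So code that indexes into the reinterpreted array bakes in the byte order, while the shift-and-truncate version does not.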

I used @code_native on your code, and it seems longer than I would want, and @code_lowered is way longer.

Similarly, for splitting a Float32:

julia> both_half(x::Float32) = ((reinterpret(UInt32, x) >> 16) % Int16, reinterpret(UInt32, x) % Int16) # both halves now returned as Int16

Julia still emits a shift here, by 16. I had hoped it would be avoidable, but as far as I can tell x86 only gives direct sub-register access to the low 16 bits (e.g. AX) and to bits 8-15 (AH) of a general-purpose register, not to the upper 16 bits of a 32-bit register (nor to the upper 32 bits of a 64-bit register), so the shift remains even at -O3. [I don’t see a way around it, except maybe by injecting LLVM code.] I’m assuming the general-purpose x86 registers here, not SIMD registers. I don’t recall the details, but SIMD instructions might avoid generating a shift on x86 (if Julia’s code generator tried to exploit them), even for the 32-bit halves, and probably on ARM too; without SIMD, ARM would need a shift.
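For completeness, a sketch of the Float32 split with a round trip, using unsigned halves so the bit patterns are easy to read. `split16` and `join16` are hypothetical names, not part of any library.

```julia
# Hypothetical helpers: split a Float32 into (upper, lower) 16-bit halves and back.
split16(x::Float32) = ((reinterpret(UInt32, x) >> 16) % UInt16, reinterpret(UInt32, x) % UInt16)
join16(hi::UInt16, lo::UInt16) = reinterpret(Float32, (UInt32(hi) << 16) | UInt32(lo))

hi, lo = split16(1.0f0)              # 1.0f0 has the bit pattern 0x3f800000
@assert (hi, lo) == (0x3f80, 0x0000)
@assert join16(hi, lo) === 1.0f0
```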
