Bit Constraints for floats and integers

Sometimes you want to parameterize a function so that it can dispatch on 64-, 32-, 16-bit, or other numeric widths rather than just 64-bit numbers. The problem is that using a single parametric type variable, for example T = Float32, may still allow silent promotion to 64-bit computation (on a native 64-bit system). For instance, in a loop for i in 1:n the type of i defaults to Int64, and using i in the calculations within the loop can promote intermediate results. An example where this issue occurs is in the thread Performance comparison with C++, where it was one of the reasons the code was slow. I ran into this issue myself some time ago but (thankfully) realised what the problem was pretty quickly. One approach is to parameterize the integer and float types separately in the function signature, but even if you add an integer type parameter, there is no guarantee that the integer and float widths match.
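A minimal sketch of the pitfall (the function name is my own, for illustration): dividing two Int64 values yields a Float64, so a Float32 accumulator silently widens inside the loop.

```julia
# Hypothetical example of accidental promotion to 64 bits.
function weighted_sum(x::Vector{Float32})
    n = length(x)            # n is Int64 on a 64-bit system
    s = zero(Float32)
    for i in 1:n             # i defaults to Int64
        s += x[i] * (i / n)  # Int64 / Int64 yields Float64, promoting s
    end
    return s                 # now Float64, not the Float32 we wanted
end
```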

The code below can help. It still requires the user to annotate the types of the numbers in the for loop (in this case for i in I(1):I(n)), but it is a tool for making the function do what you want while keeping the generic functionality, so that it can be run on any valid pair of bit widths.

# Types
abstract type AbstractBit{F <: AbstractFloat, I <: Integer} end
struct Bit64Number{F <: Float64, I <: Int64} <: AbstractBit{F, I} end
struct Bit32Number{F <: Float32, I <: Int32} <: AbstractBit{F, I} end
struct Bit16Number{F <: Float16, I <: Int16} <: AbstractBit{F, I} end

# Alias
const Bit64 = Bit64Number{Float64, Int64}
const Bit32 = Bit32Number{Float32, Int32}
const Bit16 = Bit16Number{Float16, Int16}

# Test function
function myFun(::Type{<: AbstractBit{F, I}}, x::F, y::I) where {F, I}
  println("The floating point number x: $x has type $F, and the integer number y: $y has type $I")
  return (x, y)
end

# Same types work fine
myFun(Bit64, 3.142, 42)
myFun(Bit32, Float32(3.142), Int32(42))
myFun(Bit16, Float16(3.142), Int16(42))

# Mixed types fail (MethodError), as expected
myFun(Bit32, Float32(3.142), 42)
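As a usage sketch of the motif (the mean_of_squares name is my invention), here is how the extracted F and I keep every intermediate at the requested width; the type definitions from above are repeated so the snippet runs standalone.

```julia
# Repeats the definitions from the post so this sketch is self-contained.
abstract type AbstractBit{F <: AbstractFloat, I <: Integer} end
struct Bit32Number{F <: Float32, I <: Int32} <: AbstractBit{F, I} end
const Bit32 = Bit32Number{Float32, Int32}

# Hypothetical library function: the loop index and the accumulator both
# use the widths carried by the AbstractBit parameter.
function mean_of_squares(::Type{<:AbstractBit{F, I}}, x::Vector{F}) where {F, I}
    n = I(length(x))
    s = zero(F)
    for i in I(1):n
        s += x[i] * x[i]   # stays in F throughout
    end
    return s / n           # F / I promotes to F, not Float64
end
```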

I am not quite sure whether this is a question or a proposal, but something similar can be done quite simply with, e.g.,

function myFun2(x::F, y::I) where {F <: AbstractFloat, I <: Integer}
    @assert sizeof(F) == sizeof(I)
    println("The floating point number x: $x has type $F, and the integer number y: $y has type $I")
    return (x, y)
end

It’s a proposal, meant to spark discussion of things I didn’t think of; I hoped the approach would be useful. Apologies for not being clear enough.

Your suggestion is of course perfectly valid. However, my solution is aimed at a situation where you are perhaps writing a library and need this motif in lots of places, so that you don’t have to keep repeating the same @assert constraint and type information over and over; you simply get the types and use them, giving a single point of control.

Mine might be more performant, because it doesn’t need to evaluate the @assert. Under @code_warntype your solution has more instructions, but they seem to be compiler constants, and the assertion error is flagged ahead of time, so performance might be similar :man_shrugging:

Contrary to popular misconception, the floating-point precision has nothing to do with whether you have 32- or 64-bit addresses. 32-bit machines have always been able to do native (hardware) Float64 arithmetic.

Because floating-point literals in Julia are a specific precision (Float64), a little care is required to write floating-point code that uses the precision of the function arguments. It would be nice to have more documentation on these techniques.
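One such technique (the function name here is hypothetical): convert literals through the type parameter, and use one/zero, so the computation stays at the argument's precision.

```julia
# Hypothetical sketch: keep literals at the argument's precision.
function damped(x::T) where {T <: AbstractFloat}
    c = T(0.5)              # convert the Float64 literal once
    return c * x + one(T)   # one(T) instead of the literal 1.0
end
```

With this, damped(Float16(2.0)) returns a Float16, whereas writing 0.5 * x + 1.0 would promote the result to Float64.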


The compiler should elide it for concrete types.