Hey there,
I see from the manual that different rounding modes for floats exist. However, I was wondering whether there is a smart way to set or emulate a rounding mode that is similar to RoundNearest, but without overflow or underflow: a nonzero result smaller in magnitude than the smallest representable number is never rounded to zero, and a result beyond the largest finite number is never rounded to infinity. At the moment we have
julia> a = prevfloat(typemax(Float16))
Float16(6.55e4)
julia> a+a
Inf16
causing an overflow and
julia> b = nextfloat(zero(Float16))
Float16(6.0e-8)
julia> b/2
Float16(0.0)
causing underflow. The same happens for Float32, Float64 and negative numbers. So in the first example I would like a+a to yield a, and in the second b/2 to yield b. The motivation is that 2a is closer to a than to infinity … Do you know whether there is any way to set this behaviour?
You could create your own type which wraps a float and then implement the basic operations yourself, using whatever rounding scheme you like. Something like:
struct MyFloat{F <: AbstractFloat} <: Number
    value::F
end
Base.:+(f1::MyFloat, f2::MyFloat) = MyFloat(your_particular_rounding_scheme(f1.value + f2.value))
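To make the idea a bit more concrete, here is a minimal sketch of one possible rounding scheme for the two cases in the question. The names SatFloat and saturate are made up for illustration and do not come from Base or any package. The sketch assumes it is enough to compute with the ordinary round-to-nearest operation and then patch the result: an infinite result becomes the largest finite value of the same sign, and a zero result that is known not to be an exact zero becomes the smallest subnormal of the same sign. NaN is passed through unchanged, which is also just an assumption about what you would want.

```julia
# Hypothetical wrapper type; the name is illustrative only.
struct SatFloat{F<:AbstractFloat} <: Number
    value::F
end

# Patch a round-to-nearest result `x` so it stays finite and, when the exact
# result is known to be nonzero (`nonzero == true`), does not become zero.
function saturate(x::F, nonzero::Bool) where {F<:AbstractFloat}
    isnan(x) && return x                        # assumption: leave NaN alone
    if isinf(x)
        return copysign(floatmax(F), x)         # overflow -> largest finite value
    elseif iszero(x) && nonzero
        return copysign(nextfloat(zero(F)), x)  # underflow -> smallest subnormal
    end
    return x
end

# A zero sum is always exact (gradual underflow), so only overflow is patched.
Base.:+(x::SatFloat{F}, y::SatFloat{F}) where {F<:AbstractFloat} =
    SatFloat(saturate(x.value + y.value, false))

# A product is exactly zero only if one of the factors is zero.
Base.:*(x::SatFloat{F}, y::SatFloat{F}) where {F<:AbstractFloat} =
    SatFloat(saturate(x.value * y.value, !iszero(x.value) && !iszero(y.value)))

# A quotient is exactly zero if the numerator is zero (or the divisor infinite).
# Note that x/0 saturates to the largest finite value here, a design choice.
Base.:/(x::SatFloat{F}, y::SatFloat{F}) where {F<:AbstractFloat} =
    SatFloat(saturate(x.value / y.value, !iszero(x.value) && isfinite(y.value)))

# The two examples from the question:
a = SatFloat(prevfloat(typemax(Float16)))
b = SatFloat(nextfloat(zero(Float16)))
a + a == a                       # overflow case
b / SatFloat(Float16(2)) == b    # underflow case
```

Subtraction, negation, mixed-type arithmetic (for example b/2 with a plain integer) and printing are left out; those would need further methods and promotion rules.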
I also vaguely remember there being a helpful package designed to make it a bit easier to create custom Number types, but I can’t remember what it’s called (nor can Google find it). Perhaps someone else here will remember…
You can try FiniteFloats.jl. Infinities are avoided, as are many NaNs. I do not avoid zeros, so prevfloat(nextfloat(0.0)) will be 0.0. Please open an issue if you run into any problems.
Please let me know if you find it useful.