Abusing `convert` as an alternative to `Union{Nothing, Int64}`

In C it is common to have functions that return a positive integer result on success, or a negative integer if an error happens. When I use these functions, I tend to forget the possible error values and do integer arithmetic or memory management with them, creating harder-to-debug errors.

In Julia, in addition to throwing an error, functions can also return `Union{Nothing, Int64}` to signal failure. But this doesn’t take advantage of the fact that, at the interface level, no valid return value is negative.

In the pull request "[ChunkCodecCore] BREAKING change the return type to `MaybeSize`" (by nhz2, Pull Request #72, JuliaIO/ChunkCodecs.jl on GitHub), I am trying to use the `convert` function in Base to get a safer version of the C pattern.

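# Wraps an Int64 that is either a valid (non-negative) size or an error marker.
struct MaybeSize
    val::Int64
end
# Sentinel for "no size"; any negative `val` is treated as not a size.
const NOT_SIZE = MaybeSize(typemin(Int64))
function is_size(x::MaybeSize)::Bool
    !signbit(x.val)
end
# Unwrapping throws unless the value is a valid size.
function Base.Int64(x::MaybeSize)
    if !is_size(x)
        throw(InexactError(:Int64, Int64, x))
    else
        x.val
    end
end
function Base.convert(::Type{Int64}, x::MaybeSize)
    Int64(x)
end
# Wrapping a plain Int64 rejects negative values.
function Base.convert(::Type{MaybeSize}, x::Int64)
    if signbit(x)
        throw(InexactError(:convert, MaybeSize, x))
    else
        MaybeSize(x)
    end
end
# `nothing` maps to the sentinel...
function Base.convert(::Type{MaybeSize}, x::Nothing)
    NOT_SIZE
end
# ...and only the sentinel maps back to `nothing`.
function Base.convert(::Type{Nothing}, x::MaybeSize)
    if x !== NOT_SIZE
        throw(InexactError(:convert, Nothing, x))
    else
        nothing
    end
end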

The wrapping and unwrapping are then mostly automatic as long as the function and caller have explicit type annotations.
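
As a minimal sketch of what "mostly automatic" looks like in practice: the block below restates a subset of the definitions above in compact form so it runs on its own, and adds two hypothetical functions, `count_written` and `written_or_throw`, that are not part of ChunkCodecs.jl. The return-type annotation does the wrapping and the typed local variable does the unwrapping, both through `convert`.

```julia
# Compact restatement of a subset of the definitions above.
struct MaybeSize
    val::Int64
end
const NOT_SIZE = MaybeSize(typemin(Int64))
is_size(x::MaybeSize) = !signbit(x.val)
Base.Int64(x::MaybeSize) = is_size(x) ? x.val : throw(InexactError(:Int64, Int64, x))
Base.convert(::Type{Int64}, x::MaybeSize) = Int64(x)
Base.convert(::Type{MaybeSize}, x::Int64) =
    signbit(x) ? throw(InexactError(:convert, MaybeSize, x)) : MaybeSize(x)

# Hypothetical function: the `::MaybeSize` annotation wraps the Int64 via `convert`.
function count_written(dst::Vector{UInt8}, n::Int64)::MaybeSize
    n <= length(dst) || return NOT_SIZE
    return n  # converted to MaybeSize(n) by the return annotation
end

# Hypothetical caller: the `::Int64` annotation on the local unwraps the value,
# throwing an InexactError if the callee returned NOT_SIZE.
function written_or_throw(dst::Vector{UInt8}, n::Int64)
    len::Int64 = count_written(dst, n)
    return len
end
```

Without the `::Int64` annotation in the caller, `len` would stay a `MaybeSize`, and integer arithmetic on it would fail with a `MethodError` instead of silently using an error code.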


Why does the convert method include error handling, but the constructor does not? IMO it might be nicer to have the convert method just forward to the constructor, and use a separate function for unchecked construction?

IMO ideally you would introduce a new singleton type here, instead of abusing Nothing. The convention is for Nothing to lack specific handling, I’d say.
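
A rough sketch of what these two suggestions could look like together. All names here (`CheckedMaybeSize`, `NotSize`, `Unchecked`, `unchecked_maybe_size`, `NOT_A_SIZE`) are hypothetical and chosen only for illustration; this is not the design used in the pull request.

```julia
# Hypothetical singleton meaning "no size", used instead of `nothing`.
struct NotSize end

# Hypothetical tag type that selects the unchecked inner constructor.
struct Unchecked end

struct CheckedMaybeSize
    val::Int64
    # Checked constructor: the invariant lives here, not in `convert`.
    function CheckedMaybeSize(x::Int64)
        signbit(x) && throw(ArgumentError("size must be non-negative, got $x"))
        return new(x)
    end
    # Unchecked path, reachable only by passing the tag explicitly.
    CheckedMaybeSize(::Unchecked, x::Int64) = new(x)
end

# Separate, clearly named function for unchecked construction.
unchecked_maybe_size(x::Int64) = CheckedMaybeSize(Unchecked(), x)

const NOT_A_SIZE = unchecked_maybe_size(typemin(Int64))
is_size(x::CheckedMaybeSize) = !signbit(x.val)

# `convert` simply forwards; `nothing` gets no special treatment.
Base.convert(::Type{CheckedMaybeSize}, x::Int64) = CheckedMaybeSize(x)
Base.convert(::Type{CheckedMaybeSize}, ::NotSize) = NOT_A_SIZE

# A function that forgets its return now fails instead of yielding a sentinel.
function forgot_return()::CheckedMaybeSize
end
```

With this variant, calling `forgot_return()` raises an error because no conversion from `nothing` is defined, so an accidental missing `return` is caught early.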


Allowing conversion from nothing means that functions that don’t return anything automatically return NOT_SIZE, for example:

julia> function foo()::MaybeSize
       end
foo (generic function with 1 method)

julia> foo()
MaybeSize(-9223372036854775808)

But you are right that it would be clearer to require an explicit `return NOT_SIZE`.

MaybeSize is a plain struct, so I think the default constructor is fine.

You have discovered the “validation” part of the essay Parse, don’t validate. There’s a follow-up that expands on it: Names are not type safety.

In Julia, we sadly can’t go all the way with type safety since constructors are trivially bypassable (among other reasons), but nonetheless, putting invariant checks in constructors helps a lot for centralizing this kind of check.


I don’t really think validation is an accurate description of what the OP is doing here. They’re just saying that they have a C library that returns negative integers to denote errors, and they want to convert those error values into julia runtime errors.

:+1:

I’m not sure I agree this is abuse of Nothing. What is wrong with adding a convert method here?

Exactly! What I wanted to point out is that by wrapping without immediately throwing, this construction means that you always have to manually check for potential errors down the line when you use the MaybeSize object, even after you’ve already checked (i.e. validated) that it is in fact not an error. This muddles the source of the issue (the value returned from the C library), because you generally have no clue where convert or something like that is called next.

Barring shenanigans that circumvent the constructor, throwing in the constructor on an error means you know with pretty much 100% certainty that a MaybeSize is not a maybe at all, but a guaranteed valid thing when you go on to use it, thereby preventing indexing problems or what have you. In essence, this is what it means to parse and not to validate: encoding correctness in the types of your things.
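
To make the idea concrete, here is a small sketch assuming a hypothetical `ValidSize` type and a made-up `fake_c_call` that stands in for a C function returning a negative error code. Neither name comes from the thread; `first(buf, n)` needs Julia 1.6 or later.

```julia
# Hypothetical type: the inner constructor is the only way in, and it checks.
struct ValidSize
    val::Int64
    function ValidSize(x::Int64)
        x >= 0 || throw(ArgumentError("negative return code $x is not a size"))
        return new(x)
    end
end

# Made-up stand-in for a C function: a size on success, -1 on error.
fake_c_call(n::Int64) = n <= 8 ? n : Int64(-1)

# Parse once, at the boundary: either a ValidSize comes out or an error is thrown.
parse_size(ret::Int64) = ValidSize(ret)

# Downstream code takes a ValidSize and needs no sign check of its own.
take_prefix(buf::Vector{UInt8}, s::ValidSize) = first(buf, s.val)
```

The error surfaces at `parse_size`, right next to the call that produced the bad value, rather than at some later `convert`.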


It’s a bit more subtle than that, because some of those errors are recoverable. For example, if in-place decoding fails because the dst was too small, it can be retried with a larger dst.

Yes, Parse, don’t validate inspired a lot of the design of the interface, especially on the encoding side, where most errors can be pre-checked during construction of the encoder, before it sees any data.

On the decoding side, there is a huge range of errors that can happen, because in general decoding is like running an interpreter on a mini programming language.
For decoding, the “safe” parsing function is decode: it can only return valid data or throw an error.
All the other lower-level decoding functions need to be used with extreme care.


Yes, this seems like a downside to MaybeSize compared to Union{Nothing, Int64}. JET and the compiler seem to be pretty good at understanding these small Unions.
Also, a particular implementation might only return an Int64, and then the compiler and JET might be able to DCE any isnothing branches in the caller. I guess this is, in theory, possible with MaybeSize, but it would require the compiler to infer things about the relationships between the possible ranges of values returned.
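
One way to look at this from the REPL is sketched below. The functions `decode_general`, `decode_infallible`, and `caller` are hypothetical toys, and `Base.return_types` is an unexported reflection helper, so treat this as an exploration aid rather than a stable API.

```julia
# Hypothetical decoder that can fail: returns `nothing` when dst is too small.
function decode_general(src::Vector{UInt8}, dst::Vector{UInt8})::Union{Nothing, Int64}
    length(src) <= length(dst) || return nothing
    copyto!(dst, src)
    return length(src)
end

# Hypothetical implementation that never fails, so it only ever returns an Int64.
function decode_infallible(src::Vector{UInt8}, dst::Vector{UInt8})
    n = min(length(src), length(dst))
    copyto!(dst, 1, src, 1, n)
    return n
end

# Generic caller with an `isnothing` branch.
function caller(f, src, dst)
    ret = f(src, dst)
    isnothing(ret) && return -1
    return ret
end

# Inspect what inference concludes for each combination.
argtypes = (Vector{UInt8}, Vector{UInt8})
general_rt = only(Base.return_types(decode_general, argtypes))
infallible_rt = only(Base.return_types(decode_infallible, argtypes))
caller_rt = only(Base.return_types(caller, (typeof(decode_infallible), argtypes...)))
```

When the callee is inferred to return a plain `Int64`, `isnothing(ret)` is known to be false at compile time, which is what lets the branch be dropped. With a wrapper like MaybeSize, the type alone carries no such information.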


It attaches a different meaning (“punning”) to Nothing than the one I think is intended by Base, making conversion succeed instead of throwing with “cannot convert a value to nothing for assignment”. Admittedly, the documentation around Nothing is lacking; however, my interpretation is that Nothing is not supposed to get any specific handling in the functions defined by Base Julia apart from the few methods already defined. Specifically, nothing is not supposed to be convertible to other (non-Nothing) types. This way is safer, as unintended assignment etc. fails early.

Apart from the Missing-specific handling, this is the entire Nothing-specific conversion logic:


I don’t really see how this is a pun. It’s just saying that there is a meaningful way to convert the failure flag to a nothing value and vice versa, but no way to convert a regular value to/from nothing.

Well, yeah. That’s the punning. As far as I understand, the semantics of Nothing include nothing not being convertible to/from other types. Thus this “meaningful way” introduces a new, different meaning for the same type (Nothing), which is what we call punning.


This got me to think more about what the invariants on the return are.

For try_decode it has:

if is_size(ret)
  @assert Int64(ret) ∈ 0:length(dst)
end

But I could remove this invariant by returning ret ∈ 0:length(dst) if the decoding was successful, otherwise return ret > length(dst) if the decoding failed due to not enough space in dst.
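
A toy sketch of that alternative convention, with a retry loop on top. The names `toy_try_decode!` and `toy_decode` are hypothetical, "decoding" here is just copying, and the toy happens to return the exact size it needs, which a real decoder may not know. Other failures, such as invalid input, are out of scope for this sketch.

```julia
# Hypothetical toy decoder following the convention above:
#   ret ∈ 0:length(dst)  -> success, ret bytes were written
#   ret > length(dst)    -> dst was too small, nothing useful was written
function toy_try_decode!(dst::Vector{UInt8}, src::Vector{UInt8})::Int64
    n = length(src)
    n > length(dst) && return n  # assumption: this toy knows the needed size
    copyto!(dst, 1, src, 1, n)
    return n
end

# Retry loop: every Int64 return value has a meaning, so no extra assertion
# on the range of `ret` is needed.
function toy_decode(src::Vector{UInt8}; initial_size::Int64=4)
    dst = Vector{UInt8}(undef, initial_size)
    while true
        ret = toy_try_decode!(dst, src)
        if ret <= length(dst)
            return resize!(dst, ret)  # trim to the bytes actually written
        end
        # Too small: grow to at least the hinted size and try again.
        resize!(dst, max(ret, 2 * length(dst)))
    end
end
```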

Ah I see what you mean. Yeah, that’s fair.
