In C it is common to have functions that return a positive integer result on success, or a negative integer if an error happens. When I use these functions, I tend to forget the possible error values and do integer arithmetic or memory management with them, creating harder-to-debug errors.
In Julia, in addition to throwing an error, functions can also return Union{Nothing, Int64} for this. But this doesn’t take advantage of the fact that no valid returns are negative at the interface level.
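For comparison, here is a minimal sketch of the Union{Nothing, Int64} style. The names `c_style_write` and `try_write` are hypothetical stand-ins, not functions from any real library; the first only imitates a C function that returns a negative code on failure.

```julia
# Hypothetical stand-in for a C function: returns a byte count,
# or a negative error code when the request is too large.
c_style_write(n::Int64)::Int64 = n <= 8 ? n : Int64(-1)

# Wrap the raw code so that failure becomes `nothing` instead of a negative number.
function try_write(n::Int64)::Union{Nothing, Int64}
    ret = c_style_write(n)
    return ret < 0 ? nothing : ret
end

r = try_write(Int64(4))
# The caller has to handle `nothing` before doing arithmetic;
# `nothing + 1` would be a MethodError rather than a silently wrong number.
total = isnothing(r) ? Int64(0) : r + 1
```

The failure case can no longer leak into integer arithmetic, but the return type itself does not record that valid results are never negative.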
struct MaybeSize
    val::Int64
end
const NOT_SIZE = MaybeSize(typemin(Int64))
function is_size(x::MaybeSize)::Bool
    !signbit(x.val)
end
function Base.Int64(x::MaybeSize)
    if !is_size(x)
        throw(InexactError(:Int64, Int64, x))
    else
        x.val
    end
end
function Base.convert(::Type{Int64}, x::MaybeSize)
    Int64(x)
end
function Base.convert(::Type{MaybeSize}, x::Int64)
    if signbit(x)
        throw(InexactError(:convert, MaybeSize, x))
    else
        MaybeSize(x)
    end
end
function Base.convert(::Type{MaybeSize}, x::Nothing)
    NOT_SIZE
end
function Base.convert(::Type{Nothing}, x::MaybeSize)
    if x !== NOT_SIZE
        throw(InexactError(:convert, Nothing, x))
    else
        nothing
    end
end
The wrapping and unwrapping are then mostly automatic as long as the function and caller have explicit type annotations.
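A minimal sketch of what that automatic wrapping and unwrapping can look like. It repeats a condensed form of the definitions above so that it runs on its own; `try_copy` and `copied_bytes` are hypothetical names invented for illustration.

```julia
struct MaybeSize
    val::Int64
end
const NOT_SIZE = MaybeSize(typemin(Int64))
is_size(x::MaybeSize) = !signbit(x.val)
Base.convert(::Type{MaybeSize}, x::Int64) =
    signbit(x) ? throw(InexactError(:convert, MaybeSize, x)) : MaybeSize(x)
Base.convert(::Type{MaybeSize}, ::Nothing) = NOT_SIZE
Base.convert(::Type{Int64}, x::MaybeSize) =
    is_size(x) ? x.val : throw(InexactError(:convert, Int64, x))

# Hypothetical producer: the return annotation converts an Int64 or `nothing`
# into a MaybeSize without any explicit wrapping in the body.
function try_copy(dst::Vector{UInt8}, src::Vector{UInt8})::MaybeSize
    length(src) > length(dst) && return nothing
    copyto!(dst, src)
    return Int64(length(src))
end

# Hypothetical consumer: the annotated local unwraps via `convert`,
# which throws an InexactError if the result was NOT_SIZE.
function copied_bytes(dst::Vector{UInt8}, src::Vector{UInt8})
    n::Int64 = try_copy(dst, src)
    return n
end
```

Without the `::Int64` annotation in the caller, `n` would stay a MaybeSize and has to be checked with `is_size` before use.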
Why does the convert method include error handling, but the constructor does not? IMO it might be nicer to have the convert method just forward to the constructor, and use a separate function for unchecked construction?
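One way this suggestion could look, as a rough sketch rather than a drop-in replacement: the type name `CheckedMaybeSize`, the function `unchecked_maybe_size`, and the `Val(:unchecked)` marker are all hypothetical names chosen for this example.

```julia
struct CheckedMaybeSize
    val::Int64
    # Public constructor: rejects negative values, so the check lives in one place.
    function CheckedMaybeSize(x::Integer)
        signbit(x) && throw(ArgumentError("size must be non-negative, got $x"))
        new(x)
    end
    # Unchecked path, only meant to be reached through `unchecked_maybe_size`.
    CheckedMaybeSize(x::Int64, ::Val{:unchecked}) = new(x)
end

# Separate, explicitly named function for unchecked construction.
unchecked_maybe_size(x::Int64) = CheckedMaybeSize(x, Val(:unchecked))

# The failure sentinel is the one value that needs the unchecked path.
const NOT_CHECKED_SIZE = unchecked_maybe_size(typemin(Int64))

# `convert` simply forwards to the checked constructor.
Base.convert(::Type{CheckedMaybeSize}, x::Integer) = CheckedMaybeSize(x)
```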
IMO ideally you would introduce a new singleton type here, instead of abusing Nothing. The convention is for Nothing to lack specific handling, I’d say.
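A sketch of the singleton alternative, assuming the MaybeSize struct from the original post. `NoSize`, `NO_SIZE`, and `failing` are hypothetical names used only for illustration.

```julia
struct MaybeSize
    val::Int64
end
const NOT_SIZE = MaybeSize(typemin(Int64))

# Hypothetical dedicated failure marker, used instead of `nothing`.
struct NoSize end
const NO_SIZE = NoSize()

Base.convert(::Type{MaybeSize}, ::NoSize) = NOT_SIZE
function Base.convert(::Type{NoSize}, x::MaybeSize)
    x === NOT_SIZE || throw(InexactError(:convert, NoSize, x))
    return NO_SIZE
end

# No conversion from `nothing` is defined in this block, so a stray `nothing`
# is not silently turned into the failure value here.
failing()::MaybeSize = NO_SIZE
```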
In Julia, we sadly can’t go all the way with type safety since constructors are trivially bypassable (among other reasons), but nonetheless, putting invariant checks in constructors helps a lot for centralizing this kind of check.
I don’t really think validation is an accurate description of what the OP is doing here. They’re just saying that they have a C library that returns negative integers to denote errors, and they want to convert those error values into julia runtime errors.
Exactly! What I wanted to point out is that by wrapping without immediately throwing, this construction means that you always have to manually check for potential errors down the line when you use the MaybeSize object, even after you’ve already checked (i.e. validated) that it is in fact not an error. This muddles the source of the issue (the value returned from the C library), because you generally have no clue where convert or something like that is called next.
Barring shenanigans that circumvent the constructor, throwing in the constructor on an error means you know with pretty much 100% certainty that a MaybeSize is not a maybe at all, but a guaranteed valid thing when you go on to use it, thereby preventing indexing problems or what have you. In essence, this is what it means to parse and not to validate: encoding correctness in the types of your things.
It’s a bit more subtle than that, because some of those errors are recoverable. For example, if in-place decoding fails because the dst was too small, it can be retried with a larger dst.
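A rough sketch of such a retry loop. Both `toy_try_decode!` (a toy decoder that just copies bytes) and `decode_with_retry`, along with the `initial` and `max_size` keywords, are hypothetical and only illustrate the recoverable "dst too small" case.

```julia
# Hypothetical in-place "decoder": copies `src` into `dst` and returns the
# byte count, or `nothing` if `dst` is too small.
function toy_try_decode!(dst::Vector{UInt8}, src::Vector{UInt8})::Union{Nothing, Int64}
    length(src) > length(dst) && return nothing
    copyto!(dst, src)
    return length(src)
end

# Retry with a doubled buffer until the output fits or an assumed size cap is hit.
function decode_with_retry(src::Vector{UInt8}; initial::Int=4, max_size::Int=1 << 20)
    dst = Vector{UInt8}(undef, max(initial, 1))
    while true
        n = toy_try_decode!(dst, src)
        if !isnothing(n)
            return resize!(dst, n)   # trim to the bytes actually written
        end
        length(dst) >= max_size && error("decoded size exceeds max_size")
        resize!(dst, min(2 * length(dst), max_size))
    end
end
```

Only the "not enough space" failure is retried here; any other kind of failure would need to be reported separately and should not go through this loop.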
On the decoding side, there is a huge range of errors that can happen, because in general decoding is like running an interpreter on a mini programming language.
For decoding, the “safe” parsing function is decode; it can only return valid data, or error.
All the other lower level decoding functions need to be used with extreme care.
Yes, this seems like a downside to MaybeSize compared to Union{Nothing, Int64}. JET and the compiler seem to be pretty good at understanding these small Unions.
Also, a particular implementation might only return an Int64, and then the compiler and JET might be able to DCE any isnothing branches in the caller. I guess this is, in theory, possible with MaybeSize, but it would require the compiler to infer things about the relationships between the possible ranges of values returned.
It attaches a different meaning (“punning”) to Nothing than the one I think is intended by Base, making conversion succeed instead of throwing with “cannot convert a value to nothing for assignment”. Admittedly, the documentation around Nothing is lacking; however, my interpretation is that Nothing is not supposed to get any specific handling in the functions defined by Base Julia apart from the few methods already defined. Specifically, nothing is not supposed to be convertible to other (non-Nothing) types. This is safer, as unintended assignment etc. fails early.
Apart from the Missing-specific handling, this is the entire Nothing-specific conversion logic:
I don’t really see how this is a pun. It’s just saying that there is a meaningful way to convert the failure flag to a nothing value and vice versa, but no way to convert a regular value to/from nothing.
Well, yeah. That’s the punning. As far as I understand, the semantics of Nothing include nothing not being convertible to/from other types. Thus this “meaningful way” introduces a new, different meaning for the same type (Nothing), which is what we call punning.
This got me to think more about what the invariants on the return are.
For try_decode it has:
    if is_size(ret)
        @assert Int64(ret) ∈ 0:length(dst)
    end
But I could remove this invariant by returning ret ∈ 0:length(dst) if the decoding was successful, and otherwise returning ret > length(dst) if the decoding failed due to not enough space in dst.
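A sketch of that convention with a toy decoder. `toy_decode_sized!` is hypothetical, and returning the exact number of bytes needed on failure is an extra assumption; the idea above only requires some value greater than length(dst).

```julia
# Hypothetical decoder following the proposed convention.
function toy_decode_sized!(dst::Vector{UInt8}, src::Vector{UInt8})::Int64
    needed = length(src)
    needed > length(dst) && return needed   # failure: result exceeds length(dst)
    copyto!(dst, src)
    return needed                           # success: result is in 0:length(dst)
end

src = UInt8[1, 2, 3, 4, 5, 6]
dst = zeros(UInt8, 4)
ret = toy_decode_sized!(dst, src)
if ret > length(dst)
    # Not enough space: grow dst (here to the reported size) and retry once.
    resize!(dst, ret)
    ret = toy_decode_sized!(dst, src)
end
```

With this convention a plain Int64 is enough for the "too small" case, and the caller distinguishes success from failure with a single comparison against length(dst).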