While trying to write a clone of the NumPy API in Julia, I noticed that numpy.array has an order argument that accepts only 4 values but is otherwise an ordinary string in Python.
The 4 possible values are C, F, A, K, and I want to make sure the argument accepts only these. So the first thing I tried was a struct with an inner constructor:
    struct ARRAY_ORDER
        order::Char
        ARRAY_ORDER(order) = begin
            @assert order in ('C', 'F', 'A', 'K')
            new(order)
        end
    end
Then I thought: because the order field is typed Char, technically someone could add a constructor through type piracy and allow some other value to be stored. So that's not ideal.
Another way is to put an assert into the array function itself, but I prefer to use types to restrict the values.
AI suggests using an Enum, which is also problematic because it requires conversion, and I want to store the order as a Char as originally intended.
Then I settled on this solution, which I think is safe from type piracy:
    struct ARRAY_ORDER
        # Only these four Val types can be stored; any other Val fails to convert.
        order::Union{Val{'C'}, Val{'F'}, Val{'A'}, Val{'K'}}
        ARRAY_ORDER(order::Char) = begin
            new(Val(order))
        end
        ARRAY_ORDER(order::String) = begin
            @assert length(order) == 1 "order must be a single character"
            orderchar = order[1]  # first (and only) character of the string
            new(Val(orderchar))
        end
    end
This uses the Val constructor to turn values into types, and restricts the order field to a few distinct Val types.
I don't think there is a single best way. However, you should realize that there are tradeoffs here! When you put more information in the type domain, the compiler will try to use it! This can dramatically change the performance of the generated code.
E.g. declaring the field as order::Union{Val{'C'}, Val{'F'}, Val{'A'}, Val{'K'}} means that the compiler does not know the concrete type of the order field, which could lead to runtime dispatches. These are a lot more costly than a simple if-else check on the value. OTOH, if the type is inferable from the context, then the check can be elided, which is of course faster.
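To make the point about the field type concrete, here is a minimal sketch that compares the two field declarations. UnionOrder and CharOrder are hypothetical names used only for this illustration. It only shows what the compiler knows about the field; whether a runtime dispatch actually happens depends on union-splitting and on the surrounding code.

```julia
# Hypothetical structs, for illustration only.
struct UnionOrder
    order::Union{Val{'C'}, Val{'F'}, Val{'A'}, Val{'K'}}
end

struct CharOrder
    order::Char
end

# The Union field has no single concrete type the compiler can rely on.
union_field_is_concrete = isconcretetype(fieldtype(UnionOrder, :order))

# The Char field is always exactly a Char.
char_field_is_concrete = isconcretetype(fieldtype(CharOrder, :order))

println("Union field concrete? ", union_field_is_concrete)
println("Char field concrete?  ", char_field_is_concrete)
```

Code that reads the order field of a UnionOrder has to handle four possible types, while code that reads it from a CharOrder handles one type and compares values.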
So basically there are multiple routes:
1. Not using types to constrain the values. This is essentially the NumPy way, i.e. you have a Char field and check its value. This is simple, easy, and doesn't come with compiler/performance implications.
2. Using types to model the values but keeping them as a field. In this case I would advise against a Union, because you need to rely on union-splitting for performance, and would instead use a SumType from LightSumTypes.jl. However, conceptually this is not too different from checking the values with if-else and just has potential performance pitfalls on top imo.
3. You could make the ARRAY_ORDER struct itself parametric instead of storing the order as a field (and then constrain the possible values of the parameter in the constructor). This has the potential to be fastest as long as the type stays inferable. It is also conceptually cleanest, since the information is now completely in the type domain.
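The parametric route could look roughly like the sketch below. ArrayOrder and describe are hypothetical names, not part of any existing package, and the strings returned by describe are placeholders for whatever the real array function would do per order.

```julia
# Hypothetical parametric type: the order lives in the type parameter O,
# so the struct has no fields at all.
struct ArrayOrder{O}
    function ArrayOrder{O}() where {O}
        # Constrain the parameter to the four accepted Char values.
        (O isa Char && O in ('C', 'F', 'A', 'K')) ||
            throw(ArgumentError("order not understood: $O"))
        return new{O}()
    end
end

# Convenience constructors that lift a run-time value into the type domain.
ArrayOrder(order::Char) = ArrayOrder{order}()
ArrayOrder(order::AbstractString) = ArrayOrder(only(order))

# Dispatch on the parameter instead of branching on a field value.
describe(::ArrayOrder{'C'}) = "row-major"
describe(::ArrayOrder{'F'}) = "column-major"
describe(::ArrayOrder) = "layout decided from the input"

println(describe(ArrayOrder('C')))
println(describe(ArrayOrder("K")))
```

Note that ArrayOrder(order) called with a run-time Char is itself not inferable, since the result type depends on the value. The benefit only shows up when the order is a constant or the constructed object is passed through a function barrier.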
The only thing protecting ARRAY_ORDER from a user expanding the set of orders is that the fields of a struct are currently not redefinable, which may change one day. Assuming the user somehow knows how to implement useful new behaviors for functions like numpy.array, nothing stops them from reading your source code and defining methods that allow more orders, maybe with a variant of ARRAY_ORDER or without it at all. The tradeoff of Julia being interactive is that you can't protect users from their own creativity. Much of NumPy is safe from tampering (and reflection) because a lot of it is written in C; maybe juliac can pull something like this off.
Assuming they’re not willing to tamper with API functions, then it’s fine to do the necessary branch over the different orders and throw the “order not understood” error otherwise.
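A minimal sketch of that plain check-and-branch approach, assuming the order is kept as a Char. VALID_ORDERS, check_order and layout_name are hypothetical names, and the returned strings stand in for the real per-order behavior.

```julia
# Hypothetical helper names; the order stays a plain Char value.
const VALID_ORDERS = ('C', 'F', 'A', 'K')

function check_order(order::Char)
    # Throw instead of using @assert, so the check is always performed.
    order in VALID_ORDERS ||
        throw(ArgumentError("order not understood: '$order'"))
    return order
end

# Accept one-character strings too; only() throws if the length is not 1.
check_order(order::AbstractString) = check_order(only(order))

function layout_name(order)
    o = check_order(order)
    if o == 'C'
        return "row-major"
    elseif o == 'F'
        return "column-major"
    else
        # 'A' and 'K' depend on the layout of the input array.
        return "decided from the input array"
    end
end

println(layout_name('C'))
println(layout_name("F"))
```

This keeps everything in the value domain: the function is trivially type-stable and the cost is a couple of comparisons per call.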