I want to create a struct X that has some parameter A that is an integer. Then, in that struct, I want to have two arrays, one of dimension A and one of dimension A + 2. Is there a nice way to implement it? Something that would look like this:
struct X{A}
    x::Array{Int, A}
    y::Array{Int, A + 2}
    ...
end
But this raises the following error:
ERROR: MethodError: no method matching +(::TypeVar, ::Int64)
For now though, I believe the recommendation is to have your type’s constructor enforce this, and to make both dimensions, A and B = A + 2, parameters of the type.
struct X{A,B}
    x::Array{Int, A}
    y::Array{Int, B}
    function X(x, y)
        nx, ny = (ndims(x), ndims(y))
        ny == nx + 2 || throw(DimensionMismatch("$nx vs $ny"))
        new{nx, ny}(x, y)
    end
end
struct X{A,B}
    x::Array{Int, A}
    y::Array{Int, B}
    function X{A,B}() where {A,B}
        B == A + 2 || throw(DimensionMismatch("$A vs $B"))
        new{A,B}(zeros(Int, ntuple(zero, A)), zeros(Int, ntuple(zero, B)))
    end
end
Usage:
julia> X{1,3}()
X{1,3}(Int64[], Array{Int64}(0,0,0))
julia> X{1,4}()
ERROR: DimensionMismatch("1 vs 4")
Don’t hold your breath—that issue is on the 2.0 milestone as it is likely to require some subtle but breaking changes to the language. In the meantime, as @chakravala has linked to, you can use the ComputedFieldTypes package to get the same effect.
I used to be very excited about this feature, but I no longer think it is crucial, in fact, I think it would be a mistake to implement it.
Given Julia’s very rich type system, it is a natural first impulse to want to use it for computation. I think that almost all new users who are experienced programmers go overboard with type system gymnastics when they learn the language.
But, at the end of the day, this feature does not buy you anything in terms of elegance or speed (you can always just do checks in the inner constructors, and calculations in the outer ones).
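As a minimal sketch of "checks in the inner constructor, calculations in the outer one", assuming a hypothetical type name `XChecked` (not from the posts above) and empty arrays as placeholder data:

```julia
# Hypothetical type name; mirrors the X{A,B} examples earlier in the thread.
struct XChecked{A,B}
    x::Array{Int,A}
    y::Array{Int,B}
    # Inner constructor: only enforces the invariant B == A + 2.
    function XChecked{A,B}(x::Array{Int,A}, y::Array{Int,B}) where {A,B}
        B == A + 2 || throw(DimensionMismatch("$A vs $B"))
        new{A,B}(x, y)
    end
end

# Outer constructor: derives B from A, so callers only write XChecked{A}().
function XChecked{A}() where {A}
    x = zeros(Int, ntuple(_ -> 0, A))      # empty A-dimensional array
    y = zeros(Int, ntuple(_ -> 0, A + 2))  # empty (A + 2)-dimensional array
    return XChecked{A,A + 2}(x, y)
end

v = XChecked{1}()
typeof(v) == XChecked{1,3}
```

The second parameter still exists in the type, but the caller never has to compute or spell it.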
One issue I have with Julia’s type parameters, related to this, is that some of them are part of the public API and some of them are private. In the above example, A might be public, but B isn’t. Another one is:
Where the first parameter is public (you would potentially use it for dispatch) but the second two are internal ones (I think calculated from the first somehow but I’m not sure).
It would be nice if it was possible to distinguish between the two somehow and hide the private parameters. In particular when declaring structs with fields for, e.g. float-ranges, one has to handle those private parameters:
struct MyT{F,_F1,_F2}
    fr::StepRangeLen{F, Base.TwicePrecision{_F1}, Base.TwicePrecision{_F2}}
end
There are cases where implementation details leak out to a user through being forced to use type parameters.
For example, let’s say we want to store a symmetric static array. For a 3x3 matrix this would be 6 elements. The way to do that right now in Julia is:
struct SymmetricMatrix{T, N, K}
    data::NTuple{K, T}
end
where we compute K based on N. If we restrict ourselves to only generalizing over T we can write it as:
struct SymmetricMatrix1x1{T}
    data::NTuple{1, T}
end
struct SymmetricMatrix2x2{T}
    data::NTuple{3, T}
end
struct SymmetricMatrix3x3{T}
    data::NTuple{6, T}
end
so the fact that we want to generalize over one parameter forces us to introduce two new parameters, one of them being an implementation detail of the storage of the data.
A user wanting to store a general symmetric static matrix in a struct needs to do:
struct MyModel{T, N, K}
    K1::SymmetricMatrix{T, N, K}
    K2::SymmetricMatrix{T, N, K}
end
Omitting the K would give a type instability, even though K is statically known from N. This is a very easy mistake to make, since one never interacts with K anywhere in the code.
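To make the instability concrete, here is a small sketch that compares the field types with and without K. The names `SymStore`, `ModelLoose`, and `ModelTight` are hypothetical stand-ins for the types discussed above:

```julia
# Hypothetical stand-in for SymmetricMatrix{T, N, K}.
struct SymStore{T,N,K}
    data::NTuple{K,T}
end

# K omitted: the field type is the UnionAll SymStore{T,N,K} where K.
struct ModelLoose{T,N}
    K1::SymStore{T,N}
end

# K carried along: the field type is fully specified.
struct ModelTight{T,N,K}
    K1::SymStore{T,N,K}
end

loose_is_concrete = isconcretetype(fieldtype(ModelLoose{Float64,3}, :K1))
tight_is_concrete = isconcretetype(fieldtype(ModelTight{Float64,3,6}, :K1))
println((loose_is_concrete, tight_is_concrete))  # (false, true)
```

A non-concrete field type means the compiler cannot know the layout of `K1` from the type of the model alone, which is where the instability comes from.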
Being able to write something like
struct SymmetricMatrix{T, N}
    data::NTuple{div(N*(N+1), 2), T}
end
would be very nice in this case (and it is possible to do in other languages).
I think whether the user “cares about” the extra parameter is context-dependent; sometimes it is relevant. OTOH, the compiler always cares. Looking at it this way, this is similar to the need to parametrize fields just to get concrete types, e.g.
struct Foo{T}
field::T
end
# vs
struct Foo
field
end
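A quick way to see the difference between the two definitions is to inspect the field types. The names `FooTyped` and `FooUntyped` are hypothetical, used only so that both variants can live in one session:

```julia
# Parametrized field: the field type is known from the type parameter.
struct FooTyped{T}
    field::T
end

# Unparametrized field: the field type defaults to Any.
struct FooUntyped
    field
end

typed_field = fieldtype(FooTyped{Int}, :field)
untyped_field = fieldtype(FooUntyped, :field)
println((typed_field, untyped_field))  # (Int64, Any)
```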
I see your point that when extra parameters are computable, it would be convenient. But my understanding is that this requires that the computation is @pure, which has proven to be tricky in practice from a user perspective.
Maybe there is a nice fix, but I see this as a very minor inconvenience that I learned to work around automatically.
The value of K might be interesting for a user but there is never a reason to parameterize on it. Instead K would be retrieved with a function like nstored_elements(::SymmetricMatrix). The parameterization is only there to please the type system.
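A sketch of what that could look like today, assuming a hypothetical type `SymMat` and the accessor name `nstored_elements` from the sentence above. K stays in the type to satisfy the compiler, but the user only ever writes T and N:

```julia
# Hypothetical type; K is kept only so that the tuple field is concrete.
struct SymMat{T,N,K}
    data::NTuple{K,T}
    # Inner constructor: checks that K matches the value implied by N.
    function SymMat{T,N,K}(data::NTuple{K,T}) where {T,N,K}
        K == div(N * (N + 1), 2) ||
            throw(ArgumentError("expected $(div(N * (N + 1), 2)) elements, got $K"))
        new{T,N,K}(data)
    end
end

# Outer constructor: K is read off the tuple length, never written by the user.
SymMat{T,N}(data::NTuple{K,T}) where {T,N,K} = SymMat{T,N,K}(data)

# Accessor: the only place a user would ever ask for K.
nstored_elements(::SymMat{T,N,K}) where {T,N,K} = K

m = SymMat{Float64,3}((1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
nstored_elements(m)  # 6
```

This does not remove K from `typeof(m)`, so anyone storing a `SymMat` in a field still has to carry the third parameter.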
The point is not about parameterization in general, it is clear why this is needed and useful.
We can do operations inside generated functions, with the caveat that the computation does not update with world age. Sure, there are problems doing arbitrary computations within the type system, since the computations themselves might need to invoke the type system. All I am saying is that this causes real problems that are annoying to deal with.
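For illustration, a generated function that computes the stored-element count from N might look like the sketch below. The function name `nstored_generated` and the use of `Val` are assumptions made for this example:

```julia
# Hypothetical helper: computes N*(N+1)/2 while the method body is generated,
# so the result is spliced into the compiled code as a constant.
@generated function nstored_generated(::Val{N}) where {N}
    K = div(N * (N + 1), 2)  # runs once per distinct N, at generation time
    return :($K)
end

nstored_generated(Val(3))  # 6
```

The caveat from above applies: the generator body is not re-run when functions it calls are redefined later.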
This results in SBlade{T,V,G,B} with B(V,G) = binomial(ndims(V),G).
However, the type system allows dispatch on SBlade{T,V}, which is automatically converted into SBlade{T,V,G,B} where {G,B}, which means that if you drop the last type parameters in the list… they are automatically treated as optional.
This means that it is usually better to put the most important dispatch parameters first in the list and the optional ones last. However, it is always possible to make any parameter optional, in any position, for example
SBlade{T,V,G,B} where {V,B}
dispatches on T and G but makes V and B optional.
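Both forms can be sketched with a hypothetical four-parameter type. `Blade4` is a stand-in written for this example and is not the actual `SBlade` from Grassmann.jl; V is a plain integer here for simplicity:

```julia
# Hypothetical stand-in with the same parameter layout {T,V,G,B}.
struct Blade4{T,V,G,B}
    v::NTuple{B,T}
end

# Dropping trailing parameters leaves them free.
trailing_free = Blade4{Float64,3} == (Blade4{Float64,3,G,B} where {G,B})

# Dispatch on T and V only; G and B are left free.
tv(::Blade4{T,V}) where {T,V} = (T, V)

# Dispatch on T and G only; V and B are left free.
tg(::(Blade4{T,V,G,B} where {V,B})) where {T,G} = (T, G)

b = Blade4{Float64,3,1,3}((1.0, 2.0, 3.0))
println((trailing_free, tv(b), tg(b)))
```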
In my experience, if you already know that the type parameters are bits types, then you are safe declaring methods involving them as @pure, which I have already done extensively in practice with the Grassmann.jl package.
So you are right, @pure is needed in many cases, but it can be used safely if you have a stable design.
For example, N is always going to be an integer and V a VectorSpace.
For what it’s worth, in this case (and many others), the “public” type parameters are available through a supertype: AbstractVector{T}. If you’re explicitly dispatching against StepRangeLen, then you’re already doing something that’s specific to StepRangeLen and isn’t generic, by definition.
Now, sure, if you don’t have that public supertype you do need to be careful about which parameters are stable and which are implementation details, but given that you need to look at the documentation or implementation to figure out what those parameters mean in the first place, I don’t find it so abhorrent.
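As a small sketch of dispatching through the supertype instead of the concrete range type (the function name `element_kind` is hypothetical):

```julia
# Only the public parameter T of AbstractVector{T} appears in the signature;
# the TwicePrecision parameters of StepRangeLen never have to be named.
element_kind(::AbstractVector{T}) where {T} = T

r = 0.1:0.1:1.0          # a float range, stored as a StepRangeLen
println(element_kind(r)) # Float64
println(r isa StepRangeLen)
```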