Short type declarations (syntax sugar)?

sairus7 · July 7, 2019, 1:07pm

We already have some short array syntax, like Array{Int, 1} == Vector{Int}
Also, constructor short form: Int[] (but not a type declaration)

Are there any short forms for complex type declarations in function signatures and struct fields?

A couple of proposals:

We can use constructor semantics with special type-symbol :: in front of it to declare some types:

Arrays:

a = [1,2,3]
typeof(a)
>> Array{Int64,1}

# we can change this:
a::Array{Int64,1} # or Vector{Int64}
# into this, :: symbol prevents us from interpreting it like an array initializer
a::Int[] # variable-sized array
# also, for fixed-sized arrays (like MArray from StaticArrays.jl) we can add: 
a::Int[4] # this is 4-element array with constant size

Tuples:

t = (1, 3.)
typeof(t)
>> Tuple{Int64,Float64} 

# we can change this:
t::Tuple{Int64,Float64} 
# into this, :: symbol prevents us from interpreting it like a tuple of datatypes
t::(Int64, Float64)

Named tuples:

nt = (pos = 1, val = 3.)
typeof(nt)
>> NamedTuple{(:pos, :val),Tuple{Int64,Float64}} 

# that's a pretty long string to use, we can change this:
nt::NamedTuple{(:pos, :val),Tuple{Int64,Float64}} 
# into this:
nt::NamedTuple{pos::Int64, val::Float64}
# or even shorter:
nt::(pos::Int64, val::Float64)

We can hide some default type names for composable types:

Unions:

u::Union{T1, T2} 
# can be shortened to:
u::{T1, T2}

Unions with Missing, link to the original proposal

u::Union{T,Missing}
# replace with
u::T? 

u::Union{T1, T2, Missing}
# replace with
u::{T1, T2}?

There are some additional options from Pair, Dict and Array constructors I’m not sure how to use, like:

x::[T1, T2]
x::(T1=>T2)

and so on.

Any other ideas?

yuyichao · July 7, 2019, 2:04pm

sairus7:

# into this, :: symbol prevents us from interpreting it like an array initializer
a::Int[] # variable-sized array
# also, for fixed-sized arrays (like MArray from StaticArrays.jl) we can add: 
a::Int[4] # this is 4-element array with constant size

No, all of this syntax is already taken.

This too. There was talk about making this {Int64, Float64}, but the syntax seems too precious to use for now.

Still valid syntax that cannot be used.

No for the same reason.

In general, none of what you have are “complex type declarations”; each is just a single type with one level of type parameters.

sairus7 · July 7, 2019, 2:26pm

I don’t understand, is this syntax planned for some future release, and just cannot be used for now?
By “complex type declarations” I mean declarations that take more symbols to write out.

Tamas_Papp · July 7, 2019, 3:11pm

This has nothing to do with syntax. It just comes from

const Vector{T} = Array{T,1}

which is a generic approach and available for all parametric types.
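A minimal sketch of that generic approach, applied to the named tuple type from the original post. The names `PosVal` and `Record` are hypothetical and exist only for this illustration.

```julia
# Hypothetical alias, defined the same way as Vector{T} = Array{T,1}.
const PosVal{P,V} = NamedTuple{(:pos, :val),Tuple{P,V}}

# The alias names the very same type; no new type is created.
same_type = PosVal{Int64,Float64} === NamedTuple{(:pos, :val),Tuple{Int64,Float64}}

# It can be used wherever a type is expected, for example as a field type.
struct Record
    item::PosVal{Int64,Float64}
end

r = Record((pos = 1, val = 3.0))
r.item.val
```

No parser support is involved here: `PosVal` is an ordinary constant bound to a parametric type.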

As for the rest, you are essentially proposing syntax used for values and function calls to describe parametric types. I don’t think this is necessary, nor is it a good idea: at the moment we have a nice general syntax for parametric types, and you would introduce a lot of special cases.

In general, as a language matures the syntax needs to be touched less and less. This is especially true for Julia, which has very powerful constructs at zero or negligible cost, so little extra syntax is needed. For the rest, you have macros.

StefanKarpinski · July 7, 2019, 3:39pm

There are not different syntactic rules in Julia for “value context” and “type context”. All code evaluates the same everywhere: T[4] means getindex(T, 4) whether it occurs in a loop body or after a ::. That’s what it means that this syntax is already taken: it already means something, whether it seems like a useful meaning or not. For example, you could have defined Int to be a vector of types (it’s not a keyword, just a name that is imported by default), and writing a::Int[4] would declare (or assert depending on the usage) a to be of the type stored in the fourth slot. Like this:

# needs to be in a fresh REPL session that has never used `Int`

julia> Int = [String, Vector{Integer}, ComplexF64, Rational{BigInt}]
4-element Array{DataType,1}:
 String
 Array{Integer,1}
 Complex{Float64}
 Rational{BigInt}

julia> struct Foo
           a::Int[4]
       end

julia> Foo(123)
Foo(123//1)

julia> fieldtype(Foo, :a)
Rational{BigInt}

What it means to say that this syntax is “already taken” is that this code, which uses the proposed syntax, already works (doesn’t error) and means something different than what you proposed for it. Similarly for most of the syntaxes you’ve proposed—they already mean something in “value context”, and since there is no separate syntactic type context, they already mean something everywhere.
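One way to see this, as a rough sketch, is to parse the proposed declarations without evaluating them and look at what sits to the right of `::`. The variable names below are only for illustration.

```julia
# Parse (but do not evaluate) two of the proposed declarations.
arr_decl = Meta.parse("a::Int[4]")
tup_decl = Meta.parse("t::(Int64, Float64)")

arr_decl.head          # the declaration itself has head :(::)
arr_decl.args[2]       # an ordinary indexing expression, Int[4]
tup_decl.args[2].head  # an ordinary tuple expression

# Evaluated, (Int64, Float64) is a tuple of types rather than a type,
# so asserting a value against it fails at run time.
err = try
    Core.typeassert(1, (Int64, Float64))
catch e
    e
end
err isa TypeError
```

The right-hand sides are the same expressions the parser would produce anywhere else in a program.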

One of these is of particular historical interest: the syntax for tuple types once was (Int, Float64). However, because Julia doesn’t have separate syntactic modes or contexts for values and types, that meant that the type of a tuple was a tuple of types. In other words, (Int, Float64) was both a tuple of types and the type of the tuple (1, 2.3). This was a pretty cute arrangement, but it caused a lot of problems for type inference since it was hard for the compiler to know, given something that was a tuple, if it was going to be used as the type of something or as a value. Because of this confusion, the syntax for tuple types was changed to Tuple{Int, Float64} in 0.4 if I recall correctly. Now we can distinguish the type of (1, 2.3), which is Tuple{Int, Float64}, from the tuple of types (Int, Float64), and even more importantly, so can the compiler.
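The distinction as it stands today can be checked directly. This is a small sketch of current behavior, not of the pre-0.4 arrangement; the variable names are illustrative.

```julia
# The type of a tuple of values is a Tuple{...} type.
type_of_values = typeof((1, 2.3))

# A tuple of types is itself just a value, with its own Tuple{...} type.
type_of_types = typeof((Int, Float64))

tuple_is_type = (Int, Float64) isa Type       # a tuple of types is not a type
curly_is_type = Tuple{Int, Float64} isa Type  # the Tuple{...} form is
```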

StefanKarpinski · July 7, 2019, 3:47pm

Also note that while you might be tempted to define getindex(T::Type, n::Int) to return a static vector type with element type T and length n, so that one can use the Int[4] syntax to declare a field to be of that type, this syntax pun has already been used for constructing a vector with element type T containing the value n:

# new REPL session with the default meaning of `Int`

julia> Int[4]
1-element Array{Int64,1}:
 4

julia> @which Int[4]
getindex(::Type{T}, x) where T in Base at array.jl:366

I always feel a bit guilty about this pun, but it’s just so natural that many people don’t even realize they’re indexing into a type object when they do this.
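Spelled out, the pun looks like this. It is a short sketch using only `getindex` on type objects, as described above.

```julia
# Typed array literals are calls to getindex on the element type.
v = Int[4]
same_as_call = v == getindex(Int, 4)

# More elements are simply more arguments; values are converted to the element type.
w = Float64[1, 2]
same_as_call2 = w == getindex(Float64, 1, 2)

# With no arguments the result is an empty vector of that element type.
e = Int[]
```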

foobar_lv2 · July 7, 2019, 3:58pm

Thanks for the interesting historical tangent!

In this context, I have to ask: Why did we decide against {T1,T2} as a shorthand for Tuple{T1,T2}? Very naively, this syntax looks free (Error: syntax: { } vector syntax is discontinued), natural, and the current Tuple{...} is pretty verbose.

StefanKarpinski · July 7, 2019, 4:04pm

As @yuyichao mentioned, we might still, but it’s a really valuable bit of syntax and we might instead want to use it for something more important than tuple types, which don’t need to be written all that often.

Tamas_Papp · July 7, 2019, 5:11pm

I think that just leaving it as syntax that is parsed, but otherwise free of predefined meaning (which is pretty much the status quo) is also a viable option. Then macros / DSLs can take advantage of it without risk of confusion/punning.

For example, PGFPlotsX.@pgf does this.
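As a rough sketch of how a macro can give the braces a meaning of its own, the hypothetical `@tupletype` below rewrites `{T1, T2}` into `Tuple{T1, T2}`. The macro name is invented for this illustration and is not how PGFPlotsX uses the syntax.

```julia
# Hypothetical macro: rewrites {T1, T2, ...} into Tuple{T1, T2, ...}.
macro tupletype(ex)
    if !(ex isa Expr && ex.head === :braces)
        error("@tupletype expects an expression of the form {T1, T2, ...}")
    end
    # Escape the element types so they resolve in the caller's scope.
    return Expr(:curly, :Tuple, map(esc, ex.args)...)
end

# The braces parse fine; only evaluating them directly is an error.
braces_head = Meta.parse("{Int64, Float64}").head

T = @tupletype({Int64, Float64})
```

Because the macro sees the expression before it is lowered, the braces never reach the point where they would raise the "discontinued" error.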

sairus7 · July 7, 2019, 5:14pm

Thank you for the explanation!
Actually, I started thinking about this when I had to replace a structure with a named tuple as a field inside another structure (I didn’t want to declare additional nested structures somewhere), and found that I needed a long string like NamedTuple{(:pos, :val),Tuple{Int64,Float64}}. Do you think that at least changing it to NamedTuple{pos::Int64, val::Float64} is not possible either?

StefanKarpinski · July 7, 2019, 5:24pm

NamedTuple types definitely need better syntax.

Tamas_Papp · July 7, 2019, 6:29pm

I am not sure about this. I use NamedTuples a lot, but I rarely ever write actual NamedTuple types out explicitly.

yuyichao · July 7, 2019, 8:09pm

No, that’s impossible since the syntax already has a meaning. If you just want to save some typing, you can use a macro to do the transformation.
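A minimal sketch of such a macro, assuming the fields are written as `name::Type`. The names `@nt`, `PosValNT`, and `Holder` are hypothetical.

```julia
# Hypothetical macro: @nt(a::A, b::B) expands to NamedTuple{(:a, :b), Tuple{A, B}}.
macro nt(fields...)
    names = QuoteNode[]
    types = Any[]
    for f in fields
        if f isa Expr && f.head === :(::) && length(f.args) == 2 && f.args[1] isa Symbol
            push!(names, QuoteNode(f.args[1]))  # field name as a quoted symbol
            push!(types, esc(f.args[2]))        # field type, resolved in the caller's scope
        else
            error("@nt expects fields of the form name::Type")
        end
    end
    return Expr(:curly, :NamedTuple,
                Expr(:tuple, names...),
                Expr(:curly, :Tuple, types...))
end

const PosValNT = @nt(pos::Int64, val::Float64)

struct Holder
    item::PosValNT
end

h = Holder((pos = 1, val = 3.0))
```

Julia 1.5 and later ship a macro along these lines in Base, written `@NamedTuple{pos::Int64, val::Float64}`.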

StefanKarpinski · July 8, 2019, 12:11am

Yes, it’s not something you have to write very often which is why it’s been left so long but man, the NamedTuple{(:a, :b), Tuple{Int, Float64}} syntax is rough.

Tamas_Papp · July 8, 2019, 5:18am

I don’t know, it kind of grew on me. I don’t think it can be made much simpler, and I appreciate the consistency that it is a parametric type like any other.

StefanKarpinski · July 8, 2019, 3:43pm

Here’s an idea. This syntax is available:

julia> T{a=Int, b=Float64}
ERROR: syntax: misplaced assignment statement in "T{a = Int, b = Float64}"

So maybe we could generically make T{a=Int, b=Float64} mean

T{(:a, :b), Tuple{Int, Float64}}

The only part that isn’t entirely generic is:

why are the keys turned into a tuple of symbols while the values are turned into a tuple type?

If the way of writing NamedTuple had been NamedTuple{(:a, :b), (Int, Float64)} instead, then we could just define T{a=x, b=y} to mean T{(:a, :b), (x, y)} and have a nice syntax for this type. This is sort of a “parametric type keyword arguments” idea. If we only allowed types as type parameters, the Tuple type part would be justified, but we allow values as well.

I don’t think we can quite fix this without breaking code that already uses the NamedTuple type, but maybe there could be a special case rule that if all the values are types then it gets turned into a tuple type, but if some of the values are non-types then it gets turned into a tuple. That seems unfortunately complex, however.
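The first variant of the idea (keys become a tuple of symbols, values become a tuple type) can be tried out today with a macro, since the expression parses even though it does not lower. This is only a sketch: `@kwparams` is a hypothetical name, and because the head the parser assigns to each `a=Int` entry is treated as an assumption, the code accepts either `=` or `kw`.

```julia
# Hypothetical macro emulating the proposed rule:
#   T{a=X, b=Y}  becomes  T{(:a, :b), Tuple{X, Y}}
macro kwparams(ex)
    if !(ex isa Expr && ex.head === :curly)
        error("@kwparams expects an expression of the form T{a=X, b=Y}")
    end
    names = QuoteNode[]
    vals = Any[]
    for p in ex.args[2:end]
        if p isa Expr && p.head in (:(=), :kw) && p.args[1] isa Symbol
            push!(names, QuoteNode(p.args[1]))
            push!(vals, esc(p.args[2]))
        else
            error("@kwparams expects parameters of the form name=Type")
        end
    end
    return Expr(:curly, esc(ex.args[1]),
                Expr(:tuple, names...),
                Expr(:curly, :Tuple, vals...))
end

NT = @kwparams(NamedTuple{a=Int, b=Float64})
```

The sketch always builds a `Tuple{...}` type from the values, so it does not address the case where some parameters are non-type values.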

StefanKarpinski · July 8, 2019, 4:00pm

Ironically, this wouldn’t have been an issue pre-0.4, when (Int, Float64) was a tuple type.

foobar_lv2 · July 8, 2019, 4:01pm

The generic transformation you proposed looks OK to me. After all, ccall is the odd one with (argT1, argT2), and most others (named tuple, dispatch types, Core.Intrinsics.llvmcall) use Tuple{argT1, argT2}.

(I understood your proposal as a new lowering rule that transforms T{a=b, c=d} into T{(:a, :c), Tuple{b,d}}, after macro expansion and before inference, in order to leave more syntax for DSLs.)

jeff.bezanson · July 8, 2019, 4:06pm

I don’t think writing a type of the form T{tuple, tuple} is all that generically useful. Maybe in the future we’ll have named type parameters instead of just positional type parameters, which could be quite interesting, and would be a better use of the syntax.

NamedTuples have their own needs. Using a Tuple type as the second parameter was quite deliberate and very useful. For example, given NamedTuple{names, T} you can write T(x) to convert to a corresponding tuple type. And many routines (e.g. in the compiler) that know about tuple types can be reused for named tuples by passing T to them. There are not as many operations on tuples of types.
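A small sketch of the point about reusing the second parameter. The helper names `astuple` and `tupletype` are hypothetical.

```julia
# Bind the NamedTuple's type parameters and use T as a constructor.
astuple(x::NamedTuple{names,T}) where {names,T} = T(x)

# Extract the tuple type so it can be handed to code that understands tuple types.
tupletype(::Type{NamedTuple{names,T}}) where {names,T} = T

nt = (pos = 1, val = 3.0)

t = astuple(nt)                      # plain tuple with the same values
TT = tupletype(typeof(nt))           # Tuple{Int, Float64}
ft = fieldtypes(TT)                  # an existing tuple-type routine applied to T
```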

StefanKarpinski · July 8, 2019, 4:06pm

ccall is very old: it predates macros (or it would have been a macro, probably) and it predates the change to writing tuple types as Tuple{T1, T2} instead of (T1, T2).

But the issue here is that you might want to pass values as type parameters, not just types. T{(:a, :b), (1, :foo)} is a valid parametric type but T{(:a, :b), Tuple{1, :foo}} is not valid.
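To illustrate the valid half of that statement, here is a sketch with a hypothetical parametric type `Tagged` whose second parameter is a tuple of values in one case and a tuple type in the other.

```julia
# Hypothetical parametric type with two unconstrained parameters.
struct Tagged{K,V} end

A = Tagged{(:a, :b),(1, :foo)}          # values as the second parameter
B = Tagged{(:a, :b),Tuple{Int,Symbol}}  # a type as the second parameter

# The parameters are stored as given, so the two are different types.
second_of_A = A.parameters[2]
second_of_B = B.parameters[2]
```

A generic rule that always wrapped the values in `Tuple{...}` could express `B` but not `A`.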
