On type annotations

I wonder if there are any risks or issues with type annotations. In short, together with colleagues I am writing a coding standard; all Julia code at my company must be implemented following this standard. Some of our packages contain hundreds of structs, so annotating functions and variables seems like a good idea for the readability and maintainability of the code. The annotation guarantees that a variable will not change type for as long as it lives, and the type annotation on a function guarantees its return type. I know that under some circumstances convert is called. We intend to use static code checking to make sure that functions and variables are type annotated, so there will be no way for developers to escape the annotations.

Of course, annotating the arguments of a function reduces its re-usability for other types, but that does not concern us.

To be explicit, we have in mind to type annotate functions, function arguments and variables like so:

function f(value::Float64, obj::MyStruct)::OtherStruct
    vec::Vector{Float64} = [1.0, 2.0, 3.0]
    other::OtherStruct = calculate_other_struct(value, obj, vec)
    return other
end

If the type annotation is hard-coded as the inferred type, we can’t imagine there is any problem. Is there something we miss?
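To illustrate the convert behaviour (a minimal sketch; make_vec is just an example name), assigning a value of a different but convertible type to an annotated variable silently converts instead of erroring:

julia> function make_vec()
           v::Vector{Float64} = [1, 2, 3]  # integer literals; convert is called silently
           return v
       end
make_vec (generic function with 1 method)

julia> make_vec()
3-element Vector{Float64}:
 1.0
 2.0
 3.0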

1 Like

Not sure whether this is an issue for you or whether you are referring to something else, but just to be sure:

julia> function f(value::Float64)
           vec::Vector{Float64} = [1.0, 2.0, 3.0]
           vec = Int32.(vec)
       end

f (generic function with 1 method)

julia> f(1.0)
3-element Vector{Int32}:
 1
 2
 3

Are you sure? If you make the input types Float64 you can no longer use automatic differentiation (for which you need dual numbers), which comes in handy when you do machine learning or optimization, for example.

And if you have vec::Vector{Float64} you cannot pass an SVector any longer. But an SVector is much faster for small vectors…
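A minimal sketch of the automatic differentiation point, assuming the ForwardDiff package is available (square_strict and square_generic are just example names):

julia> using ForwardDiff

julia> square_strict(x::Float64) = x^2
square_strict (generic function with 1 method)

julia> square_generic(x::Real) = x^2
square_generic (generic function with 1 method)

julia> ForwardDiff.derivative(square_generic, 3.0)
6.0

julia> ForwardDiff.derivative(square_strict, 3.0)
ERROR: MethodError: no method matching square_strict(::ForwardDiff.Dual{…})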

6 Likes

You are not observing the type of vec there but returning the value of the right hand side of the last assignment.

4 Likes

Ah of course, thank you. To correct:

julia> function g(value::Float64)
           vec::Vector{Float64} = [1.0, 2.0, 3.0]
           vec = ["a", "b", "c"]
           return vec
       end
g (generic function with 1 method)

julia> g(1.0)
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64

The return type must be annotated too in the function signature. What I did not mention in the question is that a return statement with a variable will also be mandatory. So, according to the standard, the function would be implemented like so:

function f(value::Float64)::Vector{Float64}
    vec::Vector{Float64} = [1.0, 2.0, 3.0]
    vec = Int32.(vec)
    println(typeof(vec), vec)
    return vec
end


r = f(2.0) # Annotation intentionally omitted for evaluation
println(typeof(r), r)

with output

Vector{Float64}[1.0, 2.0, 3.0]
Vector{Float64}[1.0, 2.0, 3.0]

IMHO, the annotations for the input/output of the functions are ok. They reduce flexibility, but are actually useful for finding errors.

The annotations *inside* the functions I don't find appealing. Something like:
  vec::Vector{Float64} = [1.0, 2.0, 3.0]

only guarantees that, at that line, one doesn’t assign to vec something that is not a Vector{Float64}. But it doesn’t provide any guarantee about what vec will be elsewhere inside the function, and it just adds a lot of boilerplate.

(to increase the utility of type annotations of function input and output, of course, try to use the smallest functions possible)

2 Likes

We will not be writing code for the community or anything like that, so re-usability is probably not such an issue. Our data structures (and structs) tend to be rather larger than what I have seen elsewhere in my career.

Are you sure you never want to use your functions in the context of an optimization algorithm? Never with distributions because you never have to determine error margins?

You could use a type assertion in the return line too if you want an explicit check without the convert, and I think JET will still pick up the type when checking the code. There is an old thread that looks at the performance impact of the type assertion. Like this:

return other::OtherStruct
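For example (a minimal sketch), the assertion throws a TypeError instead of converting when the value has the wrong type:

julia> function h()
           x = 1               # Int, not Float64
           return x::Float64   # assertion: no conversion is attempted
       end
h (generic function with 1 method)

julia> h()
ERROR: TypeError: in typeassert, expected Float64, got a value of type Int64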

1 Like

This looks like what we want to prevent indeed. Such type changes make it very hard to reason about the code in lengthy functions.

There is a simpler usability issue with annotating with Vector{Float64}, which is the impossibility of passing slices:

julia> f(x::Vector{Float64}) = x
f (generic function with 1 method)

julia> x = rand(3);

julia> f(@view(x[1:2]))
ERROR: MethodError: no method matching f(::SubArray{Float64, 1, Vector{Float64}, Tuple{UnitRange{Int64}}, true})

(that can be solved by using AbstractVector{Float64}, which also covers the possible uses of StaticArrays).
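A minimal sketch of that, assuming the StaticArrays package is available:

julia> using StaticArrays

julia> g(x::AbstractVector{Float64}) = sum(x)
g (generic function with 1 method)

julia> g(@view([1.0, 2.0, 3.0][1:2]))   # views are accepted
3.0

julia> g(SVector(1.0, 2.0))             # and so are static vectors
3.0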

3 Likes

I don’t follow. I think the annotation makes sure that vec remains of type Vector{Float64} until it is garbage collected. Can you clarify your point with a code snippet?

1 Like

Sorry, you’re right (I confused it with the global scope, where the annotation is just a conversion at the typed line). Another pattern that might be interesting is to use local, to not mix annotations with assignments.

julia> function f()
           local y::Vector{Float64}
           y = [1.0, 2.0]
           return y
       end

1 Like

This sort of annotation I would advise against. It is redundant and non-idiomatic, essentially line-noise, at worst causing performance loss or bugs.

If you want to assert a type like this, do it on the right-hand side:

vec = foo()::Vector{Float64}
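A sketch of how that catches a mismatch (foo here is a hypothetical helper standing in for your own code):

julia> foo() = Any[1.0, 2.0]             # returns Vector{Any}, not Vector{Float64}
foo (generic function with 1 method)

julia> vec = foo()::Vector{Float64}      # asserts, does not convert
ERROR: TypeError: in typeassert, expected Vector{Float64}, got a value of type Vector{Any}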

Also, it just seems like bad taste, imo, to use a type annotation on a literal value that is guaranteed by definition to produce the correct type, like e.g.

str::String = "hello" 

3 Likes

Our company is large, so I can never say never. What specific functionality or package did you have in mind that will be blocked by the type annotations?

Any optimizer that uses automatic differentiation, e.g. Introduction · JuMP, but there are many more. Anyone who tries to use distributions, e.g. Getting Started · Distributions.jl. Anyone who tries to use physical units, e.g. Home · Unitful.jl.
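For instance, a minimal sketch with Unitful, assuming the package is available (speed is just an example name): the Float64 restriction rejects quantities that carry units.

julia> using Unitful

julia> speed(d::Float64, t::Float64) = d / t
speed (generic function with 1 method)

julia> speed(100.0u"m", 9.58u"s")
ERROR: MethodError: no method matching speed(::Quantity{Float64, …}, ::Quantity{Float64, …})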

1 Like

I just realized that it is not redundant:

julia> function f()
           y::Vector{Float64} = [1.0, 2.0]
           y = ["a", "b"]
           return y
       end
f (generic function with 1 method)

julia> f()
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64

julia> function g()
           y = [1.0, 2.0]
           y = ["a", "b"]
           return y
       end
g (generic function with 1 method)

julia> g()
2-element Vector{String}:
 "a"
 "b"

1 Like

Indeed, not redundant at all. It also becomes very hard to write type-unstable code when all types are made explicit and, as said, an annotated variable cannot change type any more.

Can you explain the benefit of annotating on the right-hand side of a function call? I guess this annotation will have to be repeated for every call to this function, which seems to make the caller responsible for part of the contract between caller and callee. This becomes a bit complicated in nested calls, I think.

This is perhaps debatable, especially the “readability”. The option to omit cluttering type annotations is a feature of Julia that is not available in other “performant” languages like C++, and I would not remove it via a coding standard. Arguably, the readability issue lies with type annotating everything.

It is perhaps more important to invest in testing, especially checking whether @inferred types are ok. This would substitute for annotating function return types and, in addition, it ensures efficient code, which is important in complex projects.
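A minimal sketch of that kind of test, using @inferred from the Test standard library (unstable and stable are just example names):

julia> using Test

julia> unstable(x) = x > 0 ? 1.0 : 0     # may return Float64 or Int
unstable (generic function with 1 method)

julia> stable(x) = x > 0 ? 1.0 : 0.0     # always returns Float64
stable (generic function with 1 method)

julia> @inferred stable(1.0)             # passes: inference gives a concrete type
1.0

julia> @inferred unstable(1.0)           # fails: inference gives a Union
ERROR: return type Float64 does not match inferred return type Union{Float64, Int64}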

4 Likes