How is it that new Julia programmers tend to abuse type annotations?

Well yes, when you know what you are doing, a type restriction is in fact a nice feature, and you might have plenty of reasons to use one. But my concern is that people throw type restrictions all around assuming they will improve performance, which they do not. I am not claiming that type restrictions themselves are bad, simply that this mistaken assumption tends to produce too many of them and causes problems later (as in the aforementioned classic ForwardDiff case).
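To make the ForwardDiff-style failure concrete, here is a minimal sketch. `ToyDual` is a hypothetical stand-in for a dual-number type (it is not ForwardDiff's actual `Dual`), and `cube_strict` / `cube_generic` are illustrative names. The point is only that a `Float64` annotation buys no speed but does reject any other number type passed through the function.

```julia
# Hypothetical toy dual number: a value plus a derivative part.
# This is NOT ForwardDiff's actual type, only an illustration.
struct ToyDual <: Real
    val::Float64
    der::Float64
end

# Product rule, which is all this example needs.
Base.:*(a::ToyDual, b::ToyDual) =
    ToyDual(a.val * b.val, a.val * b.der + a.der * b.val)

# Over-constrained: accepts only Float64, with no performance gain.
cube_strict(x::Float64) = x * x * x

# Generic: compiled separately for each concrete argument type it is called with.
cube_generic(x) = x * x * x

x = ToyDual(2.0, 1.0)        # value 2, seeded derivative 1

d = cube_generic(x)
println(d.val, " ", d.der)   # prints 8.0 12.0 (x^3 and 3x^2 at x = 2)

try
    cube_strict(x)
catch err
    println(typeof(err))     # MethodError: the Float64-only method rejects ToyDual
end
```

Both functions behave identically for plain `Float64` input; only the annotated one breaks when a different number type flows through it.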

5 Likes

Perhaps one actionable output of this process could be better documented guidance about when it makes sense to add type annotations in functions? Some additional ideas on top of what’s been mentioned already:

  • For production (e.g. package) code, methods may want to constrain their type signatures if they’re unsure the functions they call can support arbitrary types. This is the AbstractArray case @mkitti mentioned.
  • Conversely, methods should not over-constrain their types if they know all the functions they’ll call can support those types:
# unnecessary
square(a::Int) = a * a
square(a::Float32) = a * a
square(a::...) = a * a
...

# `*` is defined for all number types
square(a::Number) = a * a
  • One challenge here is that not all types/functions have well-defined contracts around what operations/types they support. However, this is something we can work on as a community and has been talked about enough already that I’ll leave it at that.
  • When one needs to check for some invariant that can’t be represented in the type system or would complicate dispatch, there’s nothing wrong with a dynamic check! Throwing an error early is better than letting one bubble up from deep within a method’s call stack at some unintelligible location. When dispatch is our hammer, many of us have been guilty of seeing everything as a nail. So while this point isn’t new, I see this lack of runtime error checking enough in the ecosystem that it bears repeating.
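As a sketch of the last point, here is a hypothetical function (`median_of_sorted` is an illustrative name, not from any package) whose precondition, "the input is sorted", cannot reasonably be expressed in the signature. It is checked at runtime and fails early with a readable message.

```julia
# Hypothetical example: "sorted, non-empty input" is an invariant the
# type system does not capture, so it is verified with a dynamic check.
function median_of_sorted(v::AbstractVector{<:Real})
    isempty(v) && throw(ArgumentError("input must not be empty"))
    issorted(v) || throw(ArgumentError("input must be sorted in ascending order"))
    n = length(v)
    lo = firstindex(v) + div(n - 1, 2)   # lower middle index
    return isodd(n) ? float(v[lo]) : (v[lo] + v[lo + 1]) / 2
end

println(median_of_sorted([1, 2, 3]))     # 2.0
println(median_of_sorted([1, 2, 3, 4]))  # 2.5

try
    median_of_sorted([3, 1, 2])
catch err
    println(err.msg)                     # the message from the sortedness check
end
```

The signature stays general (`AbstractVector{<:Real}`), while the error for bad input surfaces at the function boundary rather than somewhere deeper in the call stack.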
4 Likes

That should just be Any. Strings, for example, support multiplication, too, and the square makes sense for them, too.
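For illustration, a quick sketch of what the unannotated version accepts. It assumes nothing beyond Base; whether "squaring" a string is meaningful is a separate question from whether the call works.

```julia
square(a) = a * a             # no annotation: anything with a suitable `*` works

println(square(3))            # 9
println(square(1.5))          # 2.25
println(square(1 // 2))       # 1//4
println(square("ab"))         # abab (string `*` is concatenation)
println(square([1 2; 3 4]))   # matrix product: [7 10; 15 22]
```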

1 Like

The last missing piece here may be that Julia works differently from many statically typed, compiled languages in that it’ll specialize on types as it encounters them at call sites, rather than what is declared in a method definition. One analogy with C++ is that (almost) all arguments in Julia functions are auto, which means they can be specialized upon without specifying explicit concrete types.

The Methods page of the manual (Methods · The Julia Language) talks about this briefly, but one loose heuristic I have found helpful is to insert implicit type parameters in your head. Thus, this

f(a, b::AbsType, c) = ...

would become

f(a::A, b::B, c::C) where {A, B <: AbsType, C} = ...

And it’s much clearer why the compiler could specialize for different combinations of A, B and C despite the definition of f not providing any concrete type bounds for them. It’s a heuristic because the compiler may choose to not specialize on them, but that can also apply for arguments with provided types.
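A small sketch of this, using `Base.return_types` to ask the compiler what it infers for different call signatures of one loosely annotated method. `addmul` is an illustrative name.

```julia
# One method, no concrete types anywhere in the signature.
addmul(a, b::Real, c) = a * b + c

# Inference is driven by the argument types at the call site,
# not by what the method definition declares.
rt_int   = Base.return_types(addmul, (Int, Int, Int))
rt_float = Base.return_types(addmul, (Float64, Float64, Float64))

println(length(methods(addmul)))  # 1: still a single method definition
println(only(rt_int))             # Int64 on a 64-bit machine
println(only(rt_float))           # Float64
```

Each signature gets its own inferred (and, once called, compiled) specialization even though `b::Real` is the only annotation present.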

1 Like

I hope it was clear that this is a purely illustrative example.

1 Like

IMO:

  1. Use type assertions (on an expression within a function body) abundantly, whenever it makes sense. When calling another function, say f, it often makes sense to assert that the obtained value is of correct type: f(arg)::T.

  2. Try not to use type annotations when declaring local variables.

  3. The dispatch/method case is much more nuanced and trickier. Some notes:

    a. Prefer traits to abstract types. In fact, try not to declare or use any abstract type unless necessary.

    b. In the case of private API, avoid unnecessary type constraints. For public API, though, it often makes sense to restrict the accepted types for the parameters of a method.
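A sketch of point 1, assuming a hypothetical `settings` dictionary with abstractly typed values. The lookup itself cannot be inferred, but asserting the type of the result gives the compiler (and the reader) a firm guarantee for everything after it.

```julia
# Values are typed `Any`, so a lookup has no inferable result type.
settings = Dict{String,Any}("iterations" => 10, "tolerance" => 1e-6)

# Without an assertion, inference cannot tell what the lookup returns.
next_unchecked(d) = d["iterations"] + 1

# With an assertion, the code after the lookup is known to work on an Int,
# and a value of the wrong type fails right here with a TypeError.
next_checked(d) = d["iterations"]::Int + 1

println(next_checked(settings))                                          # 11
println(only(Base.return_types(next_checked, (Dict{String,Any},))))     # Int64 on 64-bit
```

Note that this is an assertion on an expression inside the function body, not a restriction on the function's arguments.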

1 Like

It could be a good feature for the language server.

2 Likes

How about the Argument-type declarations section of the manual, which was written specifically to provide such guidance? What improvements would you suggest?

Would it be possible to put this text inside a box like the text about Shared memory between arguments?

In general, you should use the most general applicable abstract types for arguments, and when in doubt, omit the argument types. You can always add argument-type specifications later if they become necessary, and you don’t sacrifice performance or functionality by omitting them.

I believe that practical aspects like this should be highlighted more.

3 Likes

Those are specific cases that make sense but the general cases don’t.

  • string is different from FloatXX because only the latter is a type, and a concrete one at that. Capitalized types have associated lowercase functions that are not expected to always return that type. An API-compliant string method could sensibly return other subtypes of AbstractString. It could be even less restricted, like Iterators.reverse. Generally, Reverse(Reverse(x)) wraps doubly, but reverse(reverse(x)) sensibly unwraps to x.
  • People definitely want concrete type constructors to only return that concrete type or throw an error; the varying return types are relegated to the lowercase functions. However, there’s no single concrete return type for abstract type constructors, including iterated unions like Complex. Guaranteeing that AbstractFloat returns an AbstractFloat doesn’t help much.

At best we might want return types for type constructors, though that would only narrow things down enough for concrete ones like Complex{Int}. Maybe we could also have a manual option for arbitrary callables; I’m not sure whether function foo::String end is a proper appropriation of the syntax or sufficiently restricted to concrete types (or perhaps small Unions).

It’s also worth pointing out that this is at best an attempted type conversion and an assertion of the return type in subsequent code. This won’t eliminate the runtime dispatch of the call, and you can’t catch a failed convert in infinite call signatures ahead of time. Statically typed languages restrict polymorphism to pull this off.
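A minimal sketch of what a declared return type does in practice (`to_count` is an illustrative name): the returned value goes through `convert` and is then asserted to be of the declared type, all at runtime.

```julia
# A declared return type converts every returned value to Int
# and asserts the result, at runtime.
function to_count(x)::Int
    return x
end

println(to_count(3))     # 3
println(to_count(4.0))   # 4 (converted from Float64 to Int)

try
    to_count(4.5)        # 4.5 has no exact Int representation
catch err
    println(typeof(err)) # InexactError
end
```

The failing call is only detected when it runs; nothing about the declaration rules it out ahead of time.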

Concatenation, not multiplication, they just share an operator. It doesn’t make much sense to say a string can be “squared,” even if the underlying operator works.

1 Like

Another important consideration is invalidations.

Let’s try the untyped version first. We get invalidations. Recompilation will be forced.

julia> f(x) = x
f (generic function with 1 method)

julia> function g(v)
           s = 0
           for e in v
               s += f(e)
           end
           return s
       end
g (generic function with 1 method)

julia> g(Number[5,3,2,1])
11

julia> using SnoopCompile, AbstractTrees # Use SnoopCompileCore in a real scenario

julia> invalidations = @snoopr begin
           f(::Float64) = 2.0
       end
6-element Vector{Any}:
  MethodInstance for g(::Vector{Number})
 1
  MethodInstance for f(::Number)
  "jl_method_table_insert"
  f(::Float64) @ Main REPL[5]:2
  "jl_method_table_insert"

julia> trees = invalidation_trees(invalidations)
1-element Vector{SnoopCompile.MethodInvalidations}:
 inserting f(::Float64) @ Main REPL[5]:2 invalidated:
   backedges: 1: superseding f(x) @ Main REPL[1]:1 with MethodInstance for f(::Number) (1 children)

julia> print_tree(trees[1].backedges)
InstanceNode[MethodInstance for f(::Number) at depth 0 with 1 children]
└─ MethodInstance for f(::Number) at depth 0 with 1 children
   └─ MethodInstance for g(::Vector{Number}) at depth 1 with 0 children

Now let’s try a typed version. No invalidations, no recompilation!

julia> f(x::Int) = x
f (generic function with 1 method)

julia> function g(v)
           v2::Vector{Int} = v
           s = 0
           for e in v2
               s += f(e)
           end
           return s
       end
g (generic function with 1 method)

julia> g(Number[5,3,2,1])
11

julia> using SnoopCompile

julia> invalidations = @snoopr begin
           f(::Float64) = 2.0
       end
Any[] # no invalidations, no recompilation!
9 Likes

Though if it was important to restrict the iterable to a Vector{Int}, it’d be more practical to restrict the method to g(v::Vector{Int}) and do the eltype conversion separately. After all, Number elements can’t all be converted to Int, so the second version trades resistance to invalidations for being more prone to runtime errors. Broadly speaking, it is less flexible.
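A sketch of that alternative, with the conversion pulled out to the call site. `total` is an illustrative stand-in for `g`, and the sketch says nothing about invalidation behaviour; it only shows where the conversion error would surface.

```julia
# Method restricted to the element type it actually supports.
function total(v::Vector{Int})
    s = 0
    for e in v
        s += e
    end
    return s
end

mixed = Number[5, 3, 2, 1]

# The eltype conversion is an explicit, separate step at the call site.
ints = convert(Vector{Int}, mixed)
println(total(ints))          # 11

# A vector that cannot be converted fails at the conversion,
# before `total` is ever entered.
try
    convert(Vector{Int}, Number[1.5, 2])
catch err
    println(typeof(err))      # InexactError
end
```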

Relevant PR: improve return type inference for `string` by nsajko · Pull Request #52806 · JuliaLang/julia · GitHub

Relevant Discourse topic (potential Julia bug?): Why is `Base.return_types(String, Tuple{AbstractVector{UInt8}})` so pessimistic for this method?

@joa-quim @Benny

FWIW, when I’m creating code for work – not library development – I type-annotate almost every argument in my methods where it is feasible. I strongly prefer method errors that tell me the expected and actual types at the level of my code to more cryptic errors further down the call stack.

I mention this because, when I first started using Julia, I read that I “shouldn’t over annotate” and took that at face value. I think there are plenty of good reasons to reduce the generality of your code. Just not if you are trying to share it with others.
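To illustrate the difference in where the error points, here is a small sketch. `area_annotated`, `area_generic` and the helper `failing_function` are hypothetical names; the helper just reports which function a `MethodError` complains about.

```julia
area_annotated(w::Real, h::Real) = w * h
area_generic(w, h) = w * h

# Illustrative helper: return the function named in the MethodError, if any.
function failing_function(f, args...)
    try
        f(args...)
        return nothing
    catch err
        err isa MethodError || rethrow()
        return err.f
    end
end

# Annotated: the error names my own function and its expected types.
println(failing_function(area_annotated, "3", 4))  # area_annotated

# Unannotated: the error comes from `*`, one level further down.
println(failing_function(area_generic, "3", 4))    # *
```

In a real code base the unannotated failure can sit many calls deeper than one level, which is the situation described above.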

17 Likes

Yeah, the problem with “don’t over-annotate” is that it doesn’t quite get across where the lines are. If you need your method, or even the whole function, to only be dispatched on particular argument types, annotating is the appropriate measure. The tip is intended to inform people that restricting dispatch with argument annotations often doesn’t help the compiler optimize a call. Specifying types for fields, parameters, variables, or right-hand expressions actually does things that help (or hurt) compiler optimizations. It’s probably fair to say that one tip is no substitute for learning the language, but it’s also probably possible to make a beginner’s tutorial that lays out in one small place how types are communicated to the compiler. Toy example code in articles intended as highlights rather than teaching material tends to leave out annotations entirely, which really misleads people into thinking Julia is an alternate Python.
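As one example of an annotation that does change what the compiler knows, here is a sketch comparing field annotations. The struct names are hypothetical, and `double` deliberately carries no argument annotation.

```julia
struct LooseBox            # field is implicitly `::Any`
    x
end

struct TightBox            # concrete field type
    x::Float64
end

struct ParamBox{T<:Real}   # concrete per instance, still generic
    x::T
end

double(b) = 2 * b.x        # no argument annotation at all

# Ask inference what `double` returns for each argument type.
for T in (LooseBox, TightBox, ParamBox{Float64})
    println(T, " => ", only(Base.return_types(double, (T,))))
end
```

With the untyped field the result cannot be inferred concretely, whereas the annotated and parametric fields both let inference arrive at `Float64`, without any annotation on `double` itself.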

11 Likes

That would be a fantastic idea. I think Julia uses types in such a unique way that it takes some time to get used to them. Type annotations in other programming languages are mandatory or useful, but never detrimental as they can be in Julia. In the future, there might even be exercises similar to these Python type exercises.

1 Like

0.- Morning. After a year of learning and working with “Julia” every day, I am still a “newcomer” at this point, and I am afraid I will remain one for some years more…

As a “newcomer” I wonder…

Q1.- Could it be that we believe explicit type annotations on function parameters give better performance because we can find this statement in the manual:

“The Julia compiler is able to generate efficient code in the presence of Union types with a small number of types [1], by generating specialized code in separate branches for each possible type.”

Types · The Julia Language

Q2.- Could it be that in many situations we need to be sure which types Julia is using in every operation inside the function? Because we always need to be certain of the results of the calculations.

An easy example:

a = 123456789123
println("is ", a, " and type of a =       ", typeof(a))
r2 = a*a
println("is ", r2, " and type of a*a =     ", typeof(r2))
r3 = a*a*a
println("is ", r3, " and type of a*a*a =   ", typeof(r3))
r4 = a*a*a*a
println("is ", r4, " and type of a*a*a*a = ", typeof(r4))

The result here is an overflow error (a silent one, in fact, without any notice) because Julia continues using the same “default type”.

is 123456789123 and type of a =       Int64
is 4568175676801474313 and type of a*a =     Int64
is -8268922784655733861 and type of a*a*a =   Int64
is -8138388883848516015 and type of a*a*a*a = Int64
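For reference, a sketch of some Base tools for this situation, none of which involve annotating function arguments: widening the value, switching to `BigInt`, or using checked arithmetic so that the overflow raises an error instead of wrapping.

```julia
a = 123456789123

wrapped = a * a                 # Int64 arithmetic wraps around silently
exact   = big(a) * big(a)       # BigInt: arbitrary precision
wide    = widen(a) * widen(a)   # Int128: enough room for this product

println(wrapped)                # 4568175676801474313, the wrapped value
println(exact)                  # 15241578780560891109129
println(wide == exact)          # true

# Checked arithmetic turns the silent wrap-around into an error.
try
    Base.Checked.checked_mul(a, a)
catch err
    println(typeof(err))        # OverflowError
end
```

Which of these is appropriate depends on the calculation; the choice is about the values being computed, so it is made where the numbers are created rather than in method signatures.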

Assertion.- You should be careful with the recommendations, because you do not know all the calculation scenarios in which we are working. We may be “newcomers” to Julia, but not to the details of scientific computing.

Thanks anyway.

1 Like

I apologise if the “newcomer” tag offended you; I was including myself in it, by the way.

Once again, I do not assume people are stupid, and I agree that type assertions are a useful part of the language. I was just noting that people tend to use them with wrong assumptions about what they will do. Those are the people I tagged as “newcomers”, maybe a bit too quickly indeed.

I think this sentence is referring to the ability of Julia to split unions, which has nothing to do with what we are talking about here. There is indeed a potential for confusion there; maybe we could fix the docs.
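To show what that manual sentence is about, here is a sketch of a small `Union` return type being handled by the compiler. The function names are illustrative, and no argument annotations are involved.

```julia
# Returns either an index (Int) or `nothing`: a small Union return type.
function find_index(v, target)
    for i in eachindex(v)
        v[i] == target && return i
    end
    return nothing
end

# After the `=== nothing` check, the compiler knows `i` is an Int
# in the other branch, so the result is inferred concretely.
function value_after(v, target)
    i = find_index(v, target)
    return i === nothing ? 0 : v[i] + 1
end

println(only(Base.return_types(find_index, (Vector{Int}, Int))))   # Union{Nothing, Int64}
println(only(Base.return_types(value_after, (Vector{Int}, Int))))  # Int64
println(value_after([10, 20, 30], 20))                             # 21
```

The union arises from the function body, and the branch on `nothing` is what lets the compiler treat each case separately.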

I think the thing with type assertions is mostly just that it’s a bit of a Pascal’s wager, but for performance. Someone who is new to the language often knows that types are an important part of how Julia is able to be fast, and they’ll also know that there are certain circumstances (e.g. structs or type instabilities) where a type annotation can have a large beneficial performance impact.

Hence, if the user doesn’t know precisely in which circumstances type annotations help and in which they are superfluous, it’s pretty natural to just respond with “well, I’ll just put type annotations everywhere, it’s not that hard to do and maybe it’ll speed things up”.

23 Likes

Don’t the latest developments in Cthulhu.jl and its integration into the VS Code extension solve most of the problems newcomers face with Julia types?

7 Likes