As a beginner I’m slightly unsure of when I should be annotating my code with types. I’ve read the section on types in the Julia manual and I’m happy with the basics (type hierarchy, parametric types, avoiding abstract types, and the importance of type stability) but I’m still slightly unsure how I should be using type annotations in practice.
For instance, from the QuantEcon website:
You will notice that in the lecture notes we have never directly declared any types. This is intentional both for exposition and as a best practice for using packages (as opposed to writing new packages, where declaring these types is very important). It is also in contrast to some of the sample code you will see in other Julia sources, which you will need to be able to read. To give an example of the declaration of types, the following are equivalent. While declaring the types may be verbose, would it ever generate faster code? The answer is almost never.
function f(x, A)
    b = [5.0, 6.0]
    return A * x .+ b
end
val = f([0.1, 2.0], [1.0 2.0; 3.0 4.0])
function f2(x::Vector{Float64}, A::Matrix{Float64})::Vector{Float64}
    # argument and return types
    b::Vector{Float64} = [5.0, 6.0]
    return A * x .+ b
end
val = f2([0.1; 2.0], [1.0 2.0; 3.0 4.0])
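The quoted claim can be checked with a small sketch. The names `g` and `g_typed` below are hypothetical stand-ins for `f` and `f2`, and `@inferred` comes from the `Test` standard library. The idea, as far as I understand it, is that Julia compiles a specialized method for the concrete argument types at each call, so the unannotated version is inferred just as well as the annotated one.

```julia
using Test  # standard library, provides @inferred

# Hypothetical stand-ins for f and f2 from the quote above.
g(x, A) = A * x .+ [5.0, 6.0]
g_typed(x::Vector{Float64}, A::Matrix{Float64})::Vector{Float64} = A * x .+ [5.0, 6.0]

x = [0.1, 2.0]
A = [1.0 2.0; 3.0 4.0]

# Both versions compute the same result for the same inputs.
same = g(x, A) == g_typed(x, A)

# @inferred throws an error if the return type cannot be inferred
# from the argument types; it passes here without any annotations.
val = @inferred g(x, A)

# The unannotated version also accepts other element types,
# while the annotated one has no matching method for them.
val32 = g(Float32[0.1, 2.0], A)
typed_accepts_float32 = applicable(g_typed, Float32[0.1, 2.0], A)
```

The annotations here restrict which inputs are accepted; they are not what makes the code fast.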
However, I have seen other sources that suggest that annotating code with type information is the key to performance in Julia. For instance, I’m told that if I preallocate a vector I should always include type information.
I’m sure there’s a lot I’m not fully understanding, but if someone could provide some tips or general guidance on these issues, it would be very helpful for beginners like me.
Well, yes and no.
Yes, it is important to preallocate a vector with a concrete element type for performance.
However, you can do it without specifying the argument types in many cases, as there are helper functions:
similar(x) - creates an uninitialized array with the same type and the same dimensions as x
typeof(x) - returns the type of x, can be used as a constructor
eltype(x) - returns the type of elements in x
For example, these expressions are equivalent for an x::Vector{Float64}:
#1
y = Vector{Float64}(undef, length(x))
#2
y = typeof(x)(undef, length(x))
#3
y = similar(x)
#4
y = resize!(eltype(x)[], length(x))
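As a rough sketch of how this plays out inside a function: the hypothetical `scaled` below preallocates its output from the input itself, so the element type stays concrete without any concrete type being spelled out. Using `promote_type` to pick the output element type is my own assumption about a reasonable design, not something from the posts above.

```julia
# Hypothetical example: multiply every element of x by a,
# preallocating the output without naming a concrete type.
function scaled(x::AbstractVector, a::Number)
    T = promote_type(eltype(x), typeof(a))  # element type that can hold a * x[i]
    y = similar(x, T)                       # uninitialized, same shape as x
    for i in eachindex(x, y)
        y[i] = a * x[i]
    end
    return y
end

y64 = scaled([1.0, 2.0], 2)       # element type Float64
y32 = scaled(Float32[1, 2], 2)    # element type Float32
yint = scaled([1, 2], 0.5)        # Int input, Float64 output

# For contrast: an empty literal without a type is a Vector{Any},
# which is the kind of container the advice above warns about.
untyped = []
```

Every output array here has a concrete element type, which is what matters for performance; the function signature itself stays generic.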
In my personal opinion, it is good to use type annotations, but usually you shouldn’t specify concrete types. E.g., foo(x::Vector{Float64}) is typically not what you want, foo(x::AbstractVector{<:Real}) is more like it. And when the number of arguments is greater than 2, you can easily see why one is encouraged to not use type annotations at all.
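To make the difference concrete, here is a minimal sketch with two hypothetical functions, `total_concrete` and `total_generic`, that differ only in their signatures. `applicable` is a Base function that reports whether a matching method exists.

```julia
# Hypothetical functions: same body, different signatures.
total_concrete(x::Vector{Float64}) = sum(x)
total_generic(x::AbstractVector{<:Real}) = sum(x)

# The generic signature accepts many array-like inputs.
t1 = total_generic([1.0, 2.0])                  # Vector{Float64}
t2 = total_generic([1, 2, 3])                   # Vector{Int}
t3 = total_generic(1:3)                         # a range, no array allocated
t4 = total_generic(view([1.0, 2.0, 3.0], 1:2))  # a view into another array

# The concrete signature rejects everything except Vector{Float64}.
ok_range = applicable(total_concrete, 1:3)
ok_ints = applicable(total_concrete, [1, 2, 3])

# Note the `<:`: parametric types are invariant in Julia, so
# Vector{Float64} is not a subtype of AbstractVector{Real}.
invariant = Vector{Float64} <: AbstractVector{Real}
covariant_style = Vector{Float64} <: AbstractVector{<:Real}
```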
It is also worth noting that typing in Julia is not the same as typing in C++. In C++, foo(int n) happily accepts all types for n which it can convert into int (i.e., long, char, size_t, even float are all fine, although you may opt into having a warning for the last case). In Julia, foo(n::Int64) means only Int64s are allowed, no Int32s, BigInts, UInt8s etc. As such, annotating function arguments with concrete types might not even reflect the programmer’s intent properly.
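A small sketch of that strictness, using the hypothetical functions `half_strict` and `half_loose`:

```julia
# Hypothetical functions: one annotated with a concrete type,
# one with the abstract type Integer.
half_strict(n::Int64) = div(n, 2)
half_loose(n::Integer) = div(n, 2)

a = half_strict(Int64(10))   # works: the argument is exactly an Int64

# No implicit conversion happens: there is simply no method for Int32,
# so calling half_strict(Int32(10)) would throw a MethodError.
strict_takes_int32 = applicable(half_strict, Int32(10))

# The abstract annotation accepts any integer type.
b = half_loose(Int32(10))
c = half_loose(big(10))
d = half_loose(UInt8(10))

# Conversion has to be written out explicitly if that is the intent.
e = half_strict(Int64(Int32(10)))
```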
Side-note: as a Julia beginner I find the answers and examples given here extremely useful! I would love to see more of these kinds of questions and responses in the manual or FAQ.
A great first step would be for some intrepid member of the community to categorize all the questions on the forum each month, track the most frequently asked and make sure the best answers to those end up in a special doc.
This is one of the rare situations where I think my view is different to most people here, so follow my advice at your own risk.
I personally like to include type annotations on both the inputs and outputs to most of the code I write. This is for the simple reason that I find that when I come back and look at a piece of code three months later, the type annotations are very helpful in reminding my brain how a particular bit of code is structured.
It is worth emphasizing that a lot of the code I write is just for me, so keeping things as general as possible for the sake of others who might use my code is less of a priority. Having said that, I do have a few registered statistical packages, and even in those packages I always include annotations, albeit I’ve gone to some effort to make sure the annotations are as general as possible, and (for example) allow for things like Missing when working with number types.
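As an illustration of what a general annotation that allows for Missing might look like (a sketch only; `mean_skipping` is a hypothetical name and not taken from any of the packages mentioned):

```julia
# Hypothetical function: the annotation documents the expected input
# (a vector of real numbers, possibly with missing values) without
# tying the code to one concrete element type.
function mean_skipping(x::AbstractVector{<:Union{Missing, Real}})
    n = count(!ismissing, x)          # number of non-missing entries
    n == 0 && return missing          # nothing to average
    return sum(skipmissing(x)) / n
end

m1 = mean_skipping([1, missing, 3])   # Vector{Union{Missing, Int}}
m2 = mean_skipping([1.5, 2.5])        # plain Vector{Float64} also matches
m3 = mean_skipping([missing, missing])

# A vector of strings does not match the signature at all.
takes_strings = applicable(mean_skipping, ["a", "b"])
```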
One aspect I find relevant is to think about the way code will error given different inputs. Sometimes the author assumes inputs will be of a certain type, then I think it should be type annotated. Any other type working with that function would be accidental, and errors with other types could possibly happen way later in the stack. This can lead to frustrating bugs.
When someone writes a generic function, I then think it’s again important to check the generic assumptions with meaningful error messages if they aren’t met. For example, checking whether some type can be iterated, whether its length is known, or whether it is a bits type. Those can be traits sometimes.
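A minimal sketch of such checks, assuming a hypothetical function `first_and_length` that needs all three properties. `Base.IteratorSize` is the iteration trait from Base; the error messages and the `NotIterable` type are made up for illustration.

```julia
# Hypothetical type with no iterate method, used to trigger the first check.
struct NotIterable end

function first_and_length(itr)
    # Assumption 1: the argument supports iteration.
    applicable(iterate, itr) ||
        throw(ArgumentError("expected an iterable, got a $(typeof(itr))"))
    # Assumption 2: the length is known up front (checked via a trait).
    Base.IteratorSize(itr) isa Union{Base.HasLength, Base.HasShape} ||
        throw(ArgumentError("expected an iterator with known length, got a $(typeof(itr))"))
    # Assumption 3: the elements are plain bits types.
    isbitstype(eltype(itr)) ||
        throw(ArgumentError("expected bits-type elements, got $(eltype(itr))"))
    return (first(itr), length(itr))
end

# Helper for the demonstration: returns the error message, or nothing on success.
function error_message(f, args...)
    try
        f(args...)
        return nothing
    catch err
        err isa ArgumentError || rethrow()
        return err.msg
    end
end

ok = first_and_length([1.0, 2.0, 3.0])
msg_iter = error_message(first_and_length, NotIterable())
msg_len = error_message(first_and_length, Iterators.filter(isodd, 1:3))
msg_bits = error_message(first_and_length, ["a", "b"])
```

Each violated assumption fails at the top of the function with a message naming the problem, rather than surfacing later as a MethodError somewhere deeper in the call stack.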
Anyway, my point is, the more generic a pipeline, and the fewer assumptions are checked in the code with meaningful explanatory error messages, the more likely you are to hit bugs deep down the stack without knowing that you violated an implicit assumption further up.