When a function requires a Vector that contains a concrete type, is it necessary to generate optimized code?

Suppose if you’re writing a function that will sort elements in a vector, perhaps insertion sort.

function insertion_sort!(unsorted_list)
    # Don't worry about implementation.
end 

Performance Tips:

In many languages with optional type declarations, adding declarations is the principal way to make code run faster. This is not the case in Julia. In Julia, the compiler generally knows the types of all function arguments, local variables, and expressions.

I interpret this to mean, provided we use concrete types, we don’t need type annotations or where T, because we’re telling Julia the element types directly.

So if I were to do this:

student_names = String["Ben","Wayne","Tom","Andy"]
numbers = Int32[43,12,25,74,37]
insertion_sort!(student_names)
insertion_sort!(numbers)
  1. Is it correct to say that triggering specialization is unnecessary?
  2. If I use where T, am I slowing down performance because I’m giving it information it already knows?
function insertion_sort!(unsorted_list::Vector{T}) where T
    # Don't worry about implementation.
end 
  1. As you can see, the insertion_sort!() that uses the where T has the benefit of limiting the function to a vector with elements of a specific type. The first version has no such limitation (I could pass a String instead of a Vector.

    If you wanted to limit the argument to just a Vector of a specific type, without bothering performance, what is the correct approach? Do I use a Docstring, and if you fail to follow the required arguments, it’s the problem of the caller?

    Doing insertion_sort!(Vector{Int32}) or insertion_sort!(Vector{String}) isn’t ideal, you’ll be repeating a lot of code because the types are different.

1 Like

I think you are misunderstanding. In function arguments it’s not a case of type checking but dispatch. Julia needs to choose and compile a specific function whether you tell it to explicitly or not.

function f(x::T) where {T<:Any}, function f(x::T) where {T}, function f(x::Any) and function f(x) are equivalent to each other. Specifically, there is no requirement whatsoever for T to be a concrete type – try f(x::AbstractArray{T}) where {T} = println(x); f([1, "a"]).

What the docs mean here is that there is no advantage to

f(x::String) = println(x)
f(x::Int64) = println(x)

over

f(x) = println(x)

, it only clutters the dispatch table meaning a performance loss.

In Julian philosophy, you should try to restrict the arguments as little as possible: we are not trying to limit which objects a function can be used on, we are specifying a type that encompasses everything for which a specific method is reasonable.

For instance, there’s no reason to not let the user try to sort ["a", 3], the compiler will be very helpful in letting them know there is no method matching isless(::String, ::Int64).

If you need to check that an array is uniform, there is isconcretetype(eltype(x)), but I doubt you will need this much.

3 Likes

Does it? Until the method for each type is not called, they are not compiled anyway. When they are, they go to the dispatch table. Does it make any difference defining them or not? The only additional clutter I can see is on the code itself.

2 Likes

Yeah, you’re right. Then I don’t know what the docs mean. Interpreting the extra code can hardly be a measurable cost.

1 Like

My problem stems from the fact I’m attempting to accomplish two things:

  1. Write efficient code.
  2. Limit the argument types that can be passed, so we don’t run into an unexpected error.

From my understanding, using type annotations isn’t necessary unless you have a good reason to do so (limiting the argument types in a function doesn’t seem like it’s a good reason).

From what I understand the efficient way to write it would be:

function insertion_sort!(unsorted_list)

and let Julia handle any potential errors from incorrect argument types, correct?

The efficiency is not related to the type annotations, one way or the other.

What would be the efficient way to write a function that accepts a vector as an argument?

This is the overall question I’m asking.

function foo(x::AbstractVector)
    # ...
end

That’s the way I normally would write it, but what’s confusing is in the documentation, it says type annotations are unnecessary when you need to write fast code. There is always a back and forth over when to use it.

In many languages with optional type declarations, adding declarations is the principal way to make code run faster. This is not the case in Julia. In Julia, the compiler generally knows the types of all function arguments , local variables, and expressions.

Another section of the Performance Tips says it can hinder performance.

Is their a general guideline to follow when you should use type annotations? Is it better to let Julia throw an error when the incorrect type is passed?

Yes, the type annotation in my example isn’t for efficiency, it is to fulfill this requirement:

of your question.

Type parameters for function signatures are just used for dispatching to the correct method, e.g. you might have one method for vectors and one for numbers:

foo(x::AbstractVector) = # Something with x as a vector
foo(x::Number) = # Something with x as a number

If you only have one method you can of course leave them out. But if you know that you require the input to be a vector, then using ::AbstractVector is good. Users will then get a MethodError when they try to call your function with something else. It also doubles as documentation, it is becomes very clear what the expected input types are when reading the code.

1 Like

If we in fact have only one function, would using the type annotation for the function argument (as you’ve shown) cause a slow down or hinder performance (to use the documentations words)?

Would it be better to use a DocString?

No, the documentation doesn’t talk about type annotations on function signatures there.

Why not both? There is no reason to let numbers through if you know the method will only work for vectors, for example.

If you mean

Type annotation will not enhance (and can actually hinder) performance if the type is abstract, or constructed at run-time. This is because the compiler cannot use the annotation to specialize the subsequent code, and the type-check itself takes time.

That’s not the same thing. They are talking about annotating objects outside an argument definition, with an abstract type, so

function f(x::Vector{Any}) 
    g(x[1]::Integer) # this line
end

Here, Integer is an abstract type, so the annotation doesn’t help specialize the code. But the compiler still has to spend time verifying that it holds, so it’s a hindrance.

1 Like

Would this hinder performance?

function  main()
    # Note the type annotation and concrete declaration of a vector.
    numbers :: AbstractVector = Int32[1,2,3,4,5]
end 

So writing

insertion_sort!(unsorted_list::AbstractVector)

even if it’s only one fuction is the cleanest way to write that function? I’m not hindering performance, correct? So it now becomes, as stated above, documentation.

Yes, but not very much. Type checking is not that expensive.

More importantly, it doesn’t do anything.

Yeah, that looks fine.

1 Like