Does a function return type of `::AbstractArray{AbstractString}` cause a performance penalty?

For clarity, I sometimes annotated return types.

However, I stopped when I noticed the following.

Given:

foo()::AbstractVector{AbstractString} = ["hello"]
foo2() = ["hello"]

fii(v) = println(eltype(v))
fii2(v) = @inbounds v[1]*v[1]

then:

julia> fii(foo())
AbstractString

julia> fii(foo2())
String

and

julia> using BenchmarkTools

julia> @btime fii2(foo())
198.081 ns (3 allocations: 160 bytes)
"hellohello"

julia> @btime fii2(foo2())
82.447 ns (2 allocations: 96 bytes)
"hellohello"

It seems that return type annotation can affect type information propagation and induce runtime penalties. Am I wrong?

(I am using Julia v1.7.1)

You can change the code to

foo()::AbstractVector{String} = ["hello"]

to match the performance of the un-annotated version. Your original code converts an array with the concrete element type String into an array with the abstract element type AbstractString, which causes the performance penalty. So a return type annotation is sometimes not just an “annotation”: it can trigger an implicit conversion, and a type error is only produced if that conversion is impossible.
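
To make the implicit conversion concrete, here is a minimal sketch of roughly what the annotation amounts to: a `convert` followed by a type assertion. The names `annotated` and `manual` are hypothetical and only used for illustration.

```julia
# Hypothetical names (annotated, manual), not from the original post.
annotated()::AbstractVector{AbstractString} = ["hello"]

# Roughly what the annotation does: convert the result, then assert its type.
function manual()
    v = ["hello"]    # a Vector{String}
    return convert(AbstractVector{AbstractString}, v)::AbstractVector{AbstractString}
end

typeof(annotated()) == Vector{AbstractString}   # true
typeof(manual()) == Vector{AbstractString}      # true
```

In both cases the caller receives a freshly allocated vector whose element type is abstract.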

@greatpet Yes, I understand. However, I generally used return type annotations when I wanted to clarify my “interfaces”. In that context, I didn’t want to prematurely restrict the function to returning a vector of String.

Maybe:

foo3()::Vector{T} where {T<:AbstractString} = ["hello"]

would express the interface more precisely (and maintain performance).

That’s a good idea.

foo4()::AbstractVector{T} where {T<:AbstractString} = ["hello"]

also seems to work without penalty (at least for this example)

julia> @btime fii2(foo3())
81.278 ns (2 allocations: 96 bytes)
"hellohello"

julia> @btime fii2(foo4())
81.377 ns (2 allocations: 96 bytes)
"hellohello"

julia> typeof(foo4())
Vector{String} (alias for Array{String, 1})

That’s right. Converting from Vector to AbstractVector has no actual effect, so the annotation becomes just a type assertion.
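
A quick way to check this, as a sketch using only `convert` and `===` (the variable name `v` is just for illustration): when the value is already an instance of the target type, `convert` hands back the very same object, and only otherwise allocates a new container.

```julia
v = ["hello"]    # a Vector{String}

# v is already an instance of these types, so convert returns v itself.
convert(AbstractVector{String}, v) === v              # true
convert(AbstractVector{<:AbstractString}, v) === v    # true

# v is not an AbstractVector{AbstractString}, so a new container is built.
convert(AbstractVector{AbstractString}, v) === v      # false
```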

That’s not really what return-type annotations are for. Just use comments or docstrings.

I disagree on this. Comments are not checked; type annotations are.

I was responding to a comment about using return-type declarations to “clarify”, i.e. document.

And return-type annotations are “enforced” by calling convert in Julia, which means that they aren’t necessarily a validation mechanism — they can hide a type-instability.
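
As a small illustration of how the implicit `convert` can mask an instability, consider this sketch. The functions `unstable` and `lossy` are hypothetical examples, not code from this thread.

```julia
# Hypothetical function: the body yields an Int or a Float64 depending on x.
unstable(x)::Int = x > 0 ? 1 : 2.0

unstable(1)     # 1
unstable(-1)    # 2, the Float64 2.0 was silently converted to an Int

# The annotation only raises an error when the conversion itself fails.
lossy()::Int = 2.5
# Calling lossy() throws an InexactError.
```

The caller always sees an `Int`, so nothing signals that the body itself computes values of two different types.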

Given that AbstractString is an abstract type, what does it mean for a String to be converted to AbstractString? Presumably in memory these have the same representation. Surely it is more likely that the type specification of AbstractString prevents some compiler optimizations - perhaps function call inlining? (On the basis that AbstractString could have multiple implementations and therefore there is not a single function (memory address) which can be guaranteed to exist.)

Note that this is specifically about vectors of AbstractString — and specifically Vector{AbstractString}. This is a very different object than a Vector{String}, even if the contents are the same. In particular, you can change the contents of a Vector{AbstractString} to include non-Strings, but you cannot do that with a Vector{String}. The latter can only contain Strings.

The vector itself needs to know about and handle that possibility. That’s why the convert isn’t a no-op: the container itself must be different. It’s also why downstream uses aren’t as optimized: any code that retrieves elements from it must be prepared to handle any AbstractString.

This is different than annotating f()::AbstractString = "hello" — that will be a no-op and won’t pessimize any code. In fact, there’s no object that is just an AbstractString — everything in Julia is a concrete type. You can, however, have a container like a vector that is allowed to hold any subtype.

That’s what’s happening here.
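
The distinction between annotating a scalar and annotating a container can be checked with a few reflection calls. This is a minimal sketch; `f` is the function from the paragraph above, everything else is standard Base.

```julia
isconcretetype(String)            # true
isconcretetype(AbstractString)    # false

# Annotating a scalar return with an abstract type changes nothing:
# the value is already an AbstractString, so it is returned as-is.
f()::AbstractString = "hello"
typeof(f()) == String             # true

# For containers, the element type is what downstream code gets to rely on.
isconcretetype(eltype(["hello"]))                  # true
isconcretetype(eltype(AbstractString["hello"]))    # false
```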

You’re misinterpreting the quote. This is about conversion of array types, the strings are just elements of the array.

I’ve realized what I wrote was poorly worded. This is what I meant by

… meaning Vector{String} not Vector{AbstractString}.

Basically - it’s to do with dispatching functions. If the whole vector contains the same type there is no lookup at runtime required to dispatch the correct method.

I see - so no actual conversion is being done?

Or at least, ["hello"] which is a Vector{String} is converted to a Vector{AbstractString} but the elements are just memcopy’d, and they (the elements) remain at runtime, still of type String.

There will never exist some object x in Julia for which typeof(x) === AbstractString. Full stop. Doesn’t exist. Can’t exist.

When you convert a Vector{String} to a Vector{AbstractString}, you’re creating a new container. It has exactly the same elements as the old container. Those elements have exactly the same types; they’re still all String. But that new container has a new behavior: you can store things like SubStrings in it now, too.
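
A short sketch of that behavior, with `vs` and `va` as illustrative variable names:

```julia
vs = ["hello"]                              # a Vector{String}
va = convert(Vector{AbstractString}, vs)    # a new container

va !== vs                     # true, a different object
all(x -> x isa String, va)    # true, the elements are still Strings

# The abstract container stores a SubString as-is.
push!(va, SubString("world", 1, 3))
typeof(va[2]) == SubString{String}    # true

# The concrete container converts the SubString to a String on insertion.
push!(vs, SubString("world", 1, 3))
typeof(vs[2]) == String               # true
```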

Yes… I think you’re just agreeing with me at this point…

See here …

This.