Is there a reason ***not*** to explicitly define return types?

Allow me to summarize the findings of my research so far:

This is true, however I do not see that this is a reason to prefer not to explicitly define the return type for a function.

Actually, this is not quite correct. The above linked post specifies the type of the return value incorrectly, and it is this incorrect annotation that triggers a type conversion and, with it, a performance penalty. If the return type is specified incorrectly, we should not be surprised if runtime performance degrades. From this, I concluded that this is not a reason to avoid specifying the return type.

This is true, however I don’t think that implies that the return type should not be specified. It is just a statement that Julia doesn’t dispatch on return types in the same way that it does dispatch on argument types.

Possible advantages of including the return type information:

  • Specifying the return type is a good way to provide documentation to users of your API (aka functions)

While it is possible to write the type of the return value in a docstring, this does not have the same effect of using the type system to specify it, which is harder to ignore.

A discussion about specifying the types of arguments to functions:

In contrast to the above, I do not think that the types of arguments to functions should be explicitly specified, because this limits the scope of application for your algorithm (aka function). (But this could be behavior you want - see below.)

There are two exceptions to the above, as far as I can see:

1

  • Multiple dispatch. If a function implementation should have different logic depending on the type of the argument(s), then the types need to be specified.
  • This is nothing new or special. It is simply a statement as to what methods are in Julia. It’s just the multiple dispatch system.

2

  • To restrict the range of types which can be used when calling a function
  • To provide documentation to the user of your API

It seems kind of obvious that a generic algorithm (function implementation) will not work for all types.

For example, some function which performs math operations (like taking the sin of something, or something involving multiplications) is unlikely to be valid when a DateTime type is passed in.

You might argue in order to increase robustness and make it easier for clients to understand how to use your API, the type should be documented. Just as before, it would be better to document this information in the type system rather than a documentation comment, because it is harder to ignore.
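To make the trade-off concrete, here is a minimal sketch. The names `scale_generic`, `scale_real` and `failing_function` are made up for illustration. Both versions reject a DateTime, but the untyped one fails somewhere inside the body, while the typed one fails at the call boundary and names itself in the error.

```julia
using Dates

# Hypothetical helpers, not taken from any library.
scale_generic(x) = 2 * sin(x)       # accepts anything, fails later inside sin
scale_real(x::Real) = 2 * sin(x)    # rejects non-Real arguments up front

# Report which function a failed call complained about.
function failing_function(f, arg)
    try
        f(arg)
        return nothing
    catch err
        return err isa MethodError ? err.f : rethrow()
    end
end

t = DateTime(2024, 1, 1)
failing_function(scale_generic, t)  # the MethodError points at sin
failing_function(scale_real, t)     # the MethodError points at scale_real
```

Whether the earlier, more specific error is worth the lost generality is the judgment call discussed above.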

This argument is very similar to the argument for specifying the return type.

Do consider there may be less obvious cases. For example, almost anything can be represented using a String type. You can serialize most things, and create most things by parsing a String. Things involving dates and durations can also be a bit tricky. What units of time should the duration be? Is it seconds? Something like CompoundPeriod or Period? If this code is close to some code which reads/writes to file, should the string representation of a duration be used?


Does anyone disagree with anything I have written here? I would be quite happy to be proven wrong if my reasoning is not sound…


It is strange that you propose to not specify input types and at the same time specify the output type.

How would you even define output type without knowing the inputs?

Or, do you propose to specify loose type boundaries? Like

square(A::AbstractMatrix)::AbstractMatrix = A * A
square(x::Number)::Number = x * x

In that case, I’m not sure it’s useful in the function header. In a docstring? Yes. In a doctest? Absolutely. Those are meant for humans. The header is for the compiler, and the compiler does not need that information.

Another reason to not specify return types is “opaque types”. Basically, you specify properties which the return type would satisfy, not the type itself. Like, do you need to know exactly what eachsplit returns?

You raise a good point. I should have been more explicit - let me edit my post because otherwise I agree it makes no sense.

I spoke too soon. I have two further points to make.

The point you raise I think is quite useful. I think this gives a concrete answer to the question.

  • If you specify the types of (all) the input arguments then you should specify the return value, because it can be only one thing, so you might as well document it for your users
  • If you specify the return value you must specify the return type

The second bullet is not quite right. By specifying the return type what you are doing is adding a constraint which says

whatever the input types are for this function, they must produce the type specified by the annotated return type after the operations specified by the logic of the function are completed

Consider a simple example.

function convertToString(x)::String
    return String(x)
end

It is a trivial example, but what it says is x can be anything, but after the logic contained inside convertToString is applied to x the return type must be a String.

So: you could pass any types to convertToString, and the specification of the return type will produce an error at runtime if the function doesn’t produce the correct type. This could be a useful error catching behavior just as limiting the possible function input types can be.

Feel free to tell me I am talking total nonsense if I am.

The return type syntax is a feature with a very specific behavior. It adds a convert and an assert to all returns from the method body. You can use it if that’s helpful for your methods! You can even choose to demand that folks use it in your codebases to maintain a consistent style. But you can also decide it’s not universally helpful, and just use it in some places. For some methods it might only serve as line noise that just gets in the way of the important things you want to pay attention to.
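A small sketch of the convert-plus-assert behavior described here. The functions `as_float` and `as_int` are hypothetical and exist only to show the effect.

```julia
# Hypothetical examples of what a return annotation does.
as_float(x)::Float64 = x    # behaves like convert(Float64, x)::Float64
as_int(x)::Int = x          # behaves like convert(Int, x)::Int

as_float(1)     # the Int 1 is converted, so the caller receives 1.0
as_int(2.0)     # 2.0 converts exactly, so the caller receives 2

# A value that cannot be converted makes the call throw instead of returning.
threw_inexact = try
    as_int(1.5)
    false
catch err
    err isa InexactError
end
```

Note that the first two calls silently change the value's type. The annotation is not only a check.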


It might be difficult to describe the return type as a function of the input types. Take the function,

f(x, y) = x * y

For numbers, the output will typically be a number, and for standard numbers it will have the type promote_typeof(x,y), so it can be described in the header. But, then f('a', 'b') is a String, and so is f("a", "b"). How all this can be specified, I don’t see. Such things can happen whenever the method signature contains abstract types. You may not know how to write down the type of the output. So, perhaps if the signature only has concrete types it would be possible to write down the output type in a concise manner.
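For illustration, the cases mentioned above can be checked directly. This is only a sketch: `Base.promote_typeof` is an unexported helper, and the matrix case is an extra example added here.

```julia
f(x, y) = x * y

typeof(f(2, 3.0))                   # Float64
Base.promote_typeof(2, 3.0)         # Float64, matches the result type
typeof(f('a', 'b'))                 # String, although both inputs are Char
typeof(f("a", "b"))                 # String
typeof(f([1 2; 3 4], [1.0, 2.0]))   # Vector{Float64}, neither input type
```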


This is technically impossible. We can document that a function should have a particular return type. We can annotate a method with a return type, but it only adds a convert and typeassert step, so if the type conversion fails, then we fail to return at all. We could annotate every method with the same return type, but nothing stops another method with a different or no annotation from being defined.
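A minimal sketch of that last point, using a made-up function `describe`. The annotation belongs to one method only, and a second method of the same function is free to return something else.

```julia
# Hypothetical function with two methods.
describe(x::Int)::String = string("int ", x)   # annotated method
describe(x::Float64) = x > 0                   # unannotated method, returns a Bool

describe(1)      # "int 1"
describe(2.0)    # true
```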

The reason return type annotations are used sparingly is how generic Julia is. All functions we can define are generic functions; the REPL even prints them that way. The most generic a return type annotation can get is to be a parameter that matches the annotated parameter of an argument, and that often isn’t flexible enough. More generic return types can instead be documented very loosely, with a description (“an iterable”) instead of a bona fide type.

A more useful case for designating a return type is for call signatures varying the callable, in other words various functions taking a particular set of concrete input types and returning the same concrete output type. There’s partial support for that in FunctionWrappers.jl, but there are some rough edges there.


Great feedback - thanks!


I may have been too concise previously.

What I wanted to emphasize is, if you write a generic function, and often you do, you (aim to) write it in such a way that it’s compatible with some types you know of, and hopefully with some types other people designed you don’t know of, and with some types not even written yet. As such, you may not know in advance what the return type would be and how it is derived from the input types.

Even in a simple case like square(x) = x * x, you may think that

function square(x::T)::T where {T}
    return x * x
end

would be a correct definition, but let’s try

julia> using Unitful

julia> function square(x::T)::T where {T}
           return x * x
       end;

julia> square(1u"m")
ERROR: DimensionError: m and 1 m^2 are not dimensionally compatible. # Whoops!

So, basically, specifying a return type only makes sense for methods with concrete input types. Once you go generic, there are simply too many edge cases.
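The same kind of failure can be reproduced without Unitful, using only Base types. This is a sketch under the assumption that Base defines no conversion from String to Char, which is the case as far as I know.

```julia
function square(x::T)::T where {T}
    return x * x
end

square(3)      # 9, Int * Int is an Int
square(1.5)    # 2.25

# Char * Char concatenates into a String, which cannot be converted back to Char,
# so the annotated method throws instead of returning "aa".
char_failed = try
    square('a')
    false
catch
    true
end
```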


I do agree with your points here. I also think it may be worth sharing a further thought I had which again compares two different contexts.

When you want to write some generic algorithm, you are often working with numerical types. Or at least, a group of closely related types which interop well together.

For example, integers and floating point numbers are “basically the same” if you are willing to accept a certain loss of precision in your calculations. I mean this in a very loose sense.

You may have algorithms which are generic across scalars, vectors, matrices, objects with even more abstract dimensions. All different types, but the algorithms are the same.

Contrast this to something which is much closer to what you might describe as systems programming. Here’s a random example, pulled from something I have been working on.

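using DataFrames  # third-party package providing DataFrame
using Dates       # standard library providing Date

struct ConfigurationFile
    df::DataFrame
    date::Date
end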

This type is not part of a collection of related types. It is a thing which exists entirely on its own.

It might have some functions to operate on the internal data. Possibly to serialize and deserialize, and such. In these cases the functions (“algorithms”) are (probably) not generalizable things. In some cases they might be.

I thought it was probably worth raising the distinction between these two “domains” of programming.

Certainly it is the case that specializing many mathematical functions to Float64 makes little sense.

You mean imprecisely?

A related topic:

Don’t want to be harsh, but it’s difficult to even extract meaning from your post. Too difficult to invest my time into it. It’s a bit like stream of consciousness literature; it seems like several (too many) statements and questions are hiding there, but I don’t think any of them are clearly and completely stated. If I were you I’d open a new topic, with a simpler OP, with just one or two clearly stated questions, then open more threads later as necessary.

The core of the issue here was already addressed by Benny, I think: the false premise of the idea of “explicitly define the return type of a function”. It’s just not possible to do this currently in Julia. (Although it might be kinda possible in the near future, I think, if method sealing/freezing becomes public. It’d allow one to prevent more methods being added to a function. A PR: Allow freezing of `Core.MethodTables` by fatteneder · Pull Request #56143 · JuliaLang/julia · GitHub)

Sure, in the domains where you work with “white-box” objects, specifying the return type adds some safety against unintentionally returning a wrong object. But maybe just a run of a static analysis tool like JET.jl before commit would suffice.
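As a lighter-weight relative of that idea, the return type can also be checked in a test instead of in the signature. The sketch below uses `Test.@inferred` from the standard library rather than JET.jl, and the functions `area` and `unstable` are made up for illustration.

```julia
using Test

area(r) = pi * r^2               # hypothetical, no annotations anywhere
unstable(x) = x > 0 ? 1 : 1.0    # returns an Int or a Float64 depending on the value

# Passes, and hands back the value, when the inferred and actual types agree.
result = @inferred area(2.0)

# Throws when inference can only narrow the result down to a Union.
inference_failed = try
    @inferred unstable(1)
    false
catch
    true
end
```

This keeps the method itself generic while still catching an accidental change of return type in CI.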

Judging by your other posts and reasoning, you come from a Python background? The OO paradigm there is more suitable for what you propose in this thread, as you can annotate not the concrete type but the supertype interface you care about. In Julia, interfaces are not tied to types or even to the type hierarchy. An example is the Tables.jl interface. Because of that, you cannot express satisfying that interface in the type domain (not that everyone is happy about it; you may find many topics here requesting formal interfaces).


To be fair, “prefer composition over inheritance” is quite influential as a guideline in the “object-oriented” world.

I meant that a Python type hint might specify an ABC, not the concrete object type, and a class may inherit from multiple ABCs (probably? not 100% sure).


No, it’s not correct, because the vector does not contain AbstractString. It contains String.

I am making this statement from the point of view of considering what is written to the computer memory. You can’t write an abstract type to memory.

This is why I say it is incorrect. The function returns a Vector containing only Strings. It doesn’t return a Vector containing multiple types, the union of which is represented by AbstractString.

(Don’t worry too much about English semantics here.)


We could consider a simpler example.

v::Vector{Union{Int64, Float32}} = []
push!(v, 1)
push!(v, Float32(1.1))

What does the compiler do when it writes this data to memory? It must do two things:

  • Decide on a static size for Union{Int64,Float32} so that it can do indexing in O(1) time. (memory_index = sizeof_element * element_index)
  • Write some additional information (as part of each element of the vector) to track what the type of each element in each index of the vector is at runtime.
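The runtime side of this can be observed from the REPL. The sketch below relies on `Base.isbitsunion`, an unexported helper. As far as I know, Julia stores such small unions inline and keeps the per-element type tags in a separate block of bytes rather than inside each element, but treat that layout detail as an assumption.

```julia
v = Union{Int64, Float32}[]
push!(v, 1)
push!(v, Float32(1.1))

eltype(v)                     # Union{Float32, Int64}, the declared element type
typeof(v[1])                  # Int64, recovered from the stored tag
typeof(v[2])                  # Float32
Base.isbitsunion(eltype(v))   # true, so elements can be stored inline
```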

If you know a bit of OOP, most OOP languages do something similar where they write some information (usually a pointer address) which references a vtable. The vtable exists to figure out how to handle function calls on the object.

I don’t think Julia uses vtables, because it does multiple dispatch, not single dispatch.


All of this is to say that if I only write elements of the type Float32 to v, then it is a mistake of the programmer to specify the type of v as Union{Int64, Float32}.

I know that was a bit of a tangent, but I think it’s useful to consider something about the structure of what the compiler actually writes into memory.

I was a C++ developer for a long time, then worked with Rust for a bit before gradually moving over to Python. I know a bit about the internals of CPython, but I would not say I am an expert by any means. I know more about Rust and C++.

I guess you are thinking something along the lines of using typeguard in Python to achieve a similar thing?

This is a good point. Perhaps annotating ::String as a return type makes little sense if there is an interface type for it, namely ::AbstractString.

To clarify one thing - the intention was not to use annotated return types to get different dispatch, but to enforce some kind of correctness while getting documentation for free. Not sure if you interpreted my post as the former rather than the latter?

Ah. I’ve realized there’s a problem with what I have suggested here.

It’s the same example as before.

If you annotate with an abstract type, because you want to return an interface, then you degrade performance.

function myFunction()::AbstractString
    s::String = "hello world"
    return s
end

# my_s is an AbstractString, not a String, so performance
# will degrade
my_s = myFunction()

julia> typeof(my_s)
String

… what, what?!

function myFunction2()::Vector{AbstractString}
    v::Vector{String} = ["hello", "world"]
    return v
end

my_v = myFunction2()

julia> typeof(my_v)
Vector{AbstractString}

What is going on here? In the first example, it seems like the compiler is smart enough to figure out that it should compile myFunction() for a return type of String, which is a subtype of the “interface type” AbstractString.

In the second example, it isn’t smart enough to figure out that it should compile myFunction2 for a return type of Vector{String}, instead it returns Vector{AbstractString} along with all the performance penalties of this.

  • Why does this happen?
  • Is it because Vector is mutable, whereas String is not?

I guess the compiler cannot guarantee that after returning Vector{String} there is not going to be a line of code like

# MyStringType<:AbstractString
my_string::MyStringType = MyStringType("weird encoding")
push!(v, my_string)

which follows afterwards.

I’m fairly confident this is related

julia> String<:AbstractString
true

julia> Vector{String}<:Vector{AbstractString}
false

Simply because you said so, i.e., the signature required it to return Vector{AbstractString}. The compiler will not try to narrow the type based on anything that could have happened afterwards.

If you instead define myFunction2()::Vector{<:AbstractString} the return type will simply be Vector{String}, as this is the concrete type matching the annotation. Again, the compiler does not care what you do to the value afterwards, e.g., push!(my_v, @view "Ha"[1:1]).
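Side by side, the two annotations behave like this. The names `strict` and `loose` are only for illustration.

```julia
function strict()::Vector{AbstractString}
    v::Vector{String} = ["hello", "world"]
    return v    # converted into a new Vector{AbstractString}
end

function loose()::Vector{<:AbstractString}
    v::Vector{String} = ["hello", "world"]
    return v    # already matches the annotation, returned unchanged
end

typeof(strict())    # Vector{AbstractString}
typeof(loose())     # Vector{String}
```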


Ah yes, that’s making sense.

I tried this too, but for some reason this doesn’t compile?

function myFunction3()::Vector{T} where T<:AbstractString
    v::Vector{String}=["hi","there"]
    return v
end

julia> v3 = myFunction3()
ERROR: UndefVarError: `T` not defined in static parameter matching
Suggestion: run Test.detect_unbound_args to detect method arguments that do not fully constrain a type parameter.
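My reading of this error, offered as a sketch rather than a definitive explanation: a trailing `where T` on a method definition introduces a static parameter of the method, and static parameters are determined from the argument types. With no arguments there is nothing to bind T to. The hypothetical `wrap` below shows a case where T is bound, and the second definition rewrites myFunction3 with the `Vector{<:AbstractString}` form from the reply above.

```julia
# T appears in the argument list, so it is bound at every call.
function wrap(x::T)::Vector{T} where T<:AbstractString
    return [x]
end

typeof(wrap("hi"))    # Vector{String}

# No static parameter needed: the annotation is a plain (non-concrete) type.
function myFunction3()::Vector{<:AbstractString}
    v::Vector{String} = ["hi", "there"]
    return v
end

typeof(myFunction3())    # Vector{String}
```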

Also, as you pointed out:

julia> Vector{String}<:Vector{<:AbstractString}
true