Semantics of :: in return type vs. argument type annotations

22 posts were split to a new topic: Why assignment operators return the right-hand-side

It makes sense that you can’t. Passing something into A’s constructor is not semantically equivalent to converting that thing to type A. Ref(1) makes sense, but convert(Ref, 1) doesn’t.

For example, this implementation introduces a conversion bug. x::A = 1 works, but now consider a subsequent x::A = x. convert shouldn’t do anything if the input is already the target type, hence the fallback method convert(::Type{T}, x::T) where {T} = x in julia/base/Base.jl. But your method forwarding to the constructor causes a method ambiguity with that fallback method. The fallback method stopping you from forwarding all converts to the type constructor is a good thing; if it wasn’t there, x::A = x would’ve resulted in A(A(1)).
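To make that concrete, here is a minimal sketch (with a hypothetical type `A`) of restricting the `convert` method so it doesn't collide with the identity fallback:

```julia
struct A
    x::Int
end

# Restrict the method to the inputs we actually mean to convert, so the
# identity fallback convert(::Type{T}, x::T) where {T} = x still handles
# values that are already an A:
Base.convert(::Type{A}, v::Int) = A(v)

a = convert(A, 1)   # our method: constructs A(1)
b = convert(A, a)   # identity fallback: returns a unchanged
```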

Right, that makes sense. My point was just that a Base.convert method needs to be defined for user-defined types, which puts them at a disadvantage relative to primitive types. My counterpoint was that it’s not (only) the conversion of Char to Int that’s weird to me about ::, but that a convert is called to begin with.

“Primitive types” like Char and Int are implemented entirely in Julia (with a couple exceptions like Array), so in that sense they don’t have any advantage over user-defined types except that a lot of code has already been written for them. There is no special compiler support for Char, unlike e.g. char in C++.

6 Likes

This is a trick that I never thought about. Sometimes I use the first option, when really I just want an assert. I think the fact that it is possible to switch between a convert and an assert just by moving the type annotation is kind of slick.
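For readers skimming the thread, the two placements of the annotation look like this (hypothetical function names):

```julia
# Annotation on the signature: the return value is passed through convert.
conv(x)::Int = x

# Annotation on the returned expression: a plain typeassert, no conversion.
asrt(x) = x::Int

conv(2.0)   # returns 2, converted from Float64
# asrt(2.0) # throws TypeError: expected Int64, got Float64
```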

An alternative way to create this struct is

struct B{T<:Int}
    b::T
end

Then it will error when presented with the wrong type

julia> B(2.3)
ERROR: MethodError: no method matching B(::Float64)

Closest candidates are:
  B(::T) where T<:Int64
   @ Main REPL[13]:2

So question. I have the functions f and g:

function f(x::Int64)::Float64 return x end
function g(x::Int64) return Base.convert(Float64, x)::Float64 end

g is doing a conversion before returning. Is f doing that too? Because that would be wild.

f lowers to:

CodeInfo(
1 ─ %1 = Base.convert
│   %2 = Main.Float64
│   %3 = (%1)(%2, x)
│   %4 = Core.typeassert(%3, Main.Float64)
└──      return %4
)

g lowers to:

CodeInfo(
1 ─ %1 = Base.convert
│   %2 = Main.Float64
│   %3 = (%1)(%2, x)
│   %4 = Core.typeassert(%3, Main.Float64)
└──      return %4
)
1 Like

Yep. That’s why I don’t care for it!

I would like the type annotation to cause the third of these to throw an error.

julia> g(a)::Int = a
g (generic function with 1 method)

julia> g(12)
12

julia> g('a')
97

The way this does:

julia> f(a::Integer) = a
f (generic function with 1 method)

julia> f(1)
1

julia> f('a')
MethodError: no method matching f(::Char)

This is especially acute because there are no limits whatsoever on what an overload of convert can do.
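Since any convert overload is legal, a sketch with a hypothetical `Celsius` type shows how far an overload can silently take a converting return annotation:

```julia
struct Celsius
    deg::Float64
end

# A perfectly legal, lossy overload (hypothetical):
Base.convert(::Type{Int}, c::Celsius) = round(Int, c.deg)

h(x)::Int = x        # reads like a type restriction...
h(Celsius(36.6))     # ...but happily returns 37, no error
```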

1 Like

I suppose I would ask, is it necessary to make a breaking change when it is easy to produce the desired behavior with a simple change like

julia> g(a) = a::Int
g (generic function with 1 method)

julia> g('c')
ERROR: TypeError: in typeassert, expected Int64, got a value of type Char

?

Edit: Perhaps what is really necessary is just to update the manual with more examples of how to get the expected return types in different situations?

Necessary? No. I think it would improve the language, but the decision that core made is defensible; I don’t see it as obviously broken. I do have stronger feelings about Char being convertible to integer types; I would like to see that removed.

The change you describe isn’t simple in the general case, however. I have a function in the package I’m working on with nine return statements, all of the same type. As it happens, this is in one of the very many cases where the difference between a typeassert return signature and a convert return signature won’t be apparent, but annotating all eighteen return values with an assertion would be tedious.

One feature of function signatures is that you can read one, ignoring the body of the function, and learn something about the function. And to be fair to Julia, the current behavior is also informative, just a bit harder to follow, because the ::Type annotations inside the parentheses mean something different from the one to the right of them.

1 Like

I was hoping for something like this

function f(x) -> Float32
    if (rand() > 0.5)
        return x
    else
        return (x - 1)
    end
end

#  as an equivalent to the current syntax of adding a type assert on all possible return values
function f(x)
    if (rand() > 0.5)
        return x::Float32
    else
        return (x - 1)::Float32
    end
end

That syntax wouldn’t work with closures; you’d have

(a) -> Float32  # This returns the type Float32
(a) -> Float32 -> a # This asserts a::Float32 and returns a

@goto can do this with only modest boilerplate:

function f(x)
    if (rand() > 0.5)
        returnval = x
        @goto returnassert
    else
        returnval = (x - 1)
        @goto returnassert
    end
    @label returnassert
    return returnval::Float32
end

A skilled metaprogrammer could probably even automate this transformation via a macro. Or (maybe more simply) just have the macro move the “typeassert” from the function definition line to every return statement.

2 Likes

An issue with return statements is that they are not always needed, and I don’t see how a macro could easily determine all possible return values of a function when not all of them are associated with an explicit return. For a macro-based approach, I’d rather use an approach in which a function definition like this:

@returnassert function foo(x, y) :: Float64
    if (rand() > 0.5)
        return x + y
    end

    # no explicit return here
    x - y
end

is transformed into something like this:

function _inner_foo(x, y)
    if (rand() > 0.5)
        return x + y
    end

    # no explicit return here
    x - y
end

foo(x,y) = _inner_foo(x,y) :: Float64

A proof-of-concept should not be too hard to write. The following implementation does not handle keyword arguments but should otherwise more or less get the job done:

PoC code
using MacroTools

macro returnassert(defun)
    inner = splitdef(defun)

    name   = inner[:name]
    args   = inner[:args]
    rtype  = inner[:rtype]

    inner[:name] = gensym(name)
    delete!(inner, :rtype)

    wrapper = Dict(
        :name => name,
        :args => args,
        :kwargs => inner[:kwargs],
    )

    wrapper[:body] = quote
        $(inner[:name])($(args...)) :: $rtype
    end

    quote
        $(combinedef(inner))
        $(combinedef(wrapper))
    end |> esc
end
julia> @returnassert function foo(x, y) :: Float64
           if (rand() > 0.5)
               return x + y
           end
       
           # no explicit return here
           x - y
       end
foo (generic function with 1 method)

julia> foo(1.0, 2)
3.0

julia> foo(1, 2)
ERROR: TypeError: in typeassert, expected Float64, got a value of type Int64
Stacktrace:
 [1] foo(x::Int64, y::Int64)
   @ Main ./REPL[2]:18
 [2] top-level scope
   @ REPL[6]:1
2 Likes

Returning to this thread to report that I ran into a bug in the wild which an asserting return would have caught. I’m working on a VM, and the stack frame has a field which was originally UInt16. Later, I was doing some struct packing, and I realized there are some realistic if uncommon circumstances where that value might be exceeded, and besides, alignment meant I had 16 free bits which I may as well give to that field.

But I’d annotated the return value of one of the helper functions as ::UInt16, which, since return values are converting, raised no error. This was probably written before I knew that was the semantics, which I figured out before this thread but not by more than a week or three. The code was just truncating a UInt32 to UInt16, then widening it again when it was assigned to a field. It was only several weeks later that I wrote some benchmark code which actually exceeds typemax(UInt16) for that field, and got the InexactError from that function.
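A reduced sketch of that failure mode (hypothetical names; UInt16 stands in for the original field type):

```julia
# The converting return annotation narrows quietly:
frame_slot(x::UInt32)::UInt16 = x

frame_slot(UInt32(7))          # 0x0007 — silently narrowed, no complaint
# frame_slot(UInt32(70_000))   # InexactError, but only once a large value arrives
```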

There’s no need to go over how to do it properly; after this thread I’m quite a bit more careful about the assert/convert distinction in my code. But this points to the disconnect I see, which is that a return value doesn’t have a shape; what has a shape is where it’s assigned. Refactoring is a classic case where I want the type system to assist me in changing code downstream of the change, and it’s too bad that return values convert, because an assertion would have caught this immediately.

I reckon most of what could be said here has been said already, I’m just adding a report from the field, and I hope that if and when 2.0 season comes around, the behavior of return value declarations might be revisited.

3 Likes

If no return type had been indicated, it seems that the bug would not have occurred.

Frankly, I see neither the return type conversion nor a type assertion within the function as the solution here. The reason I like neither is that both incur a runtime cost and neither does what I ultimately want: check the code.

Rather than using runtime features in a dynamic language, what it sounds like you actually want to check is type inference, presumably from concrete input types. I perform this kind of analysis at test time or precompilation time.

# function declaration
f(x::UInt32) = x

# module evaluation time checks
let f_return_type = UInt16
    f(UInt32(9))::f_return_type
    Base.return_types(f, (UInt32,)) |>
        unique |>
        only == f_return_type  ||
        error("Unexpected return type")
end
2 Likes

Well, it’s not this simple because a method can have multiple exit points marked by return in addition to the last expression. The convert+typeassert behavior taking up the function syntax’s return type slot is privileged to apply to all exit points. That said, it’s possible for a macro to transform a return type annotation to typeassert-annotations of all exit points.

That’s only if the return expression is guaranteed not to be UInt16 and the compiler can infer this. Otherwise the type check happens at runtime and the error depends on unlucky input values. Both convert+typeassert and typeassert-only annotations stop the return expression’s possible concrete types from leaving the method, getting in the way of tests like mkitti’s example.

Agreed that annotations don’t amount to static or compile-time tests and can incur runtime costs. But if the code were type-stable enough for an expression to be inferred as one concrete type, wouldn’t the typeassert be compiled away to either nothing or a guaranteed error at runtime?
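A quick check consistent with that: when inference succeeds, the asserted and plain versions report the same concrete return type (a sketch; Base.return_types is reflection, not a proof of elision):

```julia
fa(x::Int) = (x + 1)::Int   # with typeassert
fb(x::Int) = x + 1          # without

# Both infer to the concrete type Int, so the runtime check in fa
# adds no information the compiler doesn't already have:
Base.return_types(fa, (Int,))   # one-element vector containing Int64
Base.return_types(fb, (Int,))   # same
```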

I find that return type assertions just hide problems when my code is actually type unstable.

julia> f(v::AbstractVector) = begin
           v[1]::UInt8
       end
f (generic function with 1 method)

julia> Base.return_types(f, (Vector{Union{UInt8,UInt16}},))
1-element Vector{Any}:
 UInt8

julia> f(Union{UInt8,UInt16}[UInt16(5)])
ERROR: TypeError: in typeassert, expected UInt8, got a value of type UInt16
Stacktrace:
 [1] f(v::Vector{Union{UInt16, UInt8}})
   @ Main ./REPL[24]:2

I would rather have the following and catch it from return type inference.

julia> f(v::AbstractVector) = begin
           v[1]
       end
f (generic function with 1 method)

julia> Base.return_types(f, (Vector{Union{UInt8,UInt16}},))
1-element Vector{Any}:
 Union{UInt16, UInt8}
2 Likes

Yes, that’s correct. Also, if the return type were an assertion, the bug would not have occurred. That’s the case I’m making here: the semantics chosen for return type declarations favor buggy code. Or just not using the feature at all, in which case, why have it?

If a return value is type asserted to be of a primitive type, only two things are possible: either it consistently is, or I’ll immediately see an error. If it’s the former, the check is elided. If it’s the latter, I fix the error, and now, the check is elided.

This is another reason to prefer the semantics of assertion for annotation of function return values.

It’s only a runtime feature because it converts! If it was an assertion, it would be a way to make sure that type-stable functions stay that way.

This isn’t about whether or not Julia is, or should be, a dynamic language. It’s about one facet of how the gradual type system works.

I have some tests in the suite which use reflection to check that properties hold for all subtypes of an abstract type. If there were some tool which offered a much more ergonomic way of doing the sort of thing you indicated, I would no doubt use that tool as well.
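For what it’s worth, Test.@inferred already covers part of this at test time: it throws unless the inferred return type matches the type of the actual result (a small sketch):

```julia
using Test

stable(x::Int) = x + 1          # inference gives a concrete Int
@inferred stable(1)             # passes, and returns the result

unstable(x) = x > 0 ? 1 : 1.0   # inferred as Union{Int64, Float64}
# @inferred unstable(1)         # would throw, flagging the instability
```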

But let’s look at your code:

I say, for a primitive type, and given an assert semantics for return types, it would be valid for the compiler to use this check, and throw an error even if the function isn’t called, at load time. Do you disagree? I could be missing something here.

1 Like

Yes, I disagree.

You are confusing static analysis and compilation.

This is related to Julia’s form of dynamism. The just-in-time compilation from source to native code is intended to be as fast as possible. It is meant to generate native code, not to analyze your code.

We do not have a step in the execution model where we perform static analysis before compilation. If you want static analysis, it’s up to you to figure out when to do it and then to actually perform it. We have some tooling that helps with this, such as JET.jl.

Contrast this with Java, which does have a static analysis phase on conversion of source code to byte code. Like Julia, Java does fewer checks between byte code and native code compilation.

As I pointed out above, the flaw with this is the assumption that the prior code is type stable to begin with. If it is type unstable, then type assertions may help hide that instability.

Type assertions do not keep type stable code stable. They just make type unstable code appear more stable than it is, which can actually be useful. Subsequent code can assume stability.

If you want to keep type stable code stable, then you need to analyze for type stability. I think there are ways to do this with annotations that may be valid Julia, but could be interpreted by a static analyzer in a way that is not in the current Julia compilation model. Such a statically analyzed superset of Julia has been too readily dismissed in past discussions, but it is still on my mind.

For example we could use empty or identity macros as annotations. This is perfectly valid in normal Julia 1.0

julia> macro returns(ex) end
@returns (macro with 1 method)

julia> function f(x) @returns(Int64)
           x
       end
f (generic function with 1 method)

julia> macro noconvert() end
@noconvert (macro with 1 method)

julia> function f(x)::Int64 @noconvert()
           x
       end
f (generic function with 1 method)

A StaticJulia compiler could parse these annotations as assertions and ensure the compiled code is type stable and is not doing conversion.

julia> e = :(function f(x)::Int64 @noconvert()
           x
       end)
:(function f(x)::Int64
      #= REPL[22]:1 =#
      #= REPL[22]:1 =#
      #= REPL[22]:1 =# @noconvert
      #= REPL[22]:2 =#
      x
  end)

# Pseudocode steps:
# 1. Detect presence of @noconvert in function declaration
# 2. Get / lowered / typed / or compiled code
# 3. Confirm code is type stable and does not convert

This is perhaps similar to how mypy works for Python. It differs from the TypeScript approach because StaticJulia is just valid Julia.

2 Likes

Requiring function calls to see the error isn’t very immediate, and less so if the return expression cannot be inferred to exclude UInt16, in which case you still rely on running into bad inputs, just as the converting method did. Abstract type annotations don’t make for methods that can be statically analyzed from the definition alone; mkitti’s example of a test for the inferred return type of a given call signature is the bare minimum, and it could potentially be made more convenient than writing a separate test. I also wouldn’t consider a method that errors on bad inputs, or on any inputs, to be bug-free, because that still needs to be fixed. And you can’t rely on type analysis to catch all bugs either: fixing the right integer types but computing excessive values just means silent overflows.

1 Like