Function Parameter Speculation #17168

Think I’m caught up on this point as it’s quite different from the type systems I’ve encountered. Say we have an anonymous function, (x::Int, y::Int) -> 1 + x + y. Surely the type of this is Tuple{Int, Int}-->Int? Then, the method f(x::Int, y::Int) = 1 + x + y would presumably have the same type?

I understand that multiple dispatch associates methods with a function object, but including the function object in the method type would seem to create problems for higher order functions that specify a type like (x::Int, y::Int)-->Int as logically the function object is irrelevant here.

The anonymous function still has an identity, it’s just harder to write since it wasn’t given a name in the source code. So we have to make up some sort of name expression:

{ "typeof((x::Int, y::Int) -> 1 + x + y)", "Int", "Int" } --> "Int"

would seem to create problems for higher order functions

Yep, that’s why it’s important that they be include in the model of system, as otherwise this isn’t actually a discussion about Julia’s type, function, and dispatch systems. :slight_smile:

I suppose one could just write { "Any", "Int", "Int" } --> "Int", which would make the object-oriented programmers happy at the expense of some ugliness for functional programmers. But that’s just an aesthetic concern (assuming that it’s largely the FP crowd using HOF, and that they always would have Any as first in the tuple), so could use syntactic sugar and render it a moot point.

For clarity, my_noise does run fine in julia right now if I hadn’t accidentally slipped in haskell function syntax: add_noise(y) = map(liftE(x -> x + noise()), y)

If I understand what you’re saying, I think this can be solved by running the type rules “backwards” in the spirit of Hindley-Milner. This may sound like a contradiction as Julia is a dynamic language, and indeed it would be if done globally. But I conceive it is necessary if methods are to be first-class values in Julia.

So I wonder if it would be possible for Julia to do method tagging / “local type inference” along the lines of Rust rather than Haskell, where adding a tag to add_noise would run method dispatch in reverse for the purpose of compiling & type checking the method. This would not change type inference anywhere else. And certainly, there would be cases where inference could not solve ambiguous cases, which could be addressed by adding type tags.

If you write “Any” as the first element, that expression now seems to mean “all functions of two Int arguments return an Int”?

This post is a little long so I’ve split it into three parts. I’m not too familiar with any of this, so please correct me if something is wrong.

Method Types are Possible

With regard to the current conversation:
Can you write the type of a method independently of the function it’s a method of? I think the answer is no. A method takes its function identity as its first parameter so it must be in the type.

But that’s not necessarily a problem. These things are supposed to be covariant on the return side and contravariant on the argument side. That is, (A --> B) <: (C --> D) iff A >: C and B <: D. Note that this is also the convention used in functional languages, including Scala and Haskell.

That means that the only sensible interpretation of method::({X} --> T) is:

  • A guarantee that the method applies to signature {X}; that is, if x isa X, then x() should be able to dispatch to this method (I personally think this is a little silly, but see below.)
  • A guarantee that the return value of this method (if it returns at all) is an T

The first guarantee is a little vague, but from the perspective of the method itself, it’s just a subtyping check to see if the argument type is matched.

That means that as @jameson noted, {Any, Int, Int} --> Int is wrong for a method of an anonymous function (because this would imply the method were dispatchable for, e.g., "x"(1, 2), which is false). But this is not really a problem, because {Union{}, Int, Int} --> Int describes these methods fine. The only trouble theoretically is that there are no objects of type Tuple{Union{}, Int, Int}, so we’re relying on the fact that this is not considered just Union{}, but is instead some hypothetical type. It means our definition above is not strictly correct, and we have to add a special case for tuples with Union{} as a parameter. But overall there is no problem here.


Function Types are Technically Infeasible

A more significant problem is function types. These are inherently a lot harder than method types, and just taking the Union of method types does not really work. (It’s not at all what you want; a Union means this object is of one of these types, and it does not make sense for a typeof(f) to be a Union of multiple types.) If

f(x::Int) = 0
f(x::String) = 1

then we note that the function f, in general, satisfies that it can dispatch to f(::Int), and the return type is Int, so we might want f::({typeof(f), Int} --> Int). Furthermore it can dispatch to f(::String), and the return type is String, so we might want f::({typeof(f), String} --> String). This means that the subtype relationships we want to have are:

  • typeof(f) <: ({typeof(f), Int} --> Int)
  • typeof(f) <: ({typeof(f), String} --> String)

in other words, we want

  • typeof(f) <: typeintersect({typeof(f), Int} --> Int, typeof(f), String} --> String)

and furthermore we want typeintersect to return a type that actually is contained by both of the supertypes. As far as I can tell, this is already infeasible. This would require an Intersect{F, G} alongside the currently existing Union{F, G}. The type lattice is not really built to support Intersect{F, G} as a type. This would also make a lot of problems like subtyping very hard or possibly undecidable.

Another possible approach is to use {F, G} (i.e. Tuple{F, G}) to simulate intersection. This also has a lot of problems, as you might imagine, and is not very helpful for the programmer.


Function Arrow Types are Unnecessary

Overall I’m very convinced that true arrow types are not possible, because the behavior of functions is too complex to describe in a type system. In practice, arrow types are useful for two reasons:

  • to allow compiler/user to check the correctness of an argument, or to dispatch on arguments
  • for performance reasons: so that a function pointer can be stored in a type stable way

For the second point we don’t need true arrow types at all. It’s totally fine to have a “arrow type” that is a callable type with a type assertion which wraps around a function pointer and ccalls it; I believe that @yuyichao has worked on this.

For the first point I’m unconvinced of the usefulness of arrow types. I am pretty sure that dispatching on arrow types will never work well, from both a technical standpoint and a user usability standpoint; for the behavior of dispatch to depend on implementation details of inference seems crazy to me. Theoretically, this could be used to provide optimizations when inference can figure something out, but Core.Inference.return_type can already be used to do that in cases where it matters.

Furthermore, I’m not really sure what the point of arrow types is in a language where any function call will compile, and there is no real difference (from the programmer’s perspective) between a MethodError and an ArgumentError. In languages with arrow types, MethodErrors are found at compile time, whereas ArgumentErrors are found at runtime, so it is meaningful to use types to describe that certain things will pass compile.

Function types in Scala and Haskell guarantee that if f::({typeof(f), Int} --> Int), then f(1)::Int will not fail compilation. (There is no guarantee that f(1)::Int will not have, for example, a runtime error.) If we use the same definition in Julia, then every object has this type: there is no object for which f(1)::Int will fail compilation. So if we go by the same standard, then {typeof(f), Int} --> Int is just Any.

The argument is often made that MethodErrors are different than ArgumentErrors, even though they’re both runtime, because the former indicates a likely mistake by the programmer, whereas the latter may be more likely due to exceptional input as opposed to programmer error. But this is really a pointless distinction; if I define f(::Int) = error("don't use this on Ints"), why should all of a sudden f become isa {typeof(f), Int} --> Int? To me this does not make much sense. Sure, f accepts an Int argument and always returns an Int if it returns at all… but this is really the same as f not being defined to accept an Int argument at all.

The issue, really, is that the idea of a function “accepting” a certain signature is not well defined. Should f be considered to “accept” signature S if there’s no method error? What if there is sometimes ambiguities in certain exotic cases?

Overall I’m of the opinion that arrow types are an incomplete way of detecting whether the programmer is likely to have made an error. There is no replacement for good unit tests in catching typos. A linter can help too for missing coverage. But I don’t see type checking functions as being sensible here.

2 Likes

typeintersect to return a type that actually contains both of the subtypes

This is actually not too hard (the unique answer is Bottom / Union{}), but that’s also not very helpful. In Hindley-Milner, I think that is the desired answer (this would then terminate inference with a failure). But the property you’re describing sounds more like Union{F, G}?

But I don’t see type checking functions as being sensible here

I think you’re getting into religious territory there :slight_smile: . But I agree, and think you illustrated the problem well.

But overall there is no problem here.

I think your initial intuition about there being a practical problem here was right. As you noted, the tuple type there is uninhabited. So it seems to give the correct but fairly useless statement “the type of a method that will never be accessed/called may be assumed to be Int”. So it doesn’t have a theoretic problem.

1 Like

Sorry, it’s a typo. What’s needed is for typeof(f) <: typeintersect(A, B) <: A and typeof(f) <: typeintersect(A, B) <: B, which is harder than just returning Bottom :wink:, and probably very difficult in general.

I understand that Function cannot be a “first-class function” in the traditional use of the phrase, but hope that methods can be.

If true, this suggests that methods are not true values either, as f = x::Int -> 2*x and g = x::Int -> 2*x have different types, while a = f(2) and b = g(2) have the same type. Among other things, this would mean that all higher order functions e.g. function composition would be an expression rather than a value, hurting performance and safety :frowning:

Edit: so what I’m pushing for is for methods to have type outside of their function, or at least have some construct in the language that is–perhaps lambdas are a good candidate as multiple dispatch doesn’t make sense there anyway?

Maybe I’m misunderstanding your message—please correct me if so.

But higher order functions like function composition are values, as are all functions. The results of function composition are also values. In fact, the type of closures is well-defined and quite sensible. You can check, for example, that sin ∘ cos === sin ∘ cos. These are the same objects and have the same type.

You’re right that f = x::Int -> 2x and g = x::Int -> 2x have different types, but that’s inevitable, because they have different signatures. But that is just an implementation detail of ->. It’s possible that in a future version they will have the same type.

If you must have x -> 2x and x -> 3x have the same type, an easy implementation is just

multby(n) = x -> n*x

and then multby(2) and multby(3) will have the same type. This kind of optimization will probably not be automatically done in the future because it will require more data to be passed whenever such a function is called, and it prevents some of the optimizations possible by LLVM.

They may be values in the trivial sense, but I don’t believe we can currently write a law of composition on functions in Julia. Most other values in Julia are a Group if not stronger.

Formally, a law of composition on a set S is a function of two variables, S x S -> S.

Let a and b be arbitrary Int64. Then a + b is Int64. e.g. a=1; b=2; c = a + b; typeof(a)==typeof(b); typeof(a)==typeof(c). + is a law of composition on Int64.

Since typeof(sin)==typeof(cos) is False, we cannot satisfy the law of composition with . Even if we use sin as both arguments, we see that typeof(sin)==typeof(sin ∘ sin) is False.

As it currently stands, all other (“first-class”) values are eagerly evaluated and allow for a law of composition, while higher order functions are lazily evaluated and cannot be a law of composition.

That would be a huge step in the right direction!

Not all sets are types. The fact that sin and sin ∘ sin have different types is important for optimization: because they have different signatures, a higher order function taking sin as an argument should generate more optimized and different code than taking sin ∘ sin as an argument. There is really nothing stopping you from defining your own group S with any collection of functions including the identity, and their compositions. This group will not have a type in the type system, but the goal of the type system is not to express all possible groups.

There is no lazy evaluation involved with higher order functions. In fact, there is no difference between a higher order function and a first-order function, and there is no difference between these and arbitrary values — in Julia, both functions and plain values are first-class. Can you explain what you mean here?

Note that is a law of composition on Function; it is always the case that (f::Function ∘ g::Function)::Function. The only difference between Function and Int64 is that Function is abstract.

It sounds like what you’re looking for is to be able to do composition, and be able to interchange results of such composition, in a performant and type stable way. This is a difficult problem in general, but a technique called “embedding” applies here. Note that I will ignore performance for simplicity; work done by Yichao Yu and other members of the core dev team is in place to make this kind of thing more performant.

The idea is to embed a particular space of functions into a single type. This enables the property that all objects inside this embedding have the same concrete type, which seems to be what you’re looking for.

Following is an example.

import Base: convert, show, ∘
struct Hom{D, R} <: Function
    f::Any  # performance is ignored here; this can be improved
end

show(io::IO, φ::Hom) = print(io, φ.f, " as ", typeof(φ))
show(io::IO, ::MIME"text/plain", φ::Hom) = print(io, φ.f, " as ", typeof(φ))
((φ::Hom{D, R})(x::D)::R) where {D, R} = φ.f(x)
convert(::Type{T}, f) where {T <: Hom} = T(f)
(φ::Hom{D, R} ∘ σ::Hom{R, W}) where {D, R, W} = Hom{D, W}(φ.f ∘ σ.f)
const sine = Hom{Number, Number}(sin)
const cosine = Hom{Number, Number}(cosine)

Then you can indeed have (sine ∘ cosine)::Hom{Number, Number}. We’ve embedded a variety of other types into the set Hom{D, R} that we’ve exhibited as a type.

2 Likes

If 1+"hi" is a type error, so is sin ∘ string

Yes, exactly!

Indeed, that is precisely what I’m looking for–function composition on concrete types.

This is very exciting! Would love to see methods automatically join the relevant Hom sets, as this would allow for strong typing of, say, the lt parameter in sort.

No, this doesn’t follow. sin ∘ string is a perfectly valid function: it applies string, then sin. It happens to be the case that the function will throw an error when called on any argument, but that’s allowed in Julia. The value "x" also throws an error when called on any argument: ("x")(...) is always a method error. That doesn’t mean it should be impossible to produce the value "x".

Another example of how functions are not first-class values: all composition is weakly typed. I’m sure this is for performance and simplicity reasons that are well-calibrated. But if we can choose, sure would be nice for function composition to be strongly typed, just like integer composition.

I’m probably too much on the FP side to understand why we would want to permit (“x”)(…).

There is no weak typing here. sin ∘ string means a function such that for all xs..., (sin ∘ string)(xs...) is the same as sin(string(xs...). That’s well-defined and meaningful.

We don’t want to permit ("x")(...). That throws an error. Similarly, we also don’t permit (sin ∘ string)(1), which also throws an error. But we permit "x", because there is no reason not to, and it might be useful for other purposes besides being called. Similarly, we permit sin ∘ string, because there is no reason not to, and it might be useful for other purposes besides being called.

function ∘{A,B,C}(f::Tuple{typeof(f),B} --> C,
                  g::Tuple{typeof(g),A} --> B)
             --> ( A --> C)
    x->f(g(x))
end

Sorry should have clarified what I mean by compose.

Yes, this is understood. Indeed, aside from the types you want not actually existing, the current definition of satisfies this. sin ∘ string, and any function for that matter, can be considered to be composing Union{} --> Union{} with Union{} --> Union{}. So I don’t understand why it should be an error in any event.

Julia only gives “type errors” when you evaluate the code that exhibits the problem. The reason 1 + "hi" throws immediately is because when you evaluate that, it’s already a problem. When you evaluate sin ∘ string there’s no problem – the problem only occurs when you actually try to call that composite function on something. And the way Julia works, maybe it doesn’t – this is a perfectly valid if questionably useful program:

julia> ss = sin ∘ string
(::#55) (generic function with 1 method)

julia> Base.string(n::Int) = n

julia> ss(123)
-0.45990349068959124

You can’t generally know if a function composition is “well typed” when it occurs. You can only know that when it runs. We could potentially have a subset of behaviors which, if you follow them, allow compile-time checking that a problem doesn’t have any method errors, but that’s a project for some time in the future (post-1.0).

4 Likes

Glad to hear it’s on the radar for post-1.0. A natural candidate for such a subset seems to be ->. Would be great to see this as a value as fleshed out in this thread. I think this is what @fengyang.wang and @jameson referred to as an “arrow type” (as opposed to the Hughes arrow).