Enforcing function signatures by both argument & return types

One reason is that functions can have many methods, all with different type signatures. So it would be difficult to express all these signatures in a single simple type.

The real blocker though is that functions are mutable: they can have methods added and removed. So the type (which is an immutable property of a value) cannot depend on the method signatures (which are mutable).
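A minimal sketch of that point, using a made-up function `f` that is not from the thread: the type of a function stays the same while its method table changes, so the type alone cannot say which signatures are available.

```julia
# `f` is a hypothetical example function.
f(x::Int) = x + 1

type_before = typeof(f)
had_string_method = hasmethod(f, Tuple{String})   # no String method yet

# Adding a method mutates the function's method table...
f(x::String) = x * "!"

# ...but the type of `f` is still the same singleton type.
type_after = typeof(f)
has_string_method = hasmethod(f, Tuple{String})

println(type_before === type_after)
println((had_string_method, has_string_method))
println(length(methods(f)))
```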


However, the use case here is expressing one signature: I would like my function to accept any function that has a method which looks like function(Int, Int)::Int. Why not associate types with methods? If you pass a function that has a method requested by the higher-order function, the type check succeeds.

A simple (and potentially dumb?) solution could be to equate function types and method types, as I showed above. When you’re calling a function, you’re really calling one of its methods, right? So when you’re passing a function as an argument, you’re essentially passing a bunch of methods that the callee can choose from. So have the compiler (or possibly the runtime) intervene when the higher-order function is called and check whether all functions that are passed as arguments have methods that satisfy the criteria requested by the higher-order function. (BTW, this looks similar to what’s done while searching for the correct method anyway)


For example:

  1. The higher-order function is:
    my_map(fn::function(T)::U, data::AbstractVector{T}) where {T, U}
    
  2. One calls it like this: my_map(some_fn, some_data)
  3. The runtime searches through methods(some_fn) attempting to find a method that looks like function(T)::U where {T, U}
    • If it doesn’t find any, that’s an error which happens before my_map is called
    • If it does find such a method, it puts this method as the argument, so the call essentially becomes: my_map(find_appropriate_method(my_map.first_parameter, some_fn), some_data)

That should work, if I correctly understand methods as something similar to function overloads (as in, methods are the “actual functions” that are being called).
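The steps above can be approximated in today's Julia with a manual check at the top of the higher-order function. This is only a sketch: `has_signature` is a hypothetical helper, the return type is passed as an extra argument `U` because the proposed `function(T)::U` syntax does not exist, and the return-type check relies on `Base.return_types`, which reports what inference can prove rather than a declared contract.

```julia
# Hypothetical helper (not part of Julia): does `fn` have a method for
# `argtypes` whose inferred return type is a subtype of `ret`?
function has_signature(fn, argtypes::Type{<:Tuple}, ret::Type)
    hasmethod(fn, argtypes) || return false
    inferred = Base.return_types(fn, argtypes)
    return !isempty(inferred) && all(R -> R <: ret, inferred)
end

# The check runs before any element of `data` is touched.
function my_map(fn, data::AbstractVector{T}, ::Type{U}) where {T, U}
    has_signature(fn, Tuple{T}, U) ||
        throw(ArgumentError("expected a method accepting $(T) and returning $(U)"))
    return U[fn(x) for x in data]
end

square(x::Int) = x * x

println(my_map(square, [1, 2, 3], Int))
```

Calling `my_map(square, [1.0, 2.0], Int)` fails in the check, before the loop starts, because `square` has no method for `Float64`.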


Even in your my_map example, there are things that aren’t really obvious how they should work.
my_map(fn::function(T)::U, data::AbstractVector{T}) where {T, U}
For example, what if data is a Vector{Union{Int, Nothing}} and fn doesn’t have a method fn(::Union{Int, Nothing})? One may suggest a kind of “union splitting” to require that methods fn(::Int) and fn(::Nothing) both exist… But maybe data only contains Int values, and fn with only fn(::Int) will work fine, so there is no need for a Nothing method.
Or, further, data is a Vector{Any} with elements of various types. Seems like there is no way to statically match your signature then.

Again the simplest solution would be to require T to be exactly the same type everywhere.

If data is Vector{Union{Int, Nothing}}, then T = Union{Int, Nothing}, so the function must be fn(::Union{Int, Nothing}). If there’s no such function - that’s an error. So, fn(::Int) accepts Union{Int, Nothing} will evaluate to false, just like AbstractVector{Int} <: AbstractVector{Union{Int, Float64}} is false, even though Int <: Union{Int, Float64} is true.

Similarly, Vector{Int} <: Vector{Any} is false, so, since functions and arrays are basically the same (you index arrays to get a value out and similarly call functions to get a return value), they could behave similarly in terms of types. Thus, fn(::Int) accepts Any is false (clearly, this function only accepts integers, not anything), and only fn(::Any) accepts Any should be true.
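For what it's worth, the existing `hasmethod` already behaves roughly like the proposed "accepts" check: it asks for a method whose signature covers the whole argument type, not just part of it. A small sketch, where `only_int` and `anything` are made-up example functions:

```julia
# Hypothetical example functions.
only_int(x::Int) = x
anything(x) = x

# Parametric types are invariant, as described above.
println(Int <: Union{Int, Float64})
println(Vector{Int} <: Vector{Any})
println(AbstractVector{Int} <: AbstractVector{Union{Int, Float64}})

# `hasmethod` needs one method that covers the entire argument type.
covers_int   = hasmethod(only_int, Tuple{Int})
covers_union = hasmethod(only_int, Tuple{Union{Int, Nothing}})
covers_any   = hasmethod(only_int, Tuple{Any})
any_covers   = hasmethod(anything, Tuple{Any})
println((covers_int, covers_union, covers_any, any_covers))
```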

I’m no expert in type theory, though.

That would allow one to throw an error earlier in the code (as do type annotations). Do you see any other use of that?

The disadvantage of such annotations is that they restrict the code to things that are useful even at the development stage for debugging, for example units.

Yes, that’s one of the main ideas here: throw an error as early as possible. Imagine calling some function, compiling lots of intermediate code without errors, then running a bunch of that intermediate code (which could take quite some time!), only to error out on, say, the last line of that function, because the function you passed as an argument doesn’t have the necessary method. So, the function you called errored out and lost all its computations - because it’s impossible to spell out the type of a function in Julia.

Another use case is to make the code easier to read because map(f::function(::Int), data::AbstractVector{Int}) immediately tells me what kind of function this function accepts. I think the use cases are the same as with any other types: you see what types a function accepts - and you immediately have an idea of how it can be used. Or maybe, how it can’t be used.

Personally, I like knowing what kind of arguments are accepted by functions. As soon as I see a function as an argument, I immediately lose track of what’s going on because I can’t tell what kind of function it expects.

Not sure whether I follow: limiting code to things that are useful seems great to me?


Don’t get me wrong, the possibility of better error messages, compile-time errors, and (perhaps - because that can be solved with good documentation) more readable code is the advantage of static typing.

But in Julia that is not as natural, and not doing it has its own benefits, even for code quality. You can use all the Julia functions because most of them are type-generic. In terms of debugging, for example, I have a package in which I was interested in 3D particles, with coordinates represented by floats. Not even 32-bit floats interested me. I was very used to typing everything (from Fortran). But then I found out that relaxing that allowed my code to run with 2D particles, coordinates with units, automatic differentiation, etc. Each of these things had a tremendous impact on what I can do to debug my code, which was for me much better than simply knowing that a variable got into the right place with the correct type. That is a much simpler error than the things I can inspect by visualizing 2D representations of the system, computing automatic derivatives, propagating units, etc.
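To illustrate the trade-off with a toy example (the names `sqdist` and `sqdist_strict` are made up here and are not from the package mentioned above): the untyped version works for any dimension and number type, while the fully annotated one only accepts 3D `Float64` points.

```julia
# Generic: any number of coordinates, any number type that supports `-` and `abs2`.
sqdist(a, b) = sum(abs2, a .- b)

# Strict: only 3D points with Float64 coordinates.
sqdist_strict(a::NTuple{3, Float64}, b::NTuple{3, Float64}) = sum(abs2, a .- b)

println(sqdist((1.0, 2.0, 2.0), (0.0, 0.0, 0.0)))   # 3D, Float64
println(sqdist((3, 4), (0, 0)))                     # 2D, Int
println(sqdist((1//2, 1//2), (0//1, 0//1)))         # 2D, Rational
println(sqdist_strict((1.0, 2.0, 2.0), (0.0, 0.0, 0.0)))
```

Calling `sqdist_strict((3.0, 4.0), (0.0, 0.0))` throws a `MethodError`, even though the body would work unchanged for 2D points.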

Concerning this specific situation, if you want to throw an error at the function call, for a function given by the user, you could wrap it in another function and assert the type of the input:

julia> function f(g,x)
           function h(g,x)
               @assert x isa Integer "x must be integer"
               g(x)
           end
           h(g,x)
       end
f (generic function with 1 method)

julia> g(x) = x^2
g (generic function with 1 method)

julia> f(g,1)
1

julia> f(g,1.0)
ERROR: AssertionError: x must be integer
...

Not that this is better in general, but it can be useful in cases where you want to provide a better error message for a user who passes a function to your code.

One possible solution is to combine user-provided type hinting (of signatures) in documentation with the results of various types of flow analysis.

This is in fact precisely how TypeScript is commonly used with vanilla JavaScript files, to great effect.

All of this information is propagated to editor services, code completion, etc., so it tends to allow for a better developer experience and more determinism at runtime.

The other advantage of such a gradual typing system is that the core language doesn’t necessarily have to change, so it leaves room for experimentation. Also, because these systems are run incrementally, they can perform more sophisticated, compute-intensive analysis, since it’s always being done lazily in the background.

Now, in terms of the “flow analysis” part of this equation, this is exactly what JET does:
https://aviatesk.github.io/JET.jl/stable/jetanalysis/
https://aviatesk.github.io/JET.jl/stable/internals/

As far as I know, there has been a lot of work in 1.7 and 1.8 to make the existing inference code in the compiler available to external tools like JET.

So the tooling provided by things like JET should help enable the development of this type of ecosystem. julia-vscode just got support for displaying results from static analyzers like JET, but I’m going to guess a lot could be done for an integrated solution that was aware of user-provided type hints, external type declarations à la TypeScript, etc.


Sure - one could write a wrapper function and check the types manually. But here you’re doing the compiler’s job. Also, you checked that x is an Integer, but didn’t check whether g indeed accepts one argument. You call it like g(x), but what if g doesn’t accept any arguments? How does one check the function’s number of arguments manually anyway? I have no idea - there’s probably a hack to do this as well, but that’s the compiler’s job, in my opinion.
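On the question of checking manually whether g can be called with one argument: Base provides `applicable(g, args...)`, which reports whether some method of `g` matches the given argument values. A sketch of the earlier wrapper extended with that check follows; `checked_call` and `no_args` are made-up names, and this is still a runtime check rather than anything done by the compiler.

```julia
# Hypothetical variant of the wrapper `f(g, x)` shown earlier in the thread.
function checked_call(g, x)
    x isa Integer || throw(ArgumentError("x must be an integer"))
    # `applicable` is true only if `g` has a method matching (x,).
    applicable(g, x) ||
        throw(ArgumentError("g has no method for a single $(typeof(x)) argument"))
    return g(x)
end

g(x) = x^2
no_args() = 42

println(checked_call(g, 3))
println(applicable(no_args, 3))   # no_args takes zero arguments
```

With this in place, `checked_call(no_args, 3)` fails with the custom `ArgumentError` before `no_args` is ever called.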

As for better error messages - that’s what I’m doing currently. But frankly, that’s the compiler’s job too. I’d rather spell out the types and let the compiler check whether all calls are valid.

Actually yes, it would be fun to have typing experiments in the comments. Try out new features in the comments first, and then introduce them in the main language if they work well.

JET looks very promising indeed. TBH, I’d like to have a statically typed mode in Julia: run your code like julia --statically-typed my_code.jl - and that makes Julia type-check it.

I looked up this thread as I was curious about this as well. I think I have my own solution to this problem that I based off of C++'s typedef (can be found here: https://github.com/HyperSphereStudio/Machine_Learning/blob/main/Utilities/Fun.jl).

It's buried away in my deep learning library, but if anyone wants I can put it in its own library.

Usage:

     
# Create the wrapper you want
@Fun(MyFuncName{T}, return_arg::T, arg1::Int, arg2::Int)

# Void version and no var names
@VFun(MyFuncName2{T}, Int, Int)

# The function can be called like a struct now
function test(func::MyFuncName{Int})
    return func(1, 2)
end

# Wrap the anonymous function
test(MyFuncName{Int}((x1, x2) -> x1 * x2))

# Have it guess the types (careful with this though, better to manually type)
test(MyFuncName((x1, x2) -> Int(x1 * x2)))

# string(MyFuncName{Int}) will return ".MyFuncName{Int}(arg1::Int, arg2::int):(return_arg::Int)"
# Docs will also display this as well

I use it all over my library because things accept functions all over the place. Without it, it is super confusing which function accepts what and returns what.
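For readers who want the general idea without the macros, a typed callable wrapper can be sketched in plain Julia. This is not how the library above is implemented; `TypedFun` and `IntBinOp` are hypothetical names, and the argument and return types are checked at call time.

```julia
# Hypothetical wrapper: R is the return type, A the tuple of argument types.
struct TypedFun{R, A<:Tuple, F}
    f::F
end

# Lets callers write TypedFun{R, A}(f) without spelling out the type of `f`.
TypedFun{R, A}(f::F) where {R, A<:Tuple, F} = TypedFun{R, A, F}(f)

# Calling the wrapper checks the arguments and converts the result to R.
function (tf::TypedFun{R, A})(args...) where {R, A}
    args isa A ||
        throw(ArgumentError("expected arguments of type $(A), got $(typeof(args))"))
    return convert(R, tf.f(args...))::R
end

# A named "function type", similar in spirit to a C++ typedef.
const IntBinOp = TypedFun{Int, Tuple{Int, Int}}

test(func::IntBinOp) = func(1, 2)

println(test(IntBinOp((a, b) -> a * b)))
```

Passing a bare anonymous function to `test` raises a `MethodError`, so the accepted signature is visible in the method definition itself.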

EDIT: New FunWrap Library can be found here: https://github.com/HyperSphereStudio/FunWrap.jl
(If you have any ideas to make it better, feel free to open a pull request and I'll review it)
