Suggestion: introduce something like a @inferable or @typestable macro to the language

Hi,
I’m quite new to Julia, and so far I can say that I really like the language. In particular, the combination of generic programming with multiple dispatch is a great idea that I absolutely love. :yum:

That said, I find it a pity that Julia still occupies a niche and has not yet managed to evolve into one of the mainstream languages. Referring to Jeremy Howard’s JuliaCon talk (https://youtu.be/s6pjxCuNGjc), I share his opinion that one of the main drawbacks of the language is the difficulty of creating static executables that are small in size and comparable to C or Fortran binaries. I understand that Julia’s dynamic type system requires shipping the compiler along with the actual binary (to allow JIT compilation), which is, in my opinion, the main difference from a statically typed language such as C or Fortran. (Sure, you also need to ship some extra code for heap memory management, but I guess this could probably be optimized down to a few KB of native code; please correct me here if I’m wrong.)

However, in a lot of cases I think JIT is not necessary for the final executable, as the ‘main’ function often feeds stable types into the code, which can perhaps be propagated throughout the whole program. Obviously, to make this work, every subfunction called with stable types (and fixed parameters) must guarantee that the types of the returned objects are inferable from those input types and, thus, stable and known at compile time. (Maybe one could allow a few variants of return types, e.g. opened up through branches in the code, but this is already something advanced and may be ignored in a first iteration.) I think @code_warntype already manages to check inferability quite well, and hence, in my view, the next logical step would be to add something like an @inferable macro for functions which checks that all types can be inferred at compile time (in all possible cases). If not, the code should throw an error and refuse to compile. As an alternative, or as a complement, a @typestable macro could check on a case-by-case basis (for actually given, fixed input types) whether the return types are inferable (or type-stable).
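
For illustration, here is roughly the kind of problem I mean; unstable is a made-up example, and @code_warntype (from InteractiveUtils) already highlights its issue today:

using InteractiveUtils  # provides @code_warntype

# The return type depends on a runtime value: Int64 or Float64.
unstable(x) = x > 0 ? 1 : 1.0

# Prints the lowered code with the non-concrete return type
# Union{Float64, Int64} highlighted:
@code_warntype unstable(1)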

Having such macros would, in my view, have two advantages:

  1. Declaring julia_main as @inferable/@typestable would allow the generated ‘app’ to be compiled completely ahead of time, avoiding the need to include compiler code and thus making Julia executables comparable to C/Fortran binaries. Theoretically, PackageCompiler just needs to check whether the macro is present and could then optimize the code generation accordingly. Maybe one could even add a static linking option in the future to include only the code that is actually needed (all external Julia libraries should theoretically also be available as static libraries?).
  2. Inferable typing should also avoid a lot of performance issues where ‘real’ dynamic typing (e.g. random numbers generating random parameters for random types) is not wanted in the first place and was just introduced erroneously.

I don’t know how much of Julia’s standard library, and also of the external libraries, is already @inferable by nature. But I assume a lot of it, since ‘real’ dynamic typing (as in the example above) does not follow the spirit of multiple dispatch in my view (it’s probably called multiple-dispatch and not infinitely-dispatch for a reason :sweat_smile:).

I’m surely no expert on the language, and I also don’t know whether somebody else has already proposed something similar (at least I did not find anything comparable in a quick search). There may be other ways to implement this (maybe through keywords or similar), or there may be something that renders my suggestion impossible to implement or useless for some other reason. But if not, I’m glad if I could inspire some future improvements. I would gladly throw Python and Matlab overboard and teach my students Julia from the first hour, but that is only possible if the language becomes more broadly accepted - and providing people with highly competitive (compared to C/Fortran) one-click solutions is probably one way to make this happen. :wink:

2 Likes

Look into packages like JET.jl and StaticCompiler.jl. You might also be interested in Test.@inferred from the standard library, if you haven’t seen it already: Unit Testing · The Julia Language
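
A minimal sketch of what it does, with two toy functions (stable and unstable are made-up names):

using Test

stable(x) = 2x
unstable(x) = x > 0 ? 1 : 1.0

@inferred stable(1)    # passes: actual and inferred return type agree (Int64)
@inferred unstable(1)  # throws: actual Int64 vs. inferred Union{Float64, Int64}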

1 Like

As implied in my first post, the awesome thing about Julia is that this discussion doesn’t even belong in the “Internals & Design” section of the forum - user packages are enough to implement such features. This is one of the most amazing things that separates Julia from other languages, as far as I can see.

But note that efforts like these depend on the Julia implementation being good at inferring types, which is an ongoing effort that won’t ever stop, I guess.

2 Likes

Thanks for the suggestion. I actually didn’t know about the @inferred macro. However, it seems that this macro isn’t doing what I have in mind: @inferred actually runs the code with a concrete argument to see what comes out and compares that with the compiler’s analysis. What I suggest is that, with an @inferable macro, the compiler tells me whether it is able to generate multiple-dispatch code for the function for any concrete input.

Note that posting this in the design section was well intended, since I believe that such a macro could have fundamental implications for how the code is treated by Julia (or the compiler) itself. Maybe it is even better to see it as a keyword rather than a macro (like ‘const’, just for functions), having in the end something like:

inferable function foo(bar)
...
end

N.B. even StaticCompiler.jl (or any other code in this world) cannot do static compilation if one has a function that uses ‘real’ dynamic typing (whose output does not have a tightly limited number of possible types and, hence, would not get the attribute inferable - which is the same as saying that the function cannot be dispatched on concrete types).
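
As a minimal sketch of such ‘real’ dynamic typing (dynamic is a made-up name; Val is from Base):

# The return type Val{n} depends on the *value* of n, not on its type:
dynamic(n) = Val(n)

typeof(dynamic(1))  # Val{1}
typeof(dynamic(2))  # Val{2} - same input type Int, unboundedly many output types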

1 Like

I don’t think the “for any concrete input” part of this makes sense. As far as I can see, that’s just impossible, because “for any concrete input” literally includes all possible types that may ever be written. The Julia compiler isn’t some oracle, magician or seer.

Perhaps I don’t understand what you want.

2 Likes

I think it’s not too complicated: if all subfunctions are ‘inferable’ by definition, then the actual function should also be inferable, unless one of the dangerous things happens (in the actual function itself), such as using runtime values to define type parameters (which, e.g., should be easy for the compiler to check)…

Take a look at:

So you cannot possibly compile for all the types that can be passed to your functions, even if they are inferable. To compile for a specific set of concrete types you can use StaticCompiler.jl. Yes, the tooling for what Jeremy Howard mentions is not there yet, but this is not the reason why.

This talk by Tim Holy covers the current state of some of those matters.

4 Likes

I think there is a fundamental misunderstanding here: this is all under the presumption that all input types are concrete (e.g. at the top level, such as in julia_main), with the aim of determining whether it is fundamentally possible for the compiler to do ahead-of-time compilation for the given function.

1 Like

This is incorrect. A function call’s return type depends on the types of the function and of all its arguments; the call is dispatched to a method that recursively dispatches its callee methods. One function call could be type-stable while another call with different arguments is unstable, even if both dispatched to the same method. You really can’t say a method or function is inferable at its definition, nor can you test the infinite number of input types and callee methods that will ever exist. You could argue that a method with all its arguments annotated with concrete types is just as specific as a function call, but that’s not how it works: the method doesn’t know an argument is restricted to one type. Besides, most methods should not be so restricted just for precompilation; that’s what function calls are for.
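
A quick sketch of that point (first_elem is a made-up example):

using InteractiveUtils

first_elem(v) = v[1]  # one method, no annotations

@code_warntype first_elem([1, 2, 3])     # Vector{Int64} argument: inferred Int64
@code_warntype first_elem(Any[1, 2, 3])  # Vector{Any} argument: inferred Any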

The best you could do is assert that a function call is inferable in some precompile script and throw an error if it isn’t; we already have Test.@inferred for return types. I don’t know of a stricter version that asserts that all the intermediate parts of the method are also inferable (for your proposed precompilation purpose, we could relax this to inferred Unions). Besides, Julia’s code inspection (including type-inference macros) isn’t entirely perfect, so it’s still important to run and test the program, maybe with a tool that counts runtime dispatches and JIT compilations.

Julia’s upcoming effect analysis actually resembles your idea of proving properties of methods, though its intent is different. There are efforts to make low-overhead executables that may ditch the JIT, and when that becomes a viable tool, I expect some usage of a stricter version of Test.@inferred (if it doesn’t already exist, maybe it can have your proposed macro names) in the precompile script to rule out runtime dispatch before omitting the JIT.

1 Like

A function call’s return type may depend on runtime arguments. However, as already mentioned above, using runtime values to specialize types is surely one case of dynamic typing which prevents inferring the return type. Hence, such a function would never get the ‘inferable’ attribute, and it would logically also prevent ahead-of-time specialization of the code. Consequently, the performance of the generated code is probably also worse than it could be, which brings me back to why I proposed such a keyword in the first place. As I see it (don’t hit me too hard for this statement), producing code where types are not inferable from compile-time knowledge should be considered a bad programming habit (in most ‘standard’ cases), which Julia indeed allows but which should also be strongly discouraged (in those cases).

I admit that checking type inferability on a general basis (for an unlimited set of input types) might be trickier for the compiler than portrayed above (think, e.g., of a loop where the types of function arguments happily alternate between iterations). However, this is also not what is really needed to accomplish what I have in mind. Type inference can also be done (ahead of time) on a per-case basis (for actual input types), as proposed at the beginning (with the ‘typestable’ keyword/macro).

Thus, after one night of reflection I propose the following:

- The user can set the ‘typestable’ keyword on a function to declare the intent that the written code is meant to be type stable.
- If the compiler meets an instance of this function (with actual argument types), it should try to infer the actual output type. If there is more than one possible output type (or the set of possible output types is larger than what the compiler sees fit to handle through multiple dispatch), an error should be raised indicating the problem.

Having the compiler infer the types case by case is in line with what @code_warntype already accomplishes and, hence, should be quite straightforward to realize. The benefit would then be that (1) the code is not used in a way that was not intended by the programmer and (2) the ability to do ahead-of-time compilation can be guaranteed for statically typed calls to typestable functions.

N.B., as already mentioned above, I think the Test.@inferred macro does something fairly different, as it actually calls the function with given runtime arguments to determine and compare the output type.

1 Like

You’re right that this is dynamic typing, but you’re wrong about it preventing inference of return types. A function call triggers compilation if no compiled method exists, and the types of the “runtime variables” are used for inference and compilation. For example, a method f(x) = g(x) has no annotated types. Runtime values and a function call trigger compilation: y = 3; f(y). To put it one way, the type of the runtime variable y is known during the compilation of f(y). It sounds unusual that compilation happens in the middle of runtime, but that’s why it’s called JIT compilation instead of AOT compilation (though Julia’s JIT also doesn’t work like many other JITs).
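
A minimal sketch of that behaviour (the timing difference is the JIT at work):

g(x) = x + 1
f(x) = g(x)

y = 3
@time f(y)  # first call: time includes JIT compilation of f(::Int64) and g(::Int64)
@time f(y)  # second call: the compiled code is simply reused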

I think you’ve confused the term with “dynamic dispatch”, which is also called “runtime dispatch.” This is when the type of a variable inside a method body couldn’t be inferred, so all inner function calls that take that variable as an argument cannot be dispatched and compiled during the outer method’s compilation. So dynamic dispatch doesn’t cause failed type inference, it’s the other way around. While it is technically possible to compile in advance everything that might be runtime dispatched, it’s almost impossible to guarantee in practice, hence the need to keep the JIT compiler around.

Since you’re intending this for AOT-compiled, small, fixed executables that lack a JIT compiler, why even annotate functions one by one? Shouldn’t ALL function calls in the program be inferable and free of the dynamic dispatch that may trigger the JIT compiler? It just sounds like you need an executable-compiling method that checks a precompile script for 0 runtime dispatches.
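
For what it’s worth, JET.jl (mentioned above) can already report runtime dispatch for a given call; a sketch, assuming JET is installed and dyn is a made-up example:

using JET

function dyn()
    v = Any[1, 2.0]
    return v[1] + v[2]  # operand types are unknown at compile time
end

@report_opt dyn()  # reports the runtime dispatch in the addition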

Good catch; I suppose you wouldn’t want to actually run a potentially long function. Instead, the inferred return type could just be checked for concreteness. Also, I should reiterate that only checking the return type of an explicit function call isn’t good enough. For example, I could write a method with almost no inferable variables (a very red @code_warntype, runtime dispatches everywhere) but then convert to a specific return type at the end. Even an all-blue @code_warntype isn’t good enough: some of those inner function calls could be such convert methods. So I suppose every compiled method has to be checked, whether it’s an explicit function call like f(y) or an inner one like g(x).
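
A hypothetical sneaky method of that kind, using a type assertion as the final ‘convert’ step:

function sneaky(x)
    v = Any[x, x + 1]  # element type Any: the inner calls are runtime-dispatched
    s = v[1] + v[2]    # s is inferred as Any (red in @code_warntype)
    return s::Int      # ...but the assertion pins the return type to Int64
end

# Test.@inferred sneaky(1) passes even though the body is full of dynamic dispatch.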

See GitHub - brenhinkeller/StaticTools.jl: Enabling StaticCompiler.jl-based compilation of (some) Julia code to standalone native binaries by avoiding GC allocations and llvmcall-ing all the things!

When I speak of inference in this context, I always mean inferability from the input types (and not from the runtime values of the variables); sorry if I was not precise enough. I understand that Julia is a JIT-capable language which also uses this feature extensively. This is totally fine, e.g., in the REPL context, as this is what the user expects there. However, what I want to emphasize here is that Julia being JIT-capable does not mean it has to be AOT-incapable. Your example would work perfectly well with AOT as long as the type of y is known at compile time (e.g., when defined in a julia_main function) and the function g(x) is typestable. If the function g(x) does something like:

g(x) = Foo{x}(0)

then g(x) is obviously not typestable, meaning the return type cannot be inferred from knowledge of the input type alone (it actually depends on the value). So I think we have to distinguish between inference based on types and inference based on values. Both cases are obviously resolvable at runtime, but the latter is pathological for AOT compilation efforts (in the REPL application it just cripples your performance). You see that, even in your simple example, one cannot a priori assume that f(x) is suitable for AOT compilation. However, if we had declared the interface of g(x) as, e.g., typestable g(x), the user could rely on the fact that AOT compilation is possible without any further knowledge of the content of g(x).
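
A runnable version of this, with a hypothetical parametric type Box standing in for Foo:

struct Box{T}
    val::Int
end

g(x) = Box{x}(0)  # the type parameter is the *value* of x

typeof(g(1))  # Box{1}
typeof(g(2))  # Box{2} - same input type Int, different output types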

Yes, obviously all functions should lack dynamic dispatch if AOT compilation is desired. However, I think that, in the sense of encapsulation and the black-box principle, it would be very helpful if a function had an attribute in its interface which promises exactly that (just like ‘typestable’). In my view this would help the compiler (as it can make assumptions), the programmer (as it may detect erroneous code) and the user (as they can rely on something). It is just like combining a passive effect analysis (of which you were speaking) with an active intent (or statement) of the programmer to ensure more coding safety.

1 Like

I think there’s a small part missing here that I hope I can fill.

To start with, “type stable” is never purely a property of a function alone. You always need to know at least the types of its arguments to determine whether the return type of a function depends on the types of the input arguments alone or needs some more information. Take this example:

struct Foo
   vals::Array{Int}   # non-concrete field type: the dimension parameter is missing
end

struct Bar
   vals::Vector{Int}  # concrete field type: Vector{Int} === Array{Int, 1}
end

function foo(v)
   v.vals
end

This function is type stable if we put in a value of type Bar, but not if we put in a value of type Foo. So adding any kind of inferable mark on the function alone is not sufficient - it depends on the type of the object we put in as well. A user can’t do anything with such an annotation, and it can’t help the compiler, because what the user decides to put into foo determines whether foo is type stable or not. Type stability is not an inherent property of functions alone.

AOT compilation is just an extension of that concept. By propagating type information from the top level down as far as we can, we can (possibly) compile all the code we need before we actually run our first call. That’s useful, of course, but it already happens without any additional function-level annotations.
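
For instance, Base’s precompile does exactly this for one concrete signature (reusing foo and Bar from above):

# Compile foo for the concrete argument type Bar ahead of the first call:
precompile(foo, (Bar,))  # returns true if inference and compilation succeeded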

3 Likes

There’s no difference. Do you think that “input types” means the input type annotations? Type annotations in method definitions are just restrictions on the input types of method calls; they’re not the same thing. Maybe it’ll help if I comment on a code example:

#     input type annotation
#     V
h(x::Integer) = 2*x

# input with type Int
# V
h(3) # triggers compilation of h(::Int)
     #, including type inference

The h(3) could be in a precompile script, just specifying the function h and the input type Int.

Actually, if you made a seemingly cosmetic change g(x::T) where T = Foo{x}(0), calls like g(Int) and g(Float64) would be type-stable and compile separately; this is a quirk of types as inputs. Incidentally, this also seems to be an example where @code_warntype overestimates how type-stable these calls are: even without the change, g(Int) is all blue. Maybe this type instability is optimized in some other way? In any case, I used MethodAnalysis.methodinstances to see when changing the input compiles separately.
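
A sketch of that kind of check, assuming MethodAnalysis.jl is installed (h2 is a made-up function):

using MethodAnalysis

h2(x) = 2x
h2(1); h2(1.0)

methodinstances(h2)  # should list two specializations: h2(::Int64) and h2(::Float64)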

That’s correct for any method definition. It’s possible to precompile a method call f(3) without having to run it, though my understanding is that this has yet to be implemented because the compiler needs to be improved in some ways before it would be worth it. You wouldn’t need the runtime values either; you would just need the method definition of f and the types (typeof(f), Int). This would effectively be AOT compilation.

Except you said this would throw an error when encountering uninferred types during compilation; that’s the exact opposite of relying on inferability. Besides, it doesn’t allow us to AOT-compile anything, because we don’t have input types. And if we provide input types so that AOT compilation is possible, we don’t need the typestable tag at all; we’d just attempt the AOT compilation and throw an error if we reach uninferred variables.

In my interpretation of type stability, your example function is perfectly type stable: you put a different type in, you get a different type out; you put the same type in, you get the same type out - the output type is inferable from the input type. The compiler should be perfectly capable of doing AOT compilation if the input type is known at compile time. Whether the output type is still abstract or not is another question (note that you put an abstract type in and, thus, you get an abstract type out).

1 Like

Run the following after Sukera’s example.

x = Foo(zeros(1))     # vals is converted to a Vector{Int}
y = Foo(zeros(1, 2))  # vals is converted to a Matrix{Int}
typeof(x), typeof(y)            # (Foo, Foo): the same concrete input type
typeof(foo(x)), typeof(foo(y))  # (Vector{Int64}, Matrix{Int64}): different outputs
@code_warntype foo(x)
@code_warntype foo(y)

The abstract types are the annotations, which are just restrictions. No input/runtime value has an abstract type.

1 Like

No, I do not mean that. When you run the code, the compiler will figure out actual types for all generic variables you define, apart from any type annotations or constraints (e.g. that 3 is an Int64 in our example) - these compiler-known types are what I mean when I speak of ‘input types’. Usually, in top-level functions (or ‘programs’) all types are fixed at compile time (e.g. in a julia_main function), and they are passed down to callees as ‘input types’. If the callee is type stable (meaning that the types of all variables and of the return value of the callee can be inferred from the input types), AOT compilation of this call should be possible.

1 Like

These are not the same thing, and only your last comment (“the output type is inferable from the input type”) comes close to what “type stable” really means. A function is type stable exactly if the output type is uniquely determined by the input type(s). If you get a Union, an abstract type or some other non-concrete type out, your output type is no longer uniquely determined, because it can be any number of concrete types, i.e. types an actual object can have.

I do not put an abstract type in - Foo is a perfectly fine concrete type:

julia> struct Foo
         vals::Array{Int}
       end

julia> isconcretetype(Foo)
true

It just happens to have an abstractly typed field (well, more accurately, a non-concretely typed field - Array{Int} is not an abstract type after all), which means accessing this field will lead to a type instability in any function that performs such an access. This depends on us actually passing in an object of type Foo and is not inherent to the foo function. That the object actually stored in Foo has a different concrete type is irrelevant - type inference and the compiler have no knowledge of that true type, since it is only known at runtime, and they can only work with information available at compile time.
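
To see the distinction (Vector{Int} is the concrete one-dimensional case):

julia> isconcretetype(Array{Int})  # Array{Int64, N} where N - dimension unknown
false

julia> isabstracttype(Array{Int})  # not abstract either, just not concrete
false

julia> isconcretetype(Vector{Int})  # Vector{Int} === Array{Int, 1}
true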

2 Likes

I think it’s worth emphasizing that the “abstractly typed field” refers to the annotation in the definition. If you check typeof(y.vals), you get a concrete type. So a good question would be: if compilation happens at foo(y) after y already has a value, why isn’t the concrete type of y.vals inferred in foo(y)?

The answer: x.vals and y.vals have different concrete types, but x and y are both still Foo instances. The compiled method foo(::Foo) must work for both foo(x) and foo(y), so .vals cannot be inferred more specifically. Types during compilation come from runtime values, but it is the types of the runtime values that restrict what type information makes it into compile time.

1 Like