Version 1.0 of the Nim Programming Language Released

My understanding is that the key problem is that, in general, proving something about types is difficult or infeasible, so for a large class of valid programs even a very sophisticated compiler just can’t figure it out. E.g., for

function fermat(a::Integer, b::Integer, c::Integer)
    # let's pretend there is no overflow
    if a > 0 && b > 0 && c > 0 && (a^3 + b^3 == c^3)
        "a fish"
    else
        42
    end
end

the result type is Int (by Fermat’s Last Theorem the condition can never hold, so the String branch is unreachable), but I am not sure very many compilers would figure this out (yes, this is a contrived example — but type inference can fail very easily in practice, too).

When this happens, traditionally “static” languages just consider your program invalid (what cannot be proven correct is considered incorrect). Instead, Julia will just try to work with it.

In general, it is very difficult to prove in advance that some code in Julia is incorrect, because new methods, functions, and global variables can be defined at any point between the definition and the AOT compilation.

4 Likes

I see your point. In this case, I’m not against the compiler inferring Union{Int64,String}.

However, if the output of this function is sent to a function that only takes Int, I’d still prefer that the compiler error out, rather than allowing obviously invalid code to execute until it crashes.

It’s just a matter of where you place value. I can well believe that researchers want more leniency in running unprovable programs. However, many of us have an interest in running Julia in scenarios where we’re willing to trade a little of the flexibility for reliability. I’d be willing to have your example fail if the compiler would prevent stuff like this:

julia> print_int(x::Int) = println(x)
print_int (generic function with 1 method)

julia> foreach(print_int, [1, 2, "oops!"])
1
2
ERROR: MethodError: no method matching print_int(::String)
Closest candidates are:
  print_int(::Int64) at REPL[2]:1
Stacktrace:
 [1] foreach(::typeof(print_int), ::Array{Any,1}) at ./abstractarray.jl:1920
 [2] top-level scope at REPL[3]:1

It may be that Julia is simply not the right language for scenarios where you want this kind of code not to compile, but it would be a shame, since Julia has so many other pleasant features that would make it useful in production environments where more aggressive type checking would be beneficial. It seems like something that could potentially be switched on and off with compiler flags.

1 Like

I am kind of confused about when you would want that to happen — since Julia compiles code just ahead of its first run, is there a real difference between “runtime” and “compilation time” errors?

For a concrete example, consider (from a fresh REPL with nothing predefined):

f(x) = "a fish"
g(x) = sqrt(f(x)) # A

Do you want the compiler to error at A? The problem with that is that you are free to specialize or overwrite f at any point before you call g.

Or wait for

g(1) # B

? But then it is not really different from a runtime error.
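To make that concrete, a later definition can make A retroactively fine (a sketch):

f(x) = "a fish"
g(x) = sqrt(f(x))        # A: an eager checker would reject this
f(x::Number) = float(x)  # a specialization added later
g(1)                     # 1.0, no error anywhere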

(If I am missing something, please clarify with a concrete example if possible).

2 Likes

I’m more thinking about these future static compilers for Julia that some of us are dreaming of. I agree that this is borderline nonsense in an interactive context.

I assume it would be possible to also do some type verification on interpreted Julia programs before they begin to run, but I think that should definitely be optional.

2 Likes

Thanks for the very nice explanation and example!

I also agree with the “explicit return type by default” policy, as there are cases where it may be hard for the compiler to figure out the correct return type.

One example that turned out to be very tricky in Julia data analysis is row iteration of fully typed “large” tables. A “columnar dataset” (a collection of columns, each with its own type) can be parametrized to include the vector type of each column. Indexing (i.e. taking a row, a struct with a field per column) is type stable, but the compiler starts having difficulties working with these rows as the table gets large (many columns). I suspect this is one case where an auto return type is a dangerous thing to do, as the width of the table may cause inference to fail. It is hard to predict exactly which tables fail and which work: I imagine it is an implementation detail of the compiler.
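A minimal sketch of the kind of table I mean (hypothetical types, not any particular package’s API):

# one type parameter capturing the vector type of every column
struct ColTable{C<:NamedTuple}
    columns::C
end

# taking a row is type stable in principle, but inference may give up
# once the NamedTuple has very many fields
getrow(t::ColTable, i) = map(col -> col[i], t.columns)

t = ColTable((a = [1, 2, 3], b = ["x", "y", "z"]))
getrow(t, 2)  # (a = 2, b = "y")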

2 Likes

For the fermat example you can write a macro that checks that the inferred type is as specified. This will throw an error when you try to declare the function:

julia> @checked O -> O <: Integer function fermat(a::Integer, b::Integer, c::Integer)
           # let's pretend there is no overflow
           if a > 0 && b > 0 && c > 0 && (a^3 + b^3 == c^3)
               "a fish"
           else
               42
           end
       end
FAILURE: fermat(Integer,Integer,Integer) → Union{Int64, String}

This kind of approach doesn’t scale well, though (checking all subtype combinations quickly becomes prohibitive), and it doesn’t work for types defined in the future, but it could still be useful for simple methods.
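The core of such a check can be done with Base.return_types, which runs inference without executing the function; a minimal sketch, with the signature written out by hand:

inferred = first(Base.return_types(fermat, (Integer, Integer, Integer)))
inferred <: Integer ||
    error("FAILURE: fermat(Integer,Integer,Integer) → $inferred")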

1 Like

Isn’t this what we have

function fermat(a::Integer, b::Integer, c::Integer)::Integer
    ...

for?

I agree that it doesn’t scale well to write code like this, but generally you only need it in a few places.

That just does a type assertion on the return value at run time, so it will fail only upon actually hitting a solution, while the macro above checks the inferred return type when you declare the function (the function is never run, only type inference is). It’s more like automatic testing of the inferred type.
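Concretely (a sketch): the annotated version runs fine on any input that stays in the Int branch, and the assert would only fire if the String branch were ever taken:

function fermat(a::Integer, b::Integer, c::Integer)::Integer
    a > 0 && b > 0 && c > 0 && a^3 + b^3 == c^3 ? "a fish" : 42
end

fermat(1, 2, 3)  # 42, no error: the String branch was never hit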

1 Like

Given the clever compiler we have, it’s more than that: by ruling out the String branch for return values, it effectively allows type inference to recover.
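You can check this by querying inference on the annotated version (a sketch; the exact result may depend on the Julia version):

Base.return_types(fermat, (Int, Int, Int))  # now [Int64], not Union{Int64, String}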

Sorry, I missed where this macro is defined, so I could not read the source.

However, the type of [1, 2, "oops!"] is Array{Any,1}, so type checking this is not possible (in any language).
If we change it to Union{Int64,String}, the problem still exists, because this code is still legal in Julia: if the array is full of Int64 values, it will run fine.
And the most serious problem is that although you don’t define print_int on String, you might define it later, so this program is still legal until the moment you finally decide to run it (and encounter a String).
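For example, continuing the earlier REPL session:

julia> print_int(x::String) = println(x)  # defined later
print_int (generic function with 2 methods)

julia> foreach(print_int, [1, 2, "oops!"])  # the same call now succeeds
1
2
oops!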
Because a function is a collection of methods, even if we invoke the same function multiple times, we might use different methods and get different results. I think this is actually forbidden in Haskell (unless we use higher-rank types):

constOne :: forall a. a -> Int
constOne a = 1

-- rejected by ordinary Haskell: a is chosen by the caller, so the body
-- of f cannot assume it is numeric and apply g to both 1 and 1.0
f :: (a -> Int) -> Int
f g = g 1 + g 1.0

The code for f above will not work in ordinary Haskell. We might somehow regard every function in Julia as a function with a “higher-rank type”, which is not very friendly for a compiler doing type inference.

Moreover, Union{Int64,String} is not the same as the sum types in static languages. For example, in Haskell or C you have to wrap this in a union or a new data type (while in Julia we can’t use Union{String,Int} to materialize a sum type, we can wrap it in a struct). For example:

struct Either{A,B}
    e::Union{A,B}
end

a = Either{Int64,String}.([1, 2, "oops!"])

function printInt64OrString(a::Either{Int64,String})
    if isa(a.e, Int64)
        println(a.e)
    else
        error("encountered a String")  # pattern matching not exhaustive
    end
end

But I’m not sure this is what you want. I think wrapping things in a struct and then forcing the function to accept this sum type works just like in C and Haskell (but still, since Julia has no built-in pattern matching, I’m not sure how we can statically check whether the matching is exhaustive). Using some macros might make things simpler.
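One macro-free alternative (a sketch with hypothetical names): unwrap once and dispatch, so an unhandled case surfaces as a MethodError instead of a hand-written error branch (though this is still a runtime check, not a static one):

handle(x::Either) = handle_value(x.e)
handle_value(v::Int64) = println(v)
# no handle_value(::String) method: the String case is simply unmatched,
# and hitting it raises a MethodError at run time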

4 Likes

Yes, I get that. I’m thinking about a statically compiled context where you’re not going to have runtime eval. If you’re going to have a static compiler, you’re already in a situation where all methods must be known at compile time.

I’ve been thinking a lot about this recently for a toy language I’ve been hacking on (again, in a statically compiled context). It seems like a static compiler should be able to infer which cases a method handles (up to a certain point) and construct the wrapper implicitly. For non-exhaustive matching, an error would be raised at compile time if an unhandled type was used. Of course, I don’t really know what the limitations of Julia’s type inference are, so I don’t know how feasible that would be in Julia’s case. I understand that it’s more or less impossible with Julia’s current compilation strategy, and even that it’s undesirable for the majority of Julia’s workloads.

I just think it’s something to keep in mind as efforts begin to pick up around the static compilation work. Having a compiler that can construct union wrappers and do pattern matching under the covers would presumably also help cut down on size requirements of the bundled runtime.

I am not sure that is a feasible design goal with Julia. With commonly used idioms, chances are that moderately complex code that takes outside input will have to compile some things on demand, unless you generate an excessive number of type combinations ahead of time, of which you may end up not using 99%.

In other words, minimizing compilation may be a feasible option for some code, but totally eliminating it is orders of magnitude more difficult, if feasible.

I mean, this is part of the approach they’ve been talking about using for years.

https://juliacomputing.com/blog/2016/02/09/static-julia.html

The plan seems very much to be to place restrictions on which kinds of Julia code can and cannot be statically compiled, at least for people who are interested in creating distributable binaries.

see also:

I am under the impression that, since that blog post, emphasis has shifted from precompiling a restricted subset of Julia with no “escape hatch” (and thus not including the compiler machinery) to precompiling a lot of things (but possibly not everything) simply for startup speed, and/or making the compiler faster so that the cost is not that large. And the latter is still a very hard problem.

1 Like

From conversations on Slack, I have the impression some people are still working on reasonably sized, distributable binaries, but I admit that I’m not sure how much momentum these efforts have.

1 Like

Rich Hickey has an excellent analysis of sum vs union types in this talk:

I think the most damning point he makes about sum types is this: if you start with a function that accepts values of type A, and you then generalize it so that it can also accept values of type B, with sum types you have to change every caller, because the function no longer accepts plain values of type A or B; it accepts wrapped values of those types. With a union, this kind of generalization does not require changing any caller: if the function worked with A values before, it keeps working, so there’s no need to change anything. He makes the point much better than I do, though.
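In Julia terms (a sketch): widening a signature with a Union leaves every existing call site untouched:

describe(x::Union{Int,String}) = println(x)  # was describe(x::Int)
describe(1)     # existing callers keep working, no wrapper needed
describe("hi")  # newly accepted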

5 Likes

That’s called the expression problem in the literature.

The gist is that

  • if you use ADTs:
    • Adding new functions is easy, you just need to have a switch or pattern matching on all existing types.
    • Adding a new type is hard, you need to go and edit all existing functions.
  • if you use inheritance and methods:
    • Adding a new type is easy, create it, implement all existing interfaces.
    • Adding a new function is hard, you need to go and edit all existing types.

For both you have an escape hatch with “NotImplementedError”, but the hard part is that you have to make lots of non-local changes instead of self-contained ones in the easy case (see the sketch below).
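Both sides are easy to see in Julia (a sketch):

struct Circle; r::Float64; end
struct Square; side::Float64; end

# ADT/switch style: adding a Triangle type means editing this one function
area(s) = s isa Circle ? pi * s.r^2 :
          s isa Square ? s.side^2 :
          error("NotImplementedError")

# method style: adding a Triangle type is a local, self-contained method,
# but adding a new operation (say, perimeter) needs one method per existing type
area2(c::Circle) = pi * c.r^2
area2(s::Square) = s.side^2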

For those interested in the expression problem in a high-performance computing and domain-specific language context, I have gathered plenty of approaches (from Scala, OCaml, Clojure, C++, Rust, Python, Haskell and PL theory) in preparation for the design of my deep learning compiler: https://github.com/mratsim/Arraymancer/issues/347#issuecomment-461009747

I’m familiar:

For what it’s worth, I think that the problems with sum types are really missing the crux of the expression problem. If they were the crux, then using unions instead would solve the expression problem, which it does not.

9 Likes

In the past I’ve been gently misled by functions like findfirst(predicate, A) which return an array index, or nothing if the predicate didn’t match.

The conceptual mistake on my part was to consider only the happy path, in which a matching element is found. This mistake felt encouraged by the return type, since trying simple cases at the REPL returned a plain index type. If the return value were some form of Some, it would strongly indicate that the None case must also be handled.
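A concrete version of the trap (a sketch):

xs = [1, 3, 5]
i = findfirst(iseven, xs)  # no even element, so i === nothing
xs[i]                      # throws only here, far from the real mistake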

I wonder what you think about cases like this?

1 Like

Or, possibly, not reading the docs. :wink:

There is always a trade-off between safeguards and convenience, and both solutions have their advantages.

A wrapper like Some would be clearly preferable only when the sentinel value cannot be ruled out as a valid result — at this point it is used sparingly in Base.
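For instance, Some earns its keep when nothing itself can be a legitimate stored value (a sketch using Base’s Some and something):

d = Dict("a" => 1, "b" => nothing)
get(d, "b", nothing)  # nothing: is that the stored value or a missing key?
get(d, "c", nothing)  # also nothing, indistinguishable from the above
# a Some-wrapping lookup could return Some(nothing) vs nothing instead;
something(Some(nothing))  # nothing, with the unwrapping made explicit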