Compiler work priorities

IMHO the compilation time can be annoying sometimes, but it’s nothing you can’t get used to (I worked with MATLAB + CMEX, which crashed MATLAB more often than I’d like given my coding skills, making me wait through the 40 s startup time…). However, compared to the workflow I had with MATLAB, I really miss a debugger. Very complex algorithms like this one would have been implemented and tested much faster if Julia had even a primitive kind of debugger.

5 Likes

A few common reasons:

  1. I often change struct definitions.
  2. I modify functions while making them more general in the types they accept. Then the dispatcher still favors the old method when I want to use the new one.
  3. Modifying packages with lots of generated functions or @eval function definitions (see the sketch after this list). I know you made some strong points against generated functions earlier, so maybe this is me being lazy – it’s often easier to write a generated function than to bet on the compiler optimizing some cruft away (which it normally does well). Elegant solutions take more creativity.
  4. A general concern that things have slipped through the cracks, so that even if everything is working now, something will be broken when I restart. Hence I restart to (hopefully) confirm that this isn’t the case.
  5. Lack of awareness of Revise’s latest capabilities.

Some of these would also be resolved if I just wrote unit tests. Which I really should do.
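Here is a toy illustration of the generated-function pattern from item 3 (a sketch of the general idea, not code from any particular package): explicitly unrolling a computation over a tuple at compile time rather than betting on the optimizer to do it for an ordinary loop.

```julia
# Toy example: a @generated function that unrolls summation over a tuple's
# elements at compile time, instead of trusting the compiler to optimize a
# plain loop. The function body builds and returns an expression.
@generated function unrolled_sum(t::NTuple{N,T}) where {N,T}
    ex = :(zero($T))
    for i in 1:N
        ex = :($ex + t[$i])
    end
    return ex            # e.g. for N = 3: zero(T) + t[1] + t[2] + t[3]
end

unrolled_sum((1, 2, 3, 4))   # 10
```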

11 Likes

Revise has helped me a lot when developing a larger Gtk application which requires roughly 2-3 minutes to precompile and start.

  • Regarding Revise, I have the feeling that Julia (or the REPL) gets pretty slow when Revise is active.

  • While Revise is nice for me during development, it is really annoying that the users of my Gtk application need to wait for 20 seconds when they hit a button for the first time. If there were a way to “freeze” the current state and ship that to my users, that would be awesome.

1 Like

Restarts: segfaults, interrupting long-running processes that don’t exit gracefully, and type redefinitions.

I’ve developed a type-heavy coding style for model formulation modularity. It seems like the logical conclusion of having multiple dispatch. But it means Revise doesn’t give me whole-day sessions, because altering types happens all the time.

3 Likes

That’s pretty much what Julia does itself for the REPL and Pkg to be fast (running code while the sysimg is being created). PackageCompiler.jl aims to do the same for user code.

I know, but I have not gotten PackageCompiler to work. Making PackageCompiler work reliably and integrating it into base Julia would IMHO be a very good path forward. What I am imagining is a Pkg command where you can “add” a package to a global cache file, which is then recompiled whenever the package manager recognizes that it needs to be.
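For reference, here is roughly what that looks like today with PackageCompiler.jl itself (a sketch only; MyGtkApp and the file names are made up, and the exact keywords depend on the PackageCompiler version): bake the package into a custom system image so its compiled code is available at startup.

```julia
# Sketch: build a custom system image containing a package's compiled code.
# `MyGtkApp`, the sysimage name, and the precompile script are placeholders.
using PackageCompiler

create_sysimage(
    [:MyGtkApp];
    sysimage_path = "mygtkapp_sysimage.so",
    # Optional: a script that exercises the app so its calls get precompiled.
    precompile_execution_file = "precompile_script.jl",
)

# Then start Julia against the custom image:
#   julia --sysimage mygtkapp_sysimage.so
```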

3 Likes

I completely missed Revise.jl and installed it just today. I’ve only been using it for a few hours, but I am already in love! Thank you so much for this gem!

2 Likes

My primary reason is struct changes. That happens quite often when building something new, while the structs are still evolving. I initially worked around it by using a Dict or a named tuple instead of a struct, but that doesn’t work too well, and eventually I have to do it right.
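A minimal sketch of that workaround (the field names are made up): a named tuple can be rebuilt with different fields at any point in the session, whereas a struct’s fields are fixed once the definition has run.

```julia
# While the design is in flux: a NamedTuple stands in for the struct and can be
# "redefined" freely by constructing it with different fields.
make_params(n) = (weights = zeros(n), bias = 0.0)            # first draft
make_params(n) = (weights = zeros(n), bias = 0.0, reg = 0.1) # revised, no restart needed

# Doing it "right" later: changing these fields afterwards needs a fresh session.
struct Params
    weights::Vector{Float64}
    bias::Float64
    reg::Float64
end
```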

Other than that, Revise is the greatest thing that I’ve ever used in any programming environment! Thanks, Tim, for your excellent contribution!

11 Likes

I also keep my session open for a week or so, but when I restart I often find bugs, since I have all kinds of stuff defined in global scope.

For struct changes I just rename them (search/replace): MyStruct → MyStruct2. Once you are at version 8 and your design is final, you can rename it back to MyStruct and restart.
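A toy illustration of the rename trick (Point is a made-up example): each design iteration gets a new name, so the new definition never collides with the one already loaded in the session.

```julia
struct Point             # first attempt
    x::Float64
    y::Float64
end

struct Point2            # fields changed: rename instead of restarting
    x::Float64
    y::Float64
    label::String
end
# Search/replace uses of Point with Point2 and keep going; once the design
# settles, rename back to Point and restart once.
```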

5 Likes

One thing that’s hinted at in this conversation but which should perhaps be stated explicitly is this: many compilation issues for Julia are not just a matter of implementing standard, well-known solutions—in a lot of ways we’re in a brave new world and finding solutions to compilation issues, given the ways that Julia works, is to a significant extent a series of research projects. A lot of it has to do with inventing and discovering usage patterns that work well and how compilation technologies fit into that. The new way that code loading and environments work seems to be pushing us towards the project environment as a compilation unit, but keep in mind that all of that stuff was brand new as of Julia 1.0, so we’re still learning and adapting.

26 Likes

One thing to maybe think about is that it might be possible to improve the latency-related user experience without actually speeding up the compiler – if a program can start doing stuff before its heavier imported modules are compiled (as long as it’s not yet using them, of course). That’s currently my use-case, FWIW (text-based interactive thing that uses JuMP – but only after the first user input).

I guess one could do stuff like that already, but some canonical mechanism, or even just some targeted documentation, might be helpful.
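For what it’s worth, one hacky way to do this today might look like the sketch below (solve_with_jump and the prompt are made up, and I am not sure how robust loading a package from a background task is in general): kick off the heavy using in an @async task and only wait on it when the package is actually needed.

```julia
# Sketch: start interacting with the user immediately and load the heavy
# dependency (JuMP here) in a background task, blocking only when it is
# actually needed. `solve_with_jump` is a hypothetical function that uses JuMP.
function main()
    loading = @async @eval using JuMP    # load/compile while we wait for input
    print("query> ")
    query = readline()                   # yields, so the background load can proceed
    wait(loading)                        # block only if JuMP still isn't ready
    # invokelatest, because JuMP's code was loaded after main() started running
    Base.invokelatest(solve_with_jump, query)
end
```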

1 Like

In that line of thinking, I hope we see some work done to allow compilation to be done asynchronously in another Task, so that already-compiled (or interpreted) code can run in parallel.

I may give this a try once PARTR is merged and libuv is properly thread-safe.

5 Likes

You should consider a different coding style, with small functions and unit testing. For example, all the derivatives should have their own functions, which can be tested separately. At least in our development work, the derivatives are an easy place to make mistakes. Check Chris Rackauckas’ YouTube video for an idea of this way of working.
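For instance (illustrative only, with made-up functions), once the derivative lives in its own small function it is trivial to check against a finite difference:

```julia
using Test

f(x)  = x^3 - 2x
df(x) = 3x^2 - 2      # hand-written derivative: exactly where mistakes creep in

@testset "df matches a finite-difference check" begin
    h = 1e-6
    for x in (-1.3, 0.0, 0.7, 2.5)
        @test df(x) ≈ (f(x + h) - f(x - h)) / (2h) atol = 1e-4
    end
end
```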

4 Likes

That is an interesting comment. I think it is totally clear that the solution is a lot of work and requires a very deep technical understanding. But is there (from a CS point of view) really a research question in there? Or hasn’t Java, for example, basically already developed solutions for minimizing latency in a JIT world? Don’t read that as a suggestion to adopt solutions from the Java world; I am just curious how much research is involved.

2 Likes

Some of it applies. We should probably work towards a tiered JIT system where code first executes in a slow but fast-to-generate form and upgrades to faster versions when available.

Many languages, including Java, have been designed with support for separate compilation of modules as a hard design requirement. We explicitly disregarded that when designing Julia because it seemed like too much of a stifling limitation. But that makes it hard to figure out how to generate cacheable units of compiled code. Not coincidentally, C++ code with heavy use of templates has a similar issue: it has many of the same benefits as Julia for similar reasons. The main difference is that C++ compile time comes earlier. Figuring out this bit is one of the big challenges.

3 Likes

I’m extremely interested in the potential solutions to the latency issue, and I find all these informed opinions really valuable. So @StefanKarpinski, from your vantage point on the compiler internals, how hard do you think this tiered JIT functionality might be to implement? Are we talking about a major 3.0-style redesign, or is the current compiler design extensible enough to integrate such functionality seamlessly (given enough man hours)?

Can you give some hints where this limitation is realized?

Composing generic code with specific types and generating fully optimized machine code.

2 Likes

Should be possible but it’s a fair amount of work.

I imagine it wouldn’t be breaking or necessarily 3.0-worthy (but probably on the 2.0 timeline), since we can already run Julia in fully interpreted mode with --compile=no. From my perspective, we “just” need to implement the heuristics and mechanisms for switching between interpretation and the different optimization levels, and then remove the various hard-coded assumptions that the optimization level can’t change at runtime.

2 Likes