Compiler work priorities


#1

I thought I might write a little post about the rough priorities of the compiler team:

  1. Correctness
    • finding and fixing compiler and inference bugs
  2. Multithreading
  3. Compile-time latency, aka “the time-to-first-plot problem”
    • making compilation faster
    • caching more things
  4. Compiler-related packages and tools

Julia 1.0 introduced a new compiler and there were inevitably various bugs and issues, so the first order of business is fixing those.

Multithreading is the next highest priority because we want to merge the new parallel runtime as soon as possible and allow people to start writing and using threaded package code before things get too far along. Ecosystems for single-threaded languages tend to start to bake that assumption in very deep if they’re around for too long. Julia’s package ecosystem is small enough and young enough that if we introduce proper multithreading soon, it will adapt quickly.

Compile-time latency is clearly a huge pain point for all Julia users, so that’s quite important.


#2

Thank you for the info!
I’m so happy that Cxx.jl made it to the list!


#3

I am already excited for the next JuliaCon’s 1 second first plot live demo! No pressure!


#4

It is clear that this would be nice, but not as revolutionary as efficient nested parallelism: if that is really achieved, it would make Julia an unrivaled tool for HPC scientific computing.

In my domain, the ability to provide compiled scientific libraries written in Julia that are easy to use from large simulation applications written in other languages would also make a great difference, because early adopters could start to interact with others and demonstrate Julia’s power and productivity.


#5

It may be part of the compile-time latency work (“caching more things” and “PackageCompiler”), but I would agree that “getting Julia to the point where it generates great .so files that can be used by other languages” would likely have a positive impact on Julia adoption.


#6

In addition to what Stefan said, the way to accomplish more compiler projects is:

  1. If you know of funding opportunities, please let me know and it will help us at Julia Computing grow our compiler team and execute more compiler projects.
  2. The Julia Lab at MIT is also another place where this could be a research topic for the right person.
  3. Influence tech companies to contribute engineering manpower if they can’t contribute money. Many of the large tech firms make substantial open-source contributions. The more vocal Julia users are, the more likely we are to get contributions.
  4. Discuss with people at National Labs. The more they adopt Julia, the more we will get contributions and capabilities from that community.

Personally, I work on all of the options above. I find that there are Julia users in every university, company, national lab, government agency, etc., but we need to be more vocal about asking for help, whether money or time.

-viral


#7

Thanks for laying it out. One thing I would like to ask, though, is: how can I help? I don’t have the time to really be a core contributor there (my efforts are probably best kept concentrated in DiffEq), but there’s got to be something I can do. Fixing compile times is quite important to me, so I would like to offer myself as an extra set of hands who knows packages that take a long time to compile. I opened Understanding the compile times of DifferentialEquations.jl and Attempting to Help to get feedback but haven’t gotten any. Hopefully there’s a way to decrease the burden on the few compiler workers in order to better distribute the work and get it done!


#8

Not sure whether this is a valid way of accomplishing more compiler projects, but I find the barrier to entry for the compiler somewhat higher than for Base.

This is because the data structures and source for Base are pretty apparent and well documented, and you can always play around in the REPL. In contrast, the compiler is harder to hack on for relative newcomers.

Are there any guides for getting into that? Guides for setting up a separate compile chain (at least the Julia parts) for hacking, e.g. using Revise or ExtraCompiler, such that I can make modifications that don’t break my session (beginning with insertion of prints, until the debugger improves)? Descriptions of internal data structures?

Is there a sensible way of helping with that (given the constraint that I currently don’t get how everything fits together)?

Re caching more things: if I finally pass the barrier to entry for compiler hacking, something I’d like to try is creating a persistent (cross-session) cache. The idea would be that each compiled entity gets a collision-resistant hash of its inferred code, and we have a Merkle-tree-style DAG where hash(A) incorporates hash(B) if A depends on B, i.e. if there is a backedge B->A. “Compiled entity” means a component of the dependency graph (directed cycles would need to be collapsed). That way, much of LLVM’s work and possibly some earlier optimizer passes could theoretically be cached (last time I brought this up I was informed that linking is not ready for caching native code; and I still don’t fully get the world age system).
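To make the hashing scheme above concrete, here is a minimal language-agnostic sketch (in Python, with hypothetical names; a real Julia cache would hash inferred IR, collapse dependency cycles into single entities first, and deal with world ages, none of which is modeled here). Each entity’s cache key combines its own code hash with the keys of its dependencies, so a change anywhere upstream invalidates every downstream key:

```python
import hashlib

def cache_key(entity, deps, code, memo=None):
    """Content-addressed key for a compiled entity.

    entity: entity name
    deps:   dict mapping entity -> list of entities it depends on
    code:   dict mapping entity -> its (inferred) code as a string
    """
    if memo is None:
        memo = {}
    if entity in memo:
        return memo[entity]
    h = hashlib.sha256(code[entity].encode())
    # Incorporate dependency keys in a deterministic order, so the
    # key changes whenever anything upstream changes.
    for dep in sorted(deps.get(entity, [])):
        h.update(cache_key(dep, deps, code, memo).encode())
    memo[entity] = h.hexdigest()
    return memo[entity]

deps = {"A": ["B"], "B": []}
code = {"A": "call B()", "B": "return 1"}
k1 = cache_key("A", deps, code)

code["B"] = "return 2"           # editing a dependency...
k2 = cache_key("A", deps, code)  # ...changes A's key as well, so the
                                 # stale cached artifact is never hit
```

The point of the indirection is that a cache lookup is a single key comparison: no dependency traversal is needed at load time, because invalidation is baked into the key itself.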


#9

Thanks for this writeup. There is something I feel could belong to this list: more detailed documentation on what the compiler is expected to optimize, and better tools for inspecting cases it didn’t.

I often run into cases where I wonder whether a particular optimization or type inference didn’t happen because I didn’t write the code the right way, or because the compiler doesn’t handle that case optimally (yet). The best strategy I know at the moment is asking here and reading Base code for ideas, but the compiler is often a black box to me. @code_warntype is useful for finding out what happened, but it is sometimes harder to say why it happened and what I can do about it.


#10

I agree wholeheartedly, and I think the main blocker is the way multiple dispatch is resolved in some base module (usually Base) instead of being resolved by the caller.

The general outline, along with some overly emotional correspondence (mainly on my side), can be found in this thread:

I also wrote a small POC using a macro that generates a generated function … the added indirection gave me enough flexibility to prove that this is possible.
If there is interest, I will “patch” the POC to work on Julia 1.0.


#11

I can’t agree more. But another obstacle for this goal is that the Julia runtime takes over all signal handling (though I guess this is not what the compiler does, so this is a bit off topic). It is especially problematic for the SIGINT handler, since you can’t respond to Ctrl-C in your program as soon as you initialize the Julia runtime. It would be nice to have an API to initialize the Julia runtime without taking over all signal handling, or at least not SIGINT (something like CPython’s Py_InitializeEx(0)).
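As an illustration of the pattern being asked for, here is a small Python sketch of a host application taking SIGINT back after an embedded runtime has (hypothetically) replaced the handler during initialization; the embedded-runtime step is only a comment, and the handler name is made up. (`Py_InitializeEx(0)`, mentioned above, is the CPython API that avoids the problem by skipping handler registration entirely.)

```python
import signal

interrupted = False

def on_sigint(signum, frame):
    # Request a graceful shutdown ourselves instead of letting the
    # embedded runtime swallow Ctrl-C.
    global interrupted
    interrupted = True

# ... an embedded runtime would be initialized here, possibly
# installing its own SIGINT handler as part of startup ...

# Re-install the host's handler afterwards; signal.signal returns
# whatever handler was in place before.
previous = signal.signal(signal.SIGINT, on_sigint)
```

This only works when the host controls the initialization order; an init API that never touches the handlers in the first place, as requested above, is the cleaner fix.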


#12

2 posts were split to a new topic: Making it easier to contribute to Julia


#13

I hear what you’re saying about multithreading, but threads have a bit of a bad reputation as tools for parallel computation (which I say as someone who has used threads quite a bit, and successfully, even if I say so myself).

What I wonder about is the support of the other implementation mechanisms for parallel computing (remote channels and remote references). How solid is the code at the moment, and are any changes/improvements planned?

I suppose any work in this area would perhaps not fall under the “compiler” heading, so maybe this is not the place to ask?


#14

I’d also like to ask whether the PackageCompiler work will also focus on reducing the number of required dependencies: not linking libraries that are never called in the code, and full AOT compilation with the JIT runtime disabled. This would support the use case where a scientist develops a “business-logic” dynamic library that will be deployed as part of a big C# / C++ application, without 500 MB of dependencies (why ship FFTW when it is never used?) and without unused stdlibs.


#15

Updating Cxx.jl is very important; there are many repositories that rely on it.


#16

Just noticed that there seems to be a new commit from Keno himself. Very exciting!

It would be really nice if more people get involved so that we are not relying on one very busy guy for absolutely everything.


#17

Agreed. @zsz00 and others who mentioned Cxx, checking out the branch and doing obvious fixes for test failures might earn brownie points for showing you care enough to help out. While Cxx surely has a lot of difficult-to-update components, many packages require a certain amount of annoying but fairly routine work that’s pretty easy for most people to help out with. If someone can take that burden off Keno, it might clear more time for him to focus on the more difficult pieces.


#18

Some developments on Cxx.jl are being reported in the comments of https://github.com/Keno/Cxx.jl/issues/390


#19

+1 for this, but

:laughing: :laughing: :laughing:


#20

I’ll cross-post this here for broader attention:

@sdanisch has posted about how to compile Julia for wasm. If you’re looking to get involved with compiler work in a high-impact way that is fairly accessible, this is your chance! Also, think about how fun it will be when you get Julia running in a browser :grin: