MS or Ph.D. level problems for Julia compilation?


#1

I’m giving a talk on Julia compilation for a CS seminar course on compilers at UIUC. I have only 50 minutes, so I can’t go very deep, so I’ll mainly cover the compilation chain and type inference (and the type stability issue). Are there any MS or PhD worthy compilation problems I should mention?

In updating my examples from Julia 0.5 to 0.6 for code_warntype and code_llvm, I see the compiler is getting stronger. Nice work!


#2

I’m not a programming-languages person, but over the past few years various people including me raised a few issues in this discussion group and its predecessor and have been told that these are hard problems.

  1. “Arrow” types, or, in other words, making methods first-class objects that are typed according to their input arguments and return arguments. This comes up often in this group, but it is apparently difficult to incorporate into the Julia type system for reasons that aren’t clear to me.

  2. Aliasing. It is easy to write buggy code, e.g for matrix-matrix multiplication, if the destination matrix overlaps in memory with one of the operands. Julia doesn’t provide any automation to detect or prevent aliasing.

  3. Pool allocation. Many scientific codes have a “main loop” in which vectors and matrices are allocated and deallocated according to a fixed pattern (although the actual numerical entries change). It would boost performance if the run-time system, possibly with help from the programmer or compiler, could detect that this is occurring and reserve in advance a big block of memory and place all the vectors and matrices in the loop at fixed offsets.

  4. Sparse matrix planning. This is similar the previous item: many sparse matrix codes have a main loop in which the nonzero structure of sparse matrices and vectors is fixed but the numerical entries change from one iteration to the next. In this case, it would help performance if all the correct destination addresses for the various entries of the sparse matrices could be computed once and for all at the start of the loop.