Off-topic:
Could you please expand on how the compiler is terrible?
Just curious.
In short, it’s a reasonably good static compiler but a terrible JIT (or, basically, it’s not a JIT)…
By saying that it’s a static compiler, what I mean is that it does not use any runtime information to produce optimized code. (Here static information is basically anything in the type domain, whereas runtime information is anything in the value domain.) The advantage of this approach is that if the code is well written, the performance will be predictably good, and the compile time will also be very predictable, since each method is only compiled on its first call. Being a new language, the advantage we have here is that we can design the language in a way that makes writing good code easy, and provide tools to help with it.
OTOH, this also makes it a bad JIT (in the traditional sense) compared to essentially all the JITs you can find elsewhere. Basically we can only produce good code, possibly among the best of any code produced by a JIT thanks to LLVM’s optimization passes, but we can only do so for good Julia code (i.e. if you follow the performance tips). If you fail to do that, or write in a pattern that’s frequently seen in R/Python/JS, the performance will be much slower compared to other JITs out there, since the JITs for those languages have to deal with such code, so they implement a lot of speculative or profiling-based optimizations to get good performance. Jeff’s talk provided a really nice analogy. What we have is basically an asm.js compiler (including syntax to make writing it easier) and what other JITs have is, well, a JIT. If you feed asm.js to both, we’ll beat them without a doubt, but if you feed normal JS to both, we’ll have no way to perform comparably to the real JITs out there. Hope it’s not too hard to see that having an asm.js compiler is very far from having a proper JS JIT.
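To make the distinction concrete, here is a small, hypothetical illustration (my addition, not from the post above): the first function is the kind of R/Python-style pattern the compiler cannot make fast, because the type of the accumulator changes mid-loop; the second follows the performance tips and stays type stable, so the static compiler can emit tight machine code for it.

# Hypothetical "bad" (type-unstable) vs. "good" (type-stable) Julia code.

# Type-unstable: `total` starts as an Int but becomes a Float64 on the first
# iteration (since `/` returns Float64), so the compiler has to box values and
# dispatch dynamically inside the loop.
function unstable_mean(xs)
    total = 0
    for x in xs
        total += x / length(xs)
    end
    return total
end

# Type-stable: `total` is a Float64 from the start, the inferred types never
# change, and the loop compiles to straightforward machine code.
function stable_mean(xs)
    total = 0.0
    for x in xs
        total += x / length(xs)
    end
    return total
end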
This was excellent, thank you very much for that.
(this was interesting and unrelated to the license speculation in the original thread, so I detached it)
(Thanks a lot! That seems reasonable. I will consider doing so next time; lots of off-the-cuff questions get thrown around that I do not want to interrupt the flow of a conversation with, but would still like to know the answer to.)
Of course, none of this means that we won’t eventually implement all the JIT tricks in the book in order to speed up “bad” code (in the sense of code not following the performance tips/best practices). It’s just not a priority, since you will never be able to get absolute peak performance that way anyway, and it’s not too hard to write code that the compiler is good at making fast (plus we have tools to help you, and our users do tend to care a lot about peak performance).
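One of the tools alluded to here, for example, is @code_warntype (available at the REPL), which prints the inferred type of every local variable. A minimal usage sketch against the hypothetical unstable_mean/stable_mean from the earlier illustration:

xs = rand(1000)

# In unstable_mean the accumulator is reported as a Union type (highlighted in
# the REPL), the telltale sign of a type instability the compiler cannot
# optimize away.
@code_warntype unstable_mean(xs)

# The type-stable version shows a concrete type for every variable.
@code_warntype stable_mean(xs)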
When I worked on MacRuby, we came to realize that LLVM is both a blessing and a curse for a JIT-compiled language. A blessing for all the obvious reasons; a curse because there are only so many levers you can pull once you’re at the level of LLVM bitcode. LLVM is the perfect JIT-compiler starter kit, but it will never get you to the highest levels of performance.
At the extreme other end of the spectrum you have the gold standard of JIT compilers: LuaJIT. However, if you look at what Mike Pall has done with LuaJIT (custom bytecode, tagged NaNs, etc.), there are a lot of sacrifices he has made in order to get the performance he has (portability and support for new features probably being the most prominent).
The middle ground that everyone seems to be headed toward is layering of a custom “intermediate language” between the code being compiled and the LLVM intrinsics. Swift and JavaScriptCore have both gone this route recently. Personally, given Julia’s unique take on macros, types, and generated functions, I think a Julia Intermediate Language would be very interesting. (Or perhaps a “Julia Intermediate Lowering Language”? A JILL?)
I would not fully agree with this. LLVM is certainly getting us to the highest level of performance for the “good code” (i.e. type-stable etc.) that we actually care about. For the code that a lot of other JITs need to deal with, that’s certainly not the case, and that’s why we are indeed a bad JIT.
We do: it’s the typed AST. We have a growing number of optimizations on it in type inference.
Formalizing it more, stabilizing it, and calling it JILL would be a nice step though. Maybe a good post-1.0 project.
It’s worth noting that Swift generates and types its AST and then generates SIL for further optimization (see here). It’s also worth noting that SIL is in SSA form. The idea is that many of the same SSA-enabled optimizations that LLVM does can also be applied to SIL, but SIL retains more of the high-level Swift semantics, which those optimizations can take advantage of.
For example:
julia> function foo()
           a = 10
           a += 15
           b = a
           b += 25
           b
       end
foo (generic function with 1 method)

julia> @code_typed foo()
CodeInfo(:(begin
        a = 10 # line 3:
        a = (Base.add_int)(a, 15)::Int64 # line 4:
        b = a # line 5:
        b = (Base.add_int)(b, 25)::Int64 # line 6:
        return b
    end))=>Int64

julia> @code_llvm foo()
define i64 @julia_foo_60827() #0 !dbg !5 {
top:
  ret i64 50
}
julia> mutable struct Bar
           val::Int
       end

julia> function bar()
           a = Bar(10)
           a.val += 15
           b = a.val
           b += 25
           b
       end
bar (generic function with 1 method)

julia> @code_typed bar()
CodeInfo(:(begin
        a = $(Expr(:new, :(Main.Bar), 10)) # line 3:
        SSAValue(0) = (Base.add_int)((Core.getfield)(a, :val)::Int64, 15)::Int64
        (Core.setfield!)(a, :val, SSAValue(0))::Int64 # line 4:
        b = (Core.getfield)(a, :val)::Int64 # line 5:
        b = (Base.add_int)(b, 25)::Int64 # line 6:
        return b
    end))=>Int64

julia> @code_llvm bar()
define i64 @julia_bar_60849() #0 !dbg !5 {
top:
  %0 = call i8**** @jl_get_ptls_states() #2
  %1 = bitcast i8**** %0 to i8*
  %2 = call i8** @jl_gc_pool_alloc(i8* %1, i32 1384, i32 16)
  %3 = getelementptr i8*, i8** %2, i64 -1
  %4 = bitcast i8** %3 to i8***
  store i8** inttoptr (i64 4622770160 to i8**), i8*** %4, align 8
  %5 = bitcast i8** %2 to i64*
  store i64 25, i64* %5, align 16
  ret i64 50
}
LLVM does a really good job of inlining and optimizing setfield!, but since setfield! is just another function call from LLVM’s perspective (and because it can’t know that memory accessed by jl_f_setfield isn’t accessed by another thread, etc.), it’s incapable of fully optimizing the latter case. However, if we had an SSA-form IR, we could have done the correct constant folding and escape analysis before lowering to LLVM, so that both functions would result in the same machine code (as they should).
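A rough sketch of that point (my addition, not part of the original example): if the field mutation is taken out of the picture, say with a hypothetical immutable counterpart of Bar, there is no setfield! call whose effects LLVM must assume are observable elsewhere, and the whole function folds down to the same constant as foo():

# Hypothetical immutable counterpart of Bar from the example above.
struct ImmutableBar
    val::Int
end

function baz()
    a = ImmutableBar(10 + 15)   # the field value is known up front; no setfield!
    b = a.val
    b += 25
    b
end

# Inspecting this with @code_llvm baz() should show the body reduced to a
# single `ret i64 50`, with no jl_gc_pool_alloc call, since nothing can mutate
# or escape the immutable struct.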
Anyway, this has strayed pretty far off topic at this point… The example I gave is admittedly a bit contrived, but I wanted to point out that a formal IR could still be advantageous, even with optimizations on the typed AST and LLVM.
We are actually capable of doing the setfield! optimization (there’s a PR for it), and it’s not related to SSA form. The AST will certainly be changed a lot (even before 1.0) so that doing these optimizations will be easier (a very linear IR).
I don’t think we have to introduce another layer though (or in some sense we already do, since we run optimization passes after inferring all the types; they just have the same representation).
And even more OT, I believe @Keno has plans to do that optimization in LLVM too. It’s what LLVM can already do for malloc/free.
I’ve been thinking: do we need an interpreter for Julia? LLVM is a huge dependency, and static executables still need it (and the GC). And eval is a problem if you were to drop LLVM.
We precompile most modules, so could we let an interpreter take care of the rest (e.g. eval)? Or maybe compiling code to Lua (or Python) is an option?