So now you get the same kind of speedup that Go gets? Perfect!
All that was very interesting. Many thanks for the discussion.
Yes!
I have also learned a lot from you and @c42f. Thank you so much.
What I wonder now is why `Ref{Int}` is so slow here.
Found the answer in another post.
After specifying the `Ref{Int}` fields as `Base.RefValue{Int}`, I got almost the same performance gains (but still slower than `mutable struct` + `Int`).
=== Using compiler ===
10.557 s (583631370 allocations: 19.09 GiB)
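For anyone comparing the two layouts being discussed, here is a minimal sketch of the difference (type and field names are mine, not from the project):

```julia
# Concretely-typed Ref field: type-stable, but the Int still lives
# in a separately heap-allocated RefValue box.
struct CounterRef
    sp::Base.RefValue{Int}
end

# Mutable struct with a plain Int field: the Int is stored inline,
# so reads and writes need no extra pointer dereference.
mutable struct CounterMut
    sp::Int
end

bump!(c::CounterRef) = (c.sp[] += 1)
bump!(c::CounterMut) = (c.sp += 1)

a = CounterRef(Ref(0)); bump!(a)   # a.sp[] == 1
b = CounterMut(0);      bump!(b)   # b.sp == 1
```

Both are type-stable, but the `mutable struct` version avoids the extra box, which matches the small remaining gap reported above.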
For the record, here is my quick-and-dirty bash command (updated to get filename and line number in front) to analyze .mem files
(cd mems; for F in *.mem; do awk '{printf "%-20s %4.4d %s\n",FILENAME,NR,$0}' $F;done) | sort -nr -k3 >r.anamem.txt
Here are the results (2nd col. is the line number in the file, 3rd col. is the number of bytes allocated):
vm.jl.408.mem 0091 11066672288 const_id = read_uint16(ins[ip+1:ip+2]) + 1
vm.jl.408.mem 0116 6688797472 pos = read_uint16(ins[ip+1:ip+2])
vm.jl.408.mem 0032 6415057712 vm.stack[vm.sp[]] = obj
vm.jl.408.mem 0103 6323812608 execute_binary_operation!(vm, op, left, right)
vm.jl.408.mem 0093 4742859504 push!(vm, vm.constants[const_id])
vm.jl.408.mem 0148 4026202560 push!(vm, vm.stack[frame.base_ptr+local_id])
vm.jl.408.mem 0209 1433313696 push!(vm, return_value)
vm.jl.408.mem 0324 477771248 frame = Frame(cl, vm.sp[] - arg_count)
lexer.jl.408.mem 0031 13440 ch, state = l.next
lexer.jl.408.mem 0135 5088 while isspace(read_char(l))
lexer.jl.408.mem 0103 3520 push!(chars, read_char(l))
vm.jl.408.mem 0030 3104 push!(vm.stack, obj)
...
Sorry, it's a Linux script; I am an old Linux user. I should have written it in Julia … next time, surely.
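Since the thread is about Julia anyway, here is a rough Julia equivalent of that one-liner (function and directory names are mine; the logic mirrors the awk + sort pipeline above):

```julia
# Prefix every line of each .mem file with "filename line-number",
# then sort by the allocation count (3rd column), largest first.
function analyze_mem(dir="mems")
    rows = String[]
    for f in filter(endswith(".mem"), readdir(dir))
        for (nr, line) in enumerate(eachline(joinpath(dir, f)))
            push!(rows, string(rpad(f, 20), lpad(nr, 4, '0'), " ", line))
        end
    end
    # The 3rd whitespace-separated field is the allocation number;
    # lines without a numeric field (e.g. blanks) sort last.
    key(r) = something(tryparse(Int, get(split(r), 3, "")), -1)
    sort!(rows; by=key, rev=true)
end
```

Writing the result out with `write("r.anamem.txt", join(analyze_mem(), '\n'))` would reproduce the report file from the shell version.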
I also just saw this (found on reddit r/programming). It seems some optimisations are similar to the ones you got …
Thanks for this link! I also found something interesting in Episode IV. The author used Crystal macros to simplify some duplicated code, like below.
I think this is definitely what I can try next.
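Julia macros could play a similar role to the Crystal ones. A hypothetical sketch (macro and function names are mine, not from the project) that stamps out near-identical binary-operation handlers instead of copy-pasting each definition:

```julia
# Hypothetical sketch: generate one method per operator from a
# single template, instead of writing each handler by hand.
macro def_binary_op(name, op)
    quote
        $(esc(name))(left::Int, right::Int) = $(esc(op))(left, right)
    end
end

@def_binary_op(exec_add, +)
@def_binary_op(exec_sub, -)
@def_binary_op(exec_mul, *)

exec_add(2, 3)   # 5
```

In the real VM the generated methods would push results onto the stack rather than return them, but the deduplication idea is the same.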
BTW, I have just tried compiling the project to a standalone executable (you can have a try with `make build`), using PackageCompiler.jl. The compilation takes a long time, and the output size is very large. But I can run `monkey run hello_world.mo` or `monkey repl` in my command line, which is really nice.
One thing I found is that the compiled executable ran noticeably slower than running directly under Julia (about 1.5x slower). Don't know why.
Ah yes, excellent find. I should probably have noticed this.
This pattern in the standard library (and elsewhere) of using an abstract type like `Ref{Int}` as a constructor is nice for brevity in normal code (`Ref(1)` and `Ref{Int}()` both work). But it's rather too easy to make the mistake of putting the same type names into a `struct`!
Luckily JET (and also `@code_warntype`) will find this for you. For example:
julia> using JET
julia> struct A
x::Ref{Int}
end
julia> a = A(Ref(1))
A(Base.RefValue{Int64}(1))
julia> f(a) = a.x[]
f (generic function with 1 method)
julia> @report_opt f(a)
═════ 1 possible error found ═════
┌ @ REPL[11]:1 Base.getindex(%1)
│ runtime dispatch detected: Base.getindex(%1::Ref{Int64})
└───────────────
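For comparison, here is a sketch of the concretely-typed version (struct and function names are mine), which matches the `Base.RefValue{Int}` fix mentioned earlier in the thread:

```julia
struct B
    x::Base.RefValue{Int}   # concrete field type: access is type-stable
end

b = B(Ref(1))
g(b) = b.x[]
g(b)   # 1
```

With the field concretely typed, `@report_opt g(b)` should find no runtime-dispatch errors, since the compiler can infer the result of `b.x[]`.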