So now you get the same kind of speedup that Go gets? Perfect!
All that was very interesting. Many thanks for the discussion.
Yes!
I have also learned a lot from you and @c42f. Thank you so much.
What I wonder now is why Ref{Int} is so slow here.
Found the answer in another post.
After specifying the Ref{Int} fields as Base.RefValue{Int}, I got almost the same performance gains (but still slower than mutable struct + Int).
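To make the difference concrete, here is a minimal sketch (the struct names are made up for illustration; only the sp field name comes from the project). Ref{Int} is an abstract type, so using it as a field type forces runtime dispatch on every access, while Base.RefValue{Int} is the concrete type that Ref(1) actually returns.

# Sketch with illustrative names: only the field *type* differs.
struct SlowCounter
    sp::Ref{Int}              # abstract field type => every access is dynamically dispatched
end

struct FastCounter
    sp::Base.RefValue{Int}    # concrete field type => accesses are fully inferred
end

# Both accept the same constructor argument, since Ref(0) isa Base.RefValue{Int}:
slow = SlowCounter(Ref(0))
fast = FastCounter(Ref(0))

slow.sp[] += 1    # works, but goes through runtime dispatch
fast.sp[] += 1    # compiles down to a plain load and store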
=== Using compiler ===
10.557 s (583631370 allocations: 19.09 GiB)
For the record, here is my quick-and-dirty bash one-liner (updated to put the filename and line number in front) to analyze the .mem files:
(cd mems; for F in *.mem; do awk '{printf "%-20s %4.4d %s\n",FILENAME,NR,$0}' $F;done) | sort -nr -k3 >r.anamem.txt
Here are the results (2nd column is the line number within the file, 3rd column is bytes allocated on that line):
vm.jl.408.mem 0091 11066672288 const_id = read_uint16(ins[ip+1:ip+2]) + 1
vm.jl.408.mem 0116 6688797472 pos = read_uint16(ins[ip+1:ip+2])
vm.jl.408.mem 0032 6415057712 vm.stack[vm.sp[]] = obj
vm.jl.408.mem 0103 6323812608 execute_binary_operation!(vm, op, left, right)
vm.jl.408.mem 0093 4742859504 push!(vm, vm.constants[const_id])
vm.jl.408.mem 0148 4026202560 push!(vm, vm.stack[frame.base_ptr+local_id])
vm.jl.408.mem 0209 1433313696 push!(vm, return_value)
vm.jl.408.mem 0324 477771248 frame = Frame(cl, vm.sp[] - arg_count)
lexer.jl.408.mem 0031 13440 ch, state = l.next
lexer.jl.408.mem 0135 5088 while isspace(read_char(l))
lexer.jl.408.mem 0103 3520 push!(chars, read_char(l))
vm.jl.408.mem 0030 3104 push!(vm.stack, obj)
...
Sorry, it's a Linux script; I am an old Linux hand. I should have written it in Julia … next time, surely.
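For anyone who prefers to stay in Julia, a rough equivalent of the shell pipeline could look like this (an untested sketch; the mems directory and r.anamem.txt output name just mirror the bash version above):

# Rough Julia equivalent of the bash one-liner: prefix every line of every
# .mem file with its file name and line number, then sort by the allocation
# column (the first numeric field of each .mem line), largest first.
rows = Tuple{String,Int,String}[]
for file in filter(endswith(".mem"), readdir("mems"))
    for (nr, line) in enumerate(eachline(joinpath("mems", file)))
        push!(rows, (file, nr, line))
    end
end

function allocbytes(line)
    fields = split(strip(line))
    isempty(fields) ? 0 : something(tryparse(Int, fields[1]), 0)
end

sort!(rows; by = r -> allocbytes(r[3]), rev = true)

open("r.anamem.txt", "w") do io
    for (file, nr, line) in rows
        println(io, rpad(file, 20), " ", lpad(nr, 4, '0'), " ", line)
    end
end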
I also just saw this (found on Reddit r/programming). It seems some of the optimisations are similar to the ones you made …
Thanks for this link! I also found something interesting in Episode IV: the author used Crystal macros to simplify some duplicated code.
I think this is definitely what I can try next.
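I haven't looked at the Crystal macros in detail, but the Julia counterpart is probably an @eval loop (or a macro) that stamps out the near-identical methods. A tiny sketch with made-up function names, not the actual VM code:

# Sketch: generate one handler per integer binary opcode instead of
# writing four nearly identical methods by hand.
for (fname, op) in ((:execute_add, +), (:execute_sub, -),
                    (:execute_mul, *), (:execute_div, div))
    @eval $fname(left::Int64, right::Int64) = $op(left, right)
end

execute_add(2, 3)   # 5
execute_div(7, 2)   # 3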
BTW, I have just tried compiling the project to a standalone executable using PackageCompiler.jl (you can try it yourself with make build).
The compilation takes a long time and the output is very large, but afterwards I can run monkey run hello_world.mo or monkey repl straight from the command line, which is really nice.
One thing I found is that the compiled executable runs noticeably slower than running the code directly with Julia (about 1.5x slower). Don't know why.
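For reference, the build step boils down to a PackageCompiler.create_app call roughly like this (a sketch only; the directory names are assumptions and not necessarily what make build actually does):

# Sketch of a standalone build with PackageCompiler (paths are assumptions).
using PackageCompiler

create_app(
    ".",        # the project directory; it needs a julia_main() entry point
    "build";    # output directory for the standalone bundle
    incremental = false,
)

As far as I understand, the large output is expected, since the bundle ships a full sysimage plus Julia's runtime libraries.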
Ah yes, excellent find! I should probably have noticed this.
This pattern in the standard library (and elsewhere) of using an abstract type like Ref{Int} as a constructor is nice for brevity in normal code (Ref(1) and Ref{Int}() both work). But it's rather too easy to make the mistake of putting the same type names into a struct!
Luckily JET (and also @code_warntype) will find this for you. For example:
julia> using JET
julia> struct A
x::Ref{Int}
end
julia> a = A(Ref(1))
A(Base.RefValue{Int64}(1))
julia> f(a) = a.x[]
f (generic function with 1 method)
julia> @report_opt f(a)
═════ 1 possible error found ═════
┌ @ REPL[11]:1 Base.getindex(%1)
│ runtime dispatch detected: Base.getindex(%1::Ref{Int64})
└───────────────