One good measure of performance is the length of the generated machine code. As a package writer, I would like to monitor how concise the assembly generated for my functions is. This would improve the performance and load times of my packages in the long term.
For a practical example, consider two ways of computing the maximum of two numbers: `maximum((1,2))`, which takes a tuple, and `maximum([1,2])`, which takes an array.
At first, I guessed that the former option would be more efficient because tuples are immutable and the compiler could apply all sorts of optimizations to them. However, when I inspect the generated native code, this is what I get:
```
julia> @code_native maximum((1,2))
.text
Filename: reduce.jl
pushq %rbp
movq %rsp, %rbp
pushq %r15
pushq %r14
pushq %r12
pushq %rbx
subq $64, %rsp
movq %rdi, %r15
movq %fs:0, %rbx
addq $-10888, %rbx # imm = 0xD578
leaq -64(%rbp), %r14
vxorps %ymm0, %ymm0, %ymm0
vmovups %ymm0, -64(%rbp)
movq $10, -88(%rbp)
movq (%rbx), %rax
movq %rax, -80(%rbp)
leaq -88(%rbp), %rax
movq %rax, (%rbx)
movq $0, -72(%rbp)
Source line: 454
movabsq $140402333763728, %r12 # imm = 0x7FB1F73AC890
leaq 398277240(%r12), %rax
movq %rax, -64(%rbp)
leaq 398277144(%r12), %rax
movq %rax, -56(%rbp)
leaq 398277080(%r12), %rax
movq %rax, -48(%rbp)
movabsq $jl_gc_pool_alloc, %rax
movl $1456, %esi # imm = 0x5B0
movl $32, %edx
movq %rbx, %rdi
vzeroupper
callq *%rax
leaq 397451744(%r12), %rcx
movq %rcx, -8(%rax)
vmovups (%r15), %xmm0
vmovups %xmm0, (%rax)
movq %rax, -40(%rbp)
movabsq $jl_invoke, %rax
movl $4, %edx
movq %r12, %rdi
movq %r14, %rsi
callq *%rax
movq %rax, -72(%rbp)
movq (%rax), %rax
movq -80(%rbp), %rcx
movq %rcx, (%rbx)
addq $64, %rsp
popq %rbx
popq %r12
popq %r14
popq %r15
popq %rbp
retq
nopw %cs:(%rax,%rax)
```
```
julia> @code_native maximum([1,2])
.text
Filename: reduce.jl
pushq %rbp
movq %rsp, %rbp
Source line: 454
callq _mapreduce
popq %rbp
retq
nopl (%rax,%rax)
```
So clearly I cannot trust my intuition here, nor in many other cases. What workflow do you suggest for tracking these kinds of changes? Is there a package that facilitates such diagnostics? I wonder if something like `__precompile__()` could be added to warn package writers whenever a function is re-implemented and its machine code grows substantially.
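To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It captures the output of `code_native` as a string and counts the lines that look like instructions. The helper name `native_line_count` is hypothetical, and the filtering of comments, directives, and labels is a heuristic that I assume is good enough for a coarse comparison; the exact listing format differs between Julia versions and platforms.

```julia
using InteractiveUtils  # provides code_native outside the REPL

# Hypothetical helper: rough count of instruction lines in the native code
# generated for `f` called with argument types `types`.
function native_line_count(f, types)
    io = IOBuffer()
    code_native(io, f, types)          # write the assembly listing into the buffer
    asm = String(take!(io))
    count(eachline(IOBuffer(asm))) do line
        s = strip(line)
        # Heuristic: skip blanks, comments (";" or "#"), directives and
        # local labels (leading "."), and other labels (trailing ":").
        !isempty(s) &&
            !startswith(s, ";") &&
            !startswith(s, "#") &&
            !startswith(s, ".") &&
            !endswith(s, ":")
    end
end

n_tuple = native_line_count(maximum, (Tuple{Int,Int},))
n_array = native_line_count(maximum, (Vector{Int},))
println("tuple: ", n_tuple, " lines; array: ", n_array, " lines")
```

Such counts could be recorded per release and compared over time. They are only a rough signal, though: as the array listing above shows, a short listing may simply be a call into another function (`_mapreduce` there) whose code is not included in the count.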
Related to this issue, it would be nice if I could start Julia in a “warn_type” mode, in which every command I type in the REPL gives me a warning if there is a type instability. Adding `@code_warntype` everywhere by hand is not very efficient for someone who is only interested in implementing a cool new feature in the package. I’d rather have the warning from the start than have to go back in a second pass to optimize the code.
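As a partial stand-in for such a mode, a check along the following lines could live in a package's test suite. This is only a sketch: `is_inferred`, `stable`, and `unstable` are hypothetical names for illustration, and the check looks only at the inferred return type, so it would miss instabilities inside a function body that `@code_warntype` would show.

```julia
using Test  # provides @inferred

# Two toy functions: `unstable` returns an Int or a Float64 for Int input,
# while `stable` always returns a value of the input's type.
unstable(x) = x > 0 ? x : 0.0
stable(x) = x > 0 ? x : zero(x)

# Hypothetical helper: true if every matching method of `f` has a concrete
# inferred return type for the given argument types.
function is_inferred(f, types)
    rts = Base.return_types(f, types)
    !isempty(rts) && all(isconcretetype, rts)
end

println("stable:   ", is_inferred(stable, (Int,)))
println("unstable: ", is_inferred(unstable, (Int,)))

# @inferred throws if the actual return type differs from the inferred one;
# it passes for `stable` and would throw for `unstable(1)`.
@inferred stable(1)
```

Running such checks automatically on every test run gives an early warning without having to sprinkle `@code_warntype` through an interactive session.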