Context
I'm working on an intrusive containers package for users (like me) who need allocation-free data structures. While putting together some very early experiments, I spotted some strange codegen behavior.
Julia Version
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Original Experiment
Consider a linked list node whose next and prev fields are each either a reference to another node or a terminal marker. Going with the canonical Union{T, Nothing}, assigning nothing to next generates unexpected code: a call to a non-inlined setproperty! (assigning a Node looks fine).
module Foo
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[97]:8 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rsi
subq $72, %rsp
movq %rdx, -40(%rbp)
movq (%rdx), %rsi
movq %rsi, -32(%rbp)
movq $341181352, -24(%rbp) # imm = 0x145603A8
movq $jl_system_image_data, -16(%rbp)
movabsq $"japi1_setproperty!_18314", %rax
leaq -32(%rbp), %rdx
movl $jl_system_image_data, %ecx
movl $3, %r8d
callq *%rax
; │ @ In[97]:9 within `setnext!'
movq %rsi, %rax
addq $72, %rsp
popq %rsi
popq %rbp
retq
nopl (%rax)
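Since the listing above calls out to setproperty!, one diagnostic that might help narrow this down is to bypass setproperty! and call setfield! directly. This is only a sketch: the module name FooDirect and the function setnext_direct! are hypothetical names introduced for illustration, and whether this changes the generated code on 1.3.1 has not been verified here.

```julia
module FooDirect

mutable struct Node
    next::Union{Node, Nothing}
    prev::Union{Node, Nothing}
    value::Int64
end

# Hypothetical variant of setnext!: calls setfield! directly instead of
# going through setproperty!, which is what `x.next = next` lowers to.
function setnext_direct!(x::Node, next::Union{Node, Nothing})
    setfield!(x, :next, next)
    x
end

end

using InteractiveUtils  # provides code_native outside the REPL
code_native(FooDirect.setnext_direct!, (FooDirect.Node, Nothing))
```

If the output of this variant is a single store, that would suggest the difference lies in how setproperty! is inlined rather than in the field store itself.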
Adjustment A
Removing the value field produces much tighter code: setproperty! is inlined and the assignment becomes a single store.
module Foo
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[98]:7 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rax
movq %rdx, -8(%rbp)
movq (%rdx), %rax
; │┌ @ Base.jl:21 within `setproperty!'
movq $jl_system_image_data, (%rax)
; │└
; │ @ In[98]:8 within `setnext!'
addq $8, %rsp
popq %rbp
retq
nopl (%rax)
Adjustment B
Replacing the type of the prev field, which the tested function never accesses, yields the same codegen improvement, even though this does not alter the size or layout of Node.
module Foo
    mutable struct Nil end
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nil}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[99]:9 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rax
movq %rdx, -8(%rbp)
movq (%rdx), %rax
; │┌ @ Base.jl:21 within `setproperty!'
movq $jl_system_image_data, (%rax)
; │└
; │ @ In[99]:10 within `setnext!'
addq $8, %rsp
popq %rbp
retq
nopl (%rax)
Question
What’s going on here? I feel like I might have hit some brittleness in the optimizer. I’ve confirmed that the strange, suboptimal-looking codegen also performs poorly in practice.
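For anyone who wants to reproduce the performance difference, a rough measurement along these lines should do. It is a minimal sketch using only Base macros: the module name Bench and the helper clear_many! are hypothetical, and the numbers will vary by machine and Julia version. Swapping in the Node definitions from Adjustment A or B gives the comparison.

```julia
module Bench

mutable struct Node
    next::Union{Node, Nothing}
    prev::Union{Node, Nothing}
    value::Int64
end

function setnext!(x::Node, next::Union{Node, Nothing})
    x.next = next
    x
end

# Hypothetical helper: calls setnext! n times with `nothing` so the loop
# exercises the (Node, Nothing) specialization.
function clear_many!(x::Node, n::Int)
    for _ in 1:n
        setnext!(x, nothing)
    end
    x
end

end

node = Bench.Node(nothing, nothing, 0)
Bench.clear_many!(node, 1)                             # warm up so compilation is not timed
bytes = @allocated Bench.clear_many!(node, 1_000_000)  # bytes allocated during the call
secs = @elapsed Bench.clear_many!(node, 1_000_000)     # wall-clock seconds for the call
println("allocated bytes: ", bytes, ", seconds: ", secs)
```

One caveat: the compiler may inline setnext! into the loop or simplify repeated stores, so the loop timing is only indicative and the code_native output remains the more direct evidence.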