Context
I'm working on an intrusive containers package for users (like me) who need allocation-free data structures. While putting together the very first experiments, I spotted some strange codegen behavior.
Julia Version
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Original Experiment
Consider a linked-list node whose next and prev fields are each either a reference to another node or a terminal marker. Going with the canonical Union{T, Nothing}, assigning nothing to next generates unexpected code (assigning a Node looks fine).
module Foo
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[97]:8 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rsi
subq $72, %rsp
movq %rdx, -40(%rbp)
movq (%rdx), %rsi
movq %rsi, -32(%rbp)
movq $341181352, -24(%rbp) # imm = 0x145603A8
movq $jl_system_image_data, -16(%rbp)
movabsq $"japi1_setproperty!_18314", %rax
leaq -32(%rbp), %rdx
movl $jl_system_image_data, %ecx
movl $3, %r8d
callq *%rax
; │ @ In[97]:9 within `setnext!'
movq %rsi, %rax
addq $72, %rsp
popq %rsi
popq %rbp
retq
nopl (%rax)
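To see the same thing one level above the assembly, the optimized typed IR can be inspected with code_typed. The following is only a minimal sketch: the module name FooTyped is made up for illustration so it does not clash with Foo, and I am assuming (not asserting) that the IR will show the setproperty! call left as a call rather than inlined, matching the native output above.

```julia
using InteractiveUtils  # provides code_typed when running outside the REPL

# Same definition as the original experiment, under a hypothetical module name.
module FooTyped
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end

# code_typed returns one `CodeInfo => return type` pair per matching method.
typed = code_typed(FooTyped.setnext!, (FooTyped.Node, Nothing))
ci, rettype = first(typed)

# Print the optimized IR; look for whether setproperty! appears as a call
# or has been replaced by a direct setfield!.
println(ci)
println("inferred return type: ", rettype)
```

Running the same call against the two adjusted definitions would show whether the difference already exists at the Julia IR level or only appears later in LLVM.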
Adjustment A
Removing the value field produces much tighter code: the setproperty! call is inlined down to a single store.
module Foo
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[98]:7 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rax
movq %rdx, -8(%rbp)
movq (%rdx), %rax
; │┌ @ Base.jl:21 within `setproperty!'
movq $jl_system_image_data, (%rax)
; │└
; │ @ In[98]:8 within `setnext!'
addq $8, %rsp
popq %rbp
retq
nopl (%rax)
Adjustment B
Changing the type of the prev field to Union{Node, Nil} nonetheless yields the same codegen improvement, even though the tested function never accesses prev and the change does not alter the size or layout of Node.
module Foo
    mutable struct Nil end
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nil}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
end
code_native(Foo.setnext!, (Foo.Node, Nothing))
Outputs:
; ┌ @ In[99]:9 within `setnext!'
pushq %rbp
movq %rsp, %rbp
pushq %rax
movq %rdx, -8(%rbp)
movq (%rdx), %rax
; │┌ @ Base.jl:21 within `setproperty!'
movq $jl_system_image_data, (%rax)
; │└
; │ @ In[99]:10 within `setnext!'
addq $8, %rsp
popq %rbp
retq
nopl (%rax)
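One further variation that might be worth comparing is to bypass setproperty! entirely and call setfield! directly. This is a sketch under assumptions, not a verified fix: the module name FooDirect is hypothetical, and I have not confirmed that it changes the generated code for the original three-field Node. Note that setfield! skips the convert step that setproperty! performs, which is fine here only because the method signature already restricts next to Union{Node, Nothing}.

```julia
# Original three-field layout, under a hypothetical module name.
module FooDirect
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        # Direct field store; no convert, no setproperty! dispatch.
        setfield!(x, :next, next)
        x
    end
end

a = FooDirect.Node(nothing, nothing, 1)
b = FooDirect.Node(nothing, nothing, 2)

FooDirect.setnext!(a, b)        # link a -> b
FooDirect.setnext!(b, nothing)  # b is terminal
```

Comparing code_native for this variant against the original would help narrow down whether the extra call comes from setproperty! itself or from how the field type is handled.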
Question
What’s going on here? I feel like I might have struck upon some brittleness in the optimizer. I’ve also confirmed that the strange-looking, suboptimal codegen does perform worse in practice.
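For anyone who wants to reproduce the performance check, a rough timing harness along these lines should do. It is a sketch only: the module name FooBench and the helper clear_many! are made up for illustration, it uses Base's @elapsed and @allocated rather than a proper benchmarking package, and the numbers will vary by machine and Julia version, so I am not quoting any here.

```julia
# Original three-field layout, under a hypothetical module name.
module FooBench
    mutable struct Node
        next::Union{Node, Nothing}
        prev::Union{Node, Nothing}
        value::Int64
    end
    function setnext!(x::Node, next::Union{Node, Nothing})
        x.next = next
        x
    end
    # Hypothetical helper: assign `nothing` to every node's next field.
    function clear_many!(nodes::Vector{Node})
        for n in nodes
            setnext!(n, nothing)
        end
        nodes
    end
end

nodes = [FooBench.Node(nothing, nothing, i) for i in 1:100_000]

FooBench.clear_many!(nodes)  # warm-up call so compilation is not timed

elapsed = @elapsed FooBench.clear_many!(nodes)
allocated = @allocated FooBench.clear_many!(nodes)
println("elapsed: ", elapsed, " s, allocated: ", allocated, " bytes")
```

Swapping in the Adjustment A or Adjustment B struct definition and rerunning gives a like-for-like comparison of the three layouts.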