No constant expression elimination for e.g. '2^24-1'

I found differences in code generation for simple constant expressions: 1<<24-1 generates a constant, 2^24-1 generates a quite expensive C function call. Code example below.

I did expect Julia to replace both expressions by the constant 0xFFFFFF during optimization/code generation. Do we have an old issue wrongly marked as closed?

The same question was put 2015:


Its answer links to https://github.com/JuliaLang/julia/issues/14324 which was closed in 2017.

My example (Julia 1.0.5, Windows 10, 64bit):



julia> function uintmulshift()
           v :: UInt64 = (1<<24)-1
           end
uintmulshift (generic function with 1 method)
julia> @code_native uintmulshift()
        .text
; Function uintmulshift {
; Location: none:3
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $16777215, %eax         # imm = 0xFFFFFF
        popq    %rbp
        retq
        nopl    (%rax,%rax)
;}
julia> function uintmulexp()
           v :: UInt64 = 2^24-1
           end
uintmulexp (generic function with 1 method)
julia> @code_native uintmulexp()
        .text
; Function uintmulexp {
; Location: none:3
        pushq   %rbp
        movq    %rsp, %rbp
; Function literal_pow; {
; Location: none
; Function macro expansion; {
; Location: none
; Function ^; {
; Location: intfuncs.jl:220
        subq    $32, %rsp
        movabsq $power_by_squaring, %rax
        movl    $2, %ecx
        movl    $24, %edx
        callq   *%rax
;}}}
; Function convert; {
; Location: number.jl:7
; Function Type; {
; Location: boot.jl:722
; Function toUInt64; {
; Location: boot.jl:692
; Function check_top_bit; {
; Location: boot.jl:581
; Function is_top_bit_set; {
; Location: boot.jl:571
        addq    $-1, %rax
;}
        js      L42
;}}}}
        addq    $32, %rsp
        popq    %rbp
        retq
; Function convert; {
; Location: number.jl:7
; Function Type; {
; Location: boot.jl:722
; Function toUInt64; {
; Location: boot.jl:692
; Function check_top_bit; {
; Location: boot.jl:581
L42:
        movabsq $throw_inexacterror, %r9
        movl    $227233368, %ecx        # imm = 0xD8B4E58
        movl    $234489840, %edx        # imm = 0xDFA07F0
        movq    %rax, %r8
        callq   *%r9
        ud2
        ud2
        nopl    (%rax,%rax)
;}}}}}
julia>

Interesting. I get same under 1.3.0-rc4.1 under Windows 10:

julia> @code_native (() -> 2^24-1)()
        .text
; β”Œ @ none:1 within `#19'
        pushq   %rbp
        movq    %rsp, %rbp
; β”‚β”Œ @ none within `literal_pow'
; β”‚β”‚β”Œ @ none within `macro expansion'
; β”‚β”‚β”‚β”Œ @ intfuncs.jl:221 within `^'
        subq    $32, %rsp
        movabsq $power_by_squaring, %rax
        movl    $2, %ecx
        movl    $24, %edx
        callq   *%rax
; β”‚β””β””β””
; β”‚β”Œ @ int.jl:52 within `-'
        addq    $-1, %rax
; β”‚β””
        addq    $32, %rsp
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; β””

julia> @code_native (() -> 1<<24-1)()
        .text
; β”Œ @ none:1 within `#21'
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $16777215, %eax         # imm = 0xFFFFFF
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; β””

julia> 

Simpler example. 2^4 is okay. 2^5 is right out.:

julia> @code_native (() -> 2^4)()
        .text
; β”Œ @ none:1 within `#37'
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $16, %eax
        popq    %rbp
        retq
        nopl    (%rax,%rax)
; β””

julia> @code_native (() -> 2^5)()
        .text
; β”Œ @ none:1 within `#39'
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $2, %ecx
        movl    $2, %eax
        nop
; β”‚β”Œ @ none within `literal_pow'
; β”‚β”‚β”Œ @ none within `macro expansion'
; β”‚β”‚β”‚β”Œ @ intfuncs.jl:221 within `^'
; β”‚β”‚β”‚β”‚β”Œ @ intfuncs.jl:209 within `power_by_squaring'
; β”‚β”‚β”‚β”‚β”‚β”Œ @ int.jl:54 within `*'
L16:
        imulq   %rax, %rax
; β”‚β”‚β”‚β”‚β””β””
; β”‚β”‚β”‚β”‚β”Œ @ int.jl:424 within `power_by_squaring'
        addq    $-1, %rcx
; β””β””β””β””β””
; β”Œ @ intfuncs.jl:208 within `#39'
        jg      L16
        addq    %rax, %rax
; β””
; β”Œ @ none:1 within `#39'
        popq    %rbp
        retq
        nop
; β””

julia>

Maybe the complexity of power_by_squaring somehow makes it hard for LLVM to do the elimination?

Although it manages to for 2 raised to the powers 4, 7,8, 15,16, 31,32, 63,64.

Maybe add this to the Performance tips doc?

It would probably make sense for ^ to detect common bases (like 2, 2.0, β„―, and 10.0) and jump out to specialized routines in those cases.

In general, integer exponentiation is just doing a β€œsmart” number of squarings, so that’s why small/easy exponents are more likely to constant fold than bigger ones. As far as I’m aware there are no guarantees on how constant propagation works β€” it’s β€œjust” an optimization.

2 Likes

A simple-minded approach to compute 2^24-1 at compile time is to use Base.literal_pow:

julia> @inline Base.literal_pow(::typeof(^), x::Base.HWNumber, ::Val{p}) where p =
           if p < 0
               Base.literal_pow(^, inv(x), Val(p))
           else
               Base.literal_pow(^, x, Val(p-1)) * x
           end;

julia> f() = 2^24 - 1;

julia> @code_llvm f()

;  @ REPL[2]:1 within `f'
define i64 @julia_f_17031() {
top:
  ret i64 16777215
}

ATM, literal_pow is only defined up to 3:

I wonder if it makes sense to increase the limit.

The compiler should also definitely be able to constant fold 2^24 no matter how it is defined. The fact that it doesn’t is a bit of a compiler bug, though of course it may not be entirely trivial to fix.

7 Likes

Sorry, I did the bad thing and knew about a bug,
but never openned an issue for it

2 Likes