Is there a way to get `mul_hi`/`umulh`?

Lilith · March 25, 2024, 10:28pm

Similar (in fact identical) to the earlier topic "`mul_hi` in Julia?": is there a way to access the high bits of the product of two bit integers?

I want a function `mul_hi` such that `widen(mul_hi(x, y)) << 8sizeof(T) + x*y == widemul(x, y)` for `x::T`, `y::T`, and I want it to use the native `umulh` instruction where available.
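To make the requested property concrete, here is a minimal sketch that checks it exhaustively for `UInt8`. The names `mul_hi_ref` and `check_identity_uint8` are hypothetical, and `mul_hi_ref` is just a reference definition built on `widemul`, not an existing Base function. The check is restricted to unsigned types, where the low half `x*y` zero-extends when it is added to the widened high half.

```julia
# Reference definition (hypothetical name): take the upper half of the widened product.
mul_hi_ref(x::T, y::T) where {T<:Base.BitUnsigned} =
    (widemul(x, y) >> (8 * sizeof(T))) % T

# Exhaustively check the identity from the question for every pair of UInt8 values.
function check_identity_uint8()
    for a in 0:255, b in 0:255
        x, y = UInt8(a), UInt8(b)
        # high half shifted back into place, plus the wrapping low half
        lhs = (widen(mul_hi_ref(x, y)) << (8 * sizeof(UInt8))) + x * y
        lhs == widemul(x, y) || return false
    end
    return true
end

check_identity_uint8()
```

For example, `0xff * 0xff` is `0xfe01` as a `UInt16`, so the high byte is `0xfe` and the wrapping low byte is `0x01`.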

Oscar_Smith · March 25, 2024, 10:34pm

I don’t believe so. It probably would be smart to add. Right now, we mostly rely on LLVM figuring it out from `widemul` where necessary.

mkitti · March 26, 2024, 1:18pm

You might be able to invoke an LLVM intrinsic via `llvmcall`. Do you know of a language with `mul_hi`? We can reverse engineer it from there.

Lilith · March 26, 2024, 1:32pm

That works for 64-bit, but not 128-bit.

julia> function mul_hi(x::T, y::T) where T <: Base.BitInteger
           xy = widemul(x, y)
           (xy >> 8sizeof(T)) % T
       end
mul_hi (generic function with 2 methods)

julia> @b mul_hi($(rand(UInt64)), $(rand(UInt64)))
1.982 ns

julia> @b mul_hi($(rand(UInt128)), $(rand(UInt128)))
193.885 ns (12 allocs: 224 bytes)

julia> @code_native mul_hi(rand(UInt64), rand(UInt64))
        .text
        .file   "mul_hi"
        .globl  julia_mul_hi_33966              // -- Begin function julia_mul_hi_33966
        .p2align        2
        .type   julia_mul_hi_33966,@function
julia_mul_hi_33966:                     // @julia_mul_hi_33966
; Function Signature: mul_hi(UInt64, UInt64)
; ┌ @ REPL[264]:1 within `mul_hi`
// %bb.0:                               // %top
; │ @ REPL[264] within `mul_hi`
        //DEBUG_VALUE: mul_hi:x <- $x0
        //DEBUG_VALUE: mul_hi:x <- $x0
        //DEBUG_VALUE: mul_hi:y <- $x1
        //DEBUG_VALUE: mul_hi:y <- $x1
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
; │ @ REPL[264]:3 within `mul_hi`
; │┌ @ int.jl:534 within `>>` @ int.jl:528
        umulh   x0, x1, x0
; │└
; │┌ @ int.jl:544 within `rem`
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        ret
.Lfunc_end0:
        .size   julia_mul_hi_33966, .Lfunc_end0-julia_mul_hi_33966
; └└
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits
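The allocations in the `UInt128` benchmark presumably come from `widemul` widening 128-bit inputs to `BigInt`. One possible workaround, sketched here under the hypothetical name `mul_hi_u128`, is schoolbook multiplication on 64-bit halves so that everything stays in `UInt128`. This is an illustration only; it has not been benchmarked here.

```julia
# Sketch: high 128 bits of a 128x128-bit unsigned product, without BigInt.
# Write x = x1*2^64 + x0 and y = y1*2^64 + y0, then combine the four partial products.
function mul_hi_u128(x::UInt128, y::UInt128)
    mask = UInt128(typemax(UInt64))
    x0, x1 = x & mask, x >> 64
    y0, y1 = y & mask, y >> 64
    ll = x0 * y0          # each partial product of two 64-bit values fits in 128 bits
    lh = x0 * y1
    hl = x1 * y0
    hh = x1 * y1
    # Sum the three terms that land in bits 64..127; the overflow is the carry upward.
    mid = (ll >> 64) + (lh & mask) + (hl & mask)
    return hh + (lh >> 64) + (hl >> 64) + (mid >> 64)
end

# Compare against the BigInt-based definition on a few random inputs.
for _ in 1:1000
    x, y = rand(UInt128), rand(UInt128)
    @assert mul_hi_u128(x, y) == (widemul(x, y) >> 128) % UInt128
end
```

The intermediate `mid` is a sum of three values below 2^64, so it cannot overflow a `UInt128`, and the final sum equals the true high half, which always fits in 128 bits.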

Julia feature request

Lilith · March 26, 2024, 1:42pm

After a quick look I was unable to find an LLVM intrinsic for this.
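Even without a dedicated intrinsic, the operation can be spelled out in plain LLVM IR through `Base.llvmcall`. The following is a minimal sketch for `UInt64` only; the name `mul_hi_llvm` is hypothetical, and whether the backend lowers this to `umulh` (or the equivalent instruction on other targets) is an assumption that should be confirmed with `@code_native`.

```julia
# Sketch: widen to i128 in LLVM IR, multiply, and keep the upper 64 bits.
# The arguments x and y appear in the IR as %0 and %1.
mul_hi_llvm(x::UInt64, y::UInt64) = Base.llvmcall(
    """
    %a = zext i64 %0 to i128
    %b = zext i64 %1 to i128
    %p = mul i128 %a, %b
    %h = lshr i128 %p, 64
    %r = trunc i128 %h to i64
    ret i64 %r
    """,
    UInt64, Tuple{UInt64, UInt64}, x, y)

# Compare against the widemul-based definition on a few random inputs.
for _ in 1:1000
    x, y = rand(UInt64), rand(UInt64)
    @assert mul_hi_llvm(x, y) == (widemul(x, y) >> 64) % UInt64
end
```

This is the same widen, multiply, shift pattern as the `widemul` version, just written directly in IR, so it is not expected to help for `UInt128`.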

mkitti · March 28, 2024, 4:12am

The main reference I see for an existing `mul_hi` is from OpenCL:

https://registry.khronos.org/OpenCL/sdk/1.1/docs/man/xhtml/mul_hi.html

I see some references to `smul_lohi` and `umul_lohi` in the LLVM documentation:

https://llvm.org/doxygen/namespacellvm_1_1ISD.html#a22ea9cec080dd5f4f47ba234c2f59110a1354c6f8508d6cd697dc89a5d9a52dfd
