What functions are lowered to native instructions?

nandoconde · September 24, 2022, 12:10pm

Many times, when seeing others’ code or Base, I run into comments like “This function is used instead of X because it has native instructions for many architectures”.

Is there any way to know which functions behave like this?

E.g.: Base.bitarray

Oscar_Smith · September 24, 2022, 12:53pm

Bit Twiddling Hacks is a really good guide to the processor side of this.

stevengj · September 24, 2022, 1:44pm

Every function in Julia is compiled to a sequence of native CPU instructions.

But I guess you are asking what functions compile to a single CPU instruction. That depends mainly on the instruction set of your CPU architecture — if the instruction exists, then LLVM will typically produce it from the most obvious corresponding high-level code. You can find many guides online to the instruction sets of various CPUs (though they are not light reading!).

Alternatively, you can use the @code_native macro to see the compiled code for a given function, and you can decipher this in simple cases to figure out whether a single instruction is produced. For example, in the case you linked:

julia> f(x) = x & (x-1)
f (generic function with 1 method)

julia> @code_native f(3)
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 12, 0
	.globl	_julia_f_332                    ## -- Begin function julia_f_332
	.p2align	4, 0x90
_julia_f_332:                           ## @julia_f_332
; ┌ @ REPL[10]:1 within `f`
	.cfi_startproc
## %bb.0:                               ## %top
; │┌ @ int.jl:340 within `&`
	blsrq	%rdi, %rax
; │└
	retq
	.cfi_endproc
; └
                                        ## -- End function
.subsections_via_symbols

which tells you that x & (x-1) for x::Int64 is compiled on x86_64 to a single BLSR (reset lowest set bit) instruction.
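To check this without reading through the directives and comments by eye, a rough sketch is to capture the output of code_native and drop everything that is not an instruction. The helper name `native_instructions` is hypothetical, not part of Julia, and the filtering is only a heuristic. The result depends on your CPU and Julia version (some builds add frame-pointer setup or padding around the interesting instruction).

```julia
using InteractiveUtils  # provides code_native (loaded automatically in the REPL)

# Hypothetical helper, not part of Julia: return the lines of the native
# assembly that look like instructions, skipping assembler directives,
# comments, and labels. This is a heuristic, not a parser.
function native_instructions(f, types)
    io = IOBuffer()
    code_native(io, f, types; debuginfo=:none)
    lines = strip.(split(String(take!(io)), '\n'))
    filter(lines) do l
        !isempty(l) &&
            !startswith(l, ".") &&        # directives such as .text
            !startswith(l, ";") &&        # Julia's source-location comments
            !startswith(l, "#") &&        # x86 assembler comments
            !startswith(l, "//") &&       # aarch64 assembler comments
            !occursin(r"^[\w.$]+:", l)    # labels such as julia_f_137:
    end
end

f(x) = x & (x - 1)

# Print whatever instructions remain for f(::Int64) on this machine.
foreach(println, native_instructions(f, (Int64,)))
```

If only one line besides the return is printed, the function body compiled to a single instruction on your machine.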

nandoconde · September 24, 2022, 3:20pm

Yep, this is the kind of thing I was referring to.

I knew about @code_native, but it only answers the question for one specific function at a time. Is there a comprehensive guide to which functions are lowered to single instructions?

johnmyleswhite · September 24, 2022, 3:39pm

What do you mean by comprehensive? A review of all functions in Base that compile to a single instruction?

nandoconde · September 24, 2022, 3:51pm

Yep?

Maybe it is not possible, I was just curious. It could be quite educational material.

giordano · September 24, 2022, 4:01pm

I doubt there is such a thing, and that’d also be architecture-dependent. The function f above compiles to two instructions on aarch64:

julia> @code_native debuginfo=:none f(3)
        .text
        .file   "f"
        .globl  julia_f_137                     // -- Begin function julia_f_137
        .p2align        3
        .type   julia_f_137,@function
julia_f_137:                            // @julia_f_137
        .cfi_startproc
// %bb.0:                               // %top
        sub     x8, x0, #1
        and     x0, x8, x0
        ret
.Lfunc_end0:
        .size   julia_f_137, .Lfunc_end0-julia_f_137
        .cfi_endproc
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits

johnmyleswhite · September 24, 2022, 4:05pm

It could be cool. It seems like a great exercise for someone with a passion for the topic.

mkitti · September 24, 2022, 5:54pm

It would be very difficult to guarantee that anything compiles to a single native instruction since Julia is used on many platforms. What compiles to a single x86_64 instruction may not compile to a single aarch64 instruction.

Going up a level of abstraction, there is LLVM IR, which you can view via @code_llvm. To emit LLVM IR directly, we have a facility in Core.Intrinsics.llvmcall:

https://docs.julialang.org/en/v1/base/c/#Core.Intrinsics.llvmcall

From there you can use LLVM intrinsics:
https://llvm.org/docs/LangRef.html
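As a minimal sketch of what llvmcall looks like in practice, the function below writes the x & (x-1) example from earlier in this thread directly as LLVM IR. The name `reset_lowest_bit` is made up for illustration, and the IR is hard-coded for Int64. Ordinary Julia code already compiles to equivalent IR, so this only shows the mechanism.

```julia
# Hypothetical example: x & (x - 1) written by hand as LLVM IR.
# The arguments are available as the unnamed SSA values %0, %1, ...
# The IR must be given as a string literal.
function reset_lowest_bit(x::Int64)
    Base.llvmcall("""
        %m = add i64 %0, -1
        %r = and i64 %0, %m
        ret i64 %r
        """, Int64, Tuple{Int64}, x)
end

# 12 is 0b1100; clearing the lowest set bit leaves 0b1000.
println(reset_lowest_bit(Int64(12)))
```

Running @code_native on `reset_lowest_bit` shows what your CPU ends up with for this IR.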

One example of how llvmcall is used can be found in SIMD.jl:

github.com/eschnett/SIMD.jl/blob/master/src/LLVM_intrinsics.jl

# LLVM operations and intrinsics
module Intrinsics

# Note, that in the functions below, some care needs to be taken when passing
# Julia Bools to LLVM. Julia passes Bools as LLVM i8 but expect them to only
# have the last bit as non-zero. Failure to comply with this can give weird errors
# like false !== false where the first false is the result of some computation.

# Note, no difference is made between Julia usigned integers and signed integers
# when passed to LLVM. It is up to the caller to make sure that the correct
# intrinsic is called (e.g uitofp vs sitofp).

import ..SIMD: SIMD, VE, LVec, FloatingTypes
# Include Bool in IntegerTypes
const IntegerTypes = Union{SIMD.IntegerTypes, Bool}

const d = Dict{DataType, String}(
    Bool         => "i8",
    Int8         => "i8",
    Int16        => "i16",

This file has been truncated.

Another example is VectorizationBase.jl:

github.com/JuliaSIMD/VectorizationBase.jl/blob/master/src/llvm_intrin/unary_ops.jl

function sub_quote(W::Int, T::Symbol, fast::Bool)::Expr
  vtyp = vtype(W, T)
  instrs = "%res = fneg $(fast_flags(fast)) $vtyp %0\nret $vtyp %res"
  quote
    $(Expr(:meta, :inline))
    Vec($LLVMCALL($instrs, _Vec{$W,$T}, Tuple{_Vec{$W,$T}}, data(v)))
  end
end

@generated vsub(v::Vec{W,T}) where {W,T<:Union{Float32,Float64}} =
  sub_quote(W, JULIA_TYPES[T], false)
@generated vsub_fast(v::Vec{W,T}) where {W,T<:Union{Float32,Float64}} =
  sub_quote(W, JULIA_TYPES[T], true)

@inline vsub(v) = -v
@inline vsub_fast(v) = Base.FastMath.sub_fast(v)
@inline vsub(v::Vec{<:Any,<:NativeTypes}) = vsub(zero(v), v)
@inline vsub_fast(v::Vec{<:Any,<:UnsignedHW}) = vsub(zero(v), v)
@inline vsub_fast(v::Vec{<:Any,<:NativeTypes}) = vsub_fast(zero(v), v)

This file has been truncated.

More often than not the LLVM intrinsics correspond to one or a few native instructions.
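One way to see this connection is to capture the output of code_llvm and look for an intrinsic's name. The sketch below assumes that your Julia version lowers count_ones on Int64 to the llvm.ctpop intrinsic, which as far as I know is the usual case but is not something the language guarantees.

```julia
using InteractiveUtils  # provides code_llvm (loaded automatically in the REPL)

# Capture the LLVM IR for count_ones(::Int64) as a string.
io = IOBuffer()
code_llvm(io, count_ones, (Int64,); debuginfo=:none)
ir = String(take!(io))

print(ir)

# Assumption: the IR contains a call to the llvm.ctpop.i64 intrinsic.
println(occursin("llvm.ctpop", ir))
```

On x86_64 CPUs with POPCNT support this intrinsic usually ends up as a single popcnt instruction, which you can confirm with @code_native count_ones(1).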
