What functions are lowered to native instructions?

When reading other people’s code or Base, I often run into comments like “This function is used instead of X because it has native instructions for many architectures”.

Is there any way to know which functions behave like this?

For example: Base.bitarray

Bit Twiddling Hacks is a really good guide to the processor side of this.


Every function in Julia is compiled to a sequence of native CPU instructions.

But I guess you are asking what functions compile to a single CPU instruction. That depends mainly on the instruction set of your CPU architecture — if the instruction exists, then LLVM will typically produce it from the most obvious corresponding high-level code. You can find many guides online to the instruction sets of various CPUs (though they are not light reading!).

Alternatively, you can use the @code_native macro to see the compiled code for a given function, and you can decipher this in simple cases to figure out whether a single instruction is produced. For example, in the case you linked:

julia> f(x) = x & (x-1)
f (generic function with 1 method)

julia> @code_native f(3)
	.section	__TEXT,__text,regular,pure_instructions
	.build_version macos, 12, 0
	.globl	_julia_f_332                    ## -- Begin function julia_f_332
	.p2align	4, 0x90
_julia_f_332:                           ## @julia_f_332
; ┌ @ REPL[10]:1 within `f`
	.cfi_startproc
## %bb.0:                               ## %top
; │┌ @ int.jl:340 within `&`
	blsrq	%rdi, %rax
; │└
	retq
	.cfi_endproc
; └
                                        ## -- End function
.subsections_via_symbols

which tells you that x & (x-1) for x::Int64 is compiled on x86_64 to a single BLSR (reset lowest set bit) instruction.
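If you want to check several functions this way, the same information is available programmatically. Here is a minimal sketch, assuming Julia 1.x with the InteractiveUtils standard library; `native_asm` is a hypothetical helper name, not something from Base, and the output depends on your CPU and Julia version:

```julia
using InteractiveUtils  # provides the function form `code_native`

# Hypothetical helper: return the native assembly generated for `f`
# when called with argument types `types`, as a String.
function native_asm(f, types)
    io = IOBuffer()
    code_native(io, f, types; debuginfo=:none)
    return String(take!(io))
end

# Inspect a few bit-level functions from Base for 64-bit integers.
for f in (count_ones, trailing_zeros, leading_zeros)
    println("=== ", f, " ===")
    println(native_asm(f, (UInt64,)))
end
```

You still have to read the listing yourself: the number of instructions between the function label and the return is what tells you whether a dedicated instruction was used on your machine.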


Yep, this is the kind of thing I was referring to.

I knew about @code_native, but that only tells you about specific functions, one at a time. Is there a comprehensive guide to which functions are lowered to single instructions?

What do you mean by comprehensive? A review of all functions in Base that compile to a single instruction?

Yep? :smile:

Maybe it is not possible, I was just curious. It could be quite educational material.

I doubt there is such a thing, and that’d also be architecture-dependent. The function f above compiles to two instructions on aarch64:

julia> @code_native debuginfo=:none f(3)
        .text
        .file   "f"
        .globl  julia_f_137                     // -- Begin function julia_f_137
        .p2align        3
        .type   julia_f_137,@function
julia_f_137:                            // @julia_f_137
        .cfi_startproc
// %bb.0:                               // %top
        sub     x8, x0, #1
        and     x0, x8, x0
        ret
.Lfunc_end0:
        .size   julia_f_137, .Lfunc_end0-julia_f_137
        .cfi_endproc
                                        // -- End function
        .section        ".note.GNU-stack","",@progbits

It could be cool. It seems like a great exercise for someone with a passion for the topic :slight_smile:


It would be very difficult to guarantee that anything compiles to a single native instruction since Julia is used on many platforms. What compiles to a single x86_64 instruction may not compile to a single aarch64 instruction.

Going up a level of abstraction, there is LLVM IR, which you can view via @code_llvm. To emit LLVM IR directly from Julia, we have a facility in Core.Intrinsics.llvmcall:

https://docs.julialang.org/en/v1/base/c/#Core.Intrinsics.llvmcall

From there you can use LLVM intrinsics:
https://llvm.org/docs/LangRef.html
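As a rough illustration of what calling llvmcall looks like, here is a minimal sketch using the string form with a plain `add` instruction rather than an intrinsic. The function name `add_one` is made up for this example. In this form the argument is `%0` and the first result is `%2`, because `%1` is taken by the implicit entry block:

```julia
# Hypothetical example function: add 1 to an Int32 using hand-written LLVM IR.
# Arguments to Base.llvmcall: IR body, return type, argument types, then values.
add_one(x::Int32) = Base.llvmcall("""
    %2 = add i32 %0, 1
    ret i32 %2
    """, Int32, Tuple{Int32}, x)

println(add_one(Int32(41)))
```

This is only meant to show the shape of the call. For ordinary arithmetic like this there is no reason to use llvmcall, since `x + Int32(1)` already lowers to the same IR.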

One example of how llvmcall is used can be found in SIMD.jl.

Another example is VectorizationBase.jl.

More often than not the LLVM intrinsics correspond to one or a few native instructions.
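You can see this layering for a function in Base by looking at its LLVM IR. A minimal sketch, assuming Julia 1.x with InteractiveUtils; `llvm_ir` is a hypothetical helper name:

```julia
using InteractiveUtils  # provides the function form `code_llvm`

# Hypothetical helper: return the optimized LLVM IR for `f` called with
# argument types `types`, as a String.
function llvm_ir(f, types)
    io = IOBuffer()
    code_llvm(io, f, types; debuginfo=:none)
    return String(take!(io))
end

ir = llvm_ir(count_ones, (Int64,))
println(ir)

# count_ones on a machine integer goes through LLVM's population-count
# intrinsic, so the IR should mention `llvm.ctpop`.
println(occursin("ctpop", ir))
```

Whether that intrinsic then becomes a single instruction (for example POPCNT on x86_64) or a short sequence is decided by the target CPU, which you can check with @code_native.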