Calling AVX-512 intrinsics from Julia

mkitti · July 3, 2023, 8:17pm

I found an icelake-server machine, and I sorted out a few things from your original code.

- __m512i needs to be declared const when it is defined as a global.
- The intrinsic returns <64 x i1>, not i64, so an explicit bitcast to i64 is needed before returning.
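
The bitcast point is not specific to AVX-512. As a minimal sketch that should run on any little-endian machine, the same <N x i1> to integer bitcast can be tried on a plain 8-lane compare. The names Vec8xI8, vec8 and eq_mask are made up for illustration and are not from the original code.

```julia
# Hypothetical names (Vec8xI8, vec8, eq_mask); not part of the original post.
const Vec8xI8 = NTuple{8, VecElement{Int8}}

# Build an 8-lane Int8 vector from eight integers.
vec8(xs::Vararg{Integer, 8}) = ntuple(k -> VecElement(Int8(xs[k])), 8)

# Lane-wise equality yields <8 x i1>; the bitcast packs it into one i8,
# with lane 0 in the least significant bit on little-endian targets.
eq_mask(a::Vec8xI8, b::Vec8xI8) = Base.llvmcall("""
    %c = icmp eq <8 x i8> %0, %1
    %m = bitcast <8 x i1> %c to i8
    ret i8 %m
    """, UInt8, Tuple{Vec8xI8, Vec8xI8}, a, b)

# Lanes 0, 2, 4 and 6 match, so bits 0, 2, 4 and 6 are set.
m = eq_mask(vec8(1, 2, 3, 4, 5, 6, 7, 8), vec8(1, 0, 3, 0, 5, 0, 7, 0))
```

Here m should come out as 0x55. Returning the <8 x i1> value directly would not work, since Julia has no matching return type for it, which is the same reason the 512-bit wrapper needs its bitcast.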

julia> import Core.Intrinsics.llvmcall

julia> const __m512i = NTuple{64, VecElement{Int8}}
NTuple{64, VecElement{Int8}}

julia> vpshufbitqmb_512(a,b) = Core.Intrinsics.llvmcall(("""
       declare <64 x i1> @llvm.x86.avx512.vpshufbitqmb.512(<64 x i8>, <64 x i8>)
       define i64 @i64_vpshufbitqmb_512(<64 x i8> %a, <64 x i8> %b) {
         %tmp = call <64 x i1> @llvm.x86.avx512.vpshufbitqmb.512(<64 x i8> %a, <64 x i8> %b)
         %tmp2 = bitcast <64 x i1> %tmp to i64
         ret i64 %tmp2
       }
       ""","i64_vpshufbitqmb_512"), Int64, Tuple{__m512i, __m512i}, a, b)
vpshufbitqmb_512 (generic function with 1 method)

julia> x = __m512i(ntuple(_ -> rand(Int8), 64));

julia> p = __m512i(ntuple(_ -> rand(Int8), 64));

julia> vpshufbitqmb_512(x,p)
-68453262247164531

julia> versioninfo()
Julia Version 1.9.1
Commit 147bdf428c (2023-06-07 08:27 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 56 × Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, icelake-server)
  Threads: 1 on 112 virtual cores

julia> Base.BinaryPlatforms.CPUID.test_cpu_feature(Base.BinaryPlatforms.CPUID.JL_X86_avx512bitalg)
true
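
To sanity-check the mask on a machine without AVX512_BITALG, a scalar model can help. The sketch below is based on my reading of Intel's pseudocode for _mm512_bitshuffle_epi64_mask, and it assumes the first argument is the data and the second holds the bit indices; vpshufbitqmb_ref and _byte are hypothetical helper names.

```julia
# Scalar reference model (assumed semantics; helper names are hypothetical).
# Extract one raw byte from either a plain integer or a VecElement wrapper.
_byte(x::Integer) = x % UInt8
_byte(x::VecElement) = x.value % UInt8

function vpshufbitqmb_ref(a, b)
    mask = UInt64(0)
    for i in 0:7                       # eight 64-bit lanes
        # Assemble lane i of `a` from its little-endian bytes.
        q = UInt64(0)
        for j in 0:7
            q |= UInt64(_byte(a[8i + j + 1])) << (8j)
        end
        # Each byte of `b` selects one bit (index 0..63) of that lane.
        for j in 0:7
            m = _byte(b[8i + j + 1]) & 0x3f
            mask |= ((q >> m) & UInt64(1)) << (8i + j)
        end
    end
    # Match the Int64 return type of the llvmcall wrapper.
    return reinterpret(Int64, mask)
end

# Only bit 0 of lane 0 is set, and only byte 0 of `b` selects bit 0.
a = ntuple(k -> k == 1 ? 0x01 : 0x00, 64)
b = ntuple(k -> k == 1 ? 0x00 : 0x01, 64)
r = vpshufbitqmb_ref(a, b)
```

With these inputs r should be 1. On a machine that has the instruction, vpshufbitqmb_ref(x, p) == vpshufbitqmb_512(x, p) is the comparison to try; I have not run that comparison on hardware.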

I figured this out by looking at the following examples.

