jl_call - function call latency

My first benchmarks show that calling the jl_call function comes with a latency of about 100 nanoseconds (measured by calling an empty Julia function) on a Xeon Gold processor.
Is this expected?
In Julia, BenchmarkTools gives a computation time of 50 ns for the Julia function I intend to embed, so paying an extra 100 ns in the C code is annoying. Note that my whole loop takes only a few microseconds, so I am looking for any performance improvement.

Have you tried calling and timing it twice to account for compilation latency?

I wouldn’t use jl_call in a low-latency situation, since (I think?) it will still do dynamic dispatch. Instead, if possible, you should simply ask Julia for a C function pointer with the specific type signature that you want to call. Then the latency will be the same as for any other C function pointer.

See the paragraph on @cfunction in the Julia embedding manual.

I submitted a documentation PR to clarify this: additional clarification on cfunction embedding by stevengj · Pull Request #52315 · JuliaLang/julia · GitHub


Yes, the benchmark does a warmup call so that it does not include any time due to just-ahead-of-time (JAOT) compilation. The 100 ns is the mean duration over a million jl_call invocations.

OK, I will investigate the @cfunction solution. Thank you.

By using @cfunction I am getting better performance. Thanks for the tip. I had a bit of a hard time finding the right way to pass Arrays without incurring heap allocations. I found a solution using a combination of jl_alloc_array_[1,2]d to allocate the arrays and a cfunction defined as follows:

jl_value_t *cfunc = jl_eval_string("@cfunction(gemv!, Cvoid, (Ref{Array{Cdouble,1}}, Ref{Array{Cdouble,2}}, Ref{Array{Cdouble,1}}))");

The “jl_call” curve seems a bit suspicious; I will investigate why its behavior is different.