Unsafe functions performance

Liso · December 19, 2017, 9:32am

I was trying to understand how sizeof(s::String) works, to help @xiaodai improve performance…

I was experimenting with the undocumented, hidden “len” field of String (it was present in Julia 0.6 and seems to be gone in Julia 0.7.0):

julia> sizof(a) = unsafe_load(Base.unsafe_convert(Ptr{UInt}, pointer(a)-8));

Benchmark results are impressive:

julia> @btime sizeof("abc")
  0.018 ns (0 allocations: 0 bytes)
3

julia> @btime sizof("abc")
  1.740 ns (0 allocations: 0 bytes)
0x0000000000000003

Why are unsafe functions 100 (!) times slower in this test?

maleadt · December 19, 2017, 9:51am

Strange, even after adding alignment information (using Base.pointerref(..., 1, 8)) which then results in identical LLVM and native code, the performance discrepancy remains.

EDIT: on 0.6, both implementations are equally slow (i.e. the same as the slow time from the OP).

kristoffer.carlsson · December 19, 2017, 10:30am

Probably some interaction with the testing framework? IPO and all that.

ScottPJones · December 19, 2017, 10:44am

I think there’s something off with your testing, because the code generated (at least on master) is identical.

One, when sizeof(str) does exactly what you need here, and is generic, why do you want to peek at the internals?
Also, why are you calling Base.unsafe_convert, which is for converting something, when you really just need to reinterpret the pointer, i.e. reinterpret(Ptr{UInt}, pointer(a)-8)?

unsafe_convert(T, x)

Convert x to a C argument of type T where the input x must be the return value of cconvert(T, …).
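For illustration, here is a minimal sketch of the same load written with reinterpret instead of Base.unsafe_convert. The name sizof_reinterpret is made up for this example, and it assumes the String layout discussed above (a machine-word length stored immediately before the bytes that pointer(s) returns). That layout is an undocumented internal and may change between versions, so treat this as an experiment rather than a replacement for sizeof.

```julia
# Hypothetical helper, not part of Base. It reads the length word that is
# ASSUMED to sit directly before the string data (undocumented layout).
function sizof_reinterpret(s::String)
    # Keep `s` rooted while we work with a raw pointer into it.
    GC.@preserve s begin
        # Step back one machine word from the data pointer and retype it.
        p = reinterpret(Ptr{UInt}, pointer(s) - sizeof(UInt))
        Int(unsafe_load(p))
    end
end

# Compare against the generic, supported function.
sizof_reinterpret("abc") == sizeof("abc")
```

Using sizeof(UInt) rather than a hard-coded 8 keeps the offset tied to the machine word size.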

maleadt · December 19, 2017, 11:02am

Yeah, 0.018 ns is a pretty unrealistic measurement, even for a simple pointer load. Trying to bisect now.

Yeah, no. I can reproduce it perfectly on master, so there’s probably something off with the testing infrastructure itself.
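One way to check whether a sub-nanosecond result comes from the literal argument being folded into a constant is to hide the value behind a Ref and interpolate it, an idiom described in the BenchmarkTools manual. The sketch below assumes the BenchmarkTools package is installed; the variable names are only illustrative, and the actual timings depend on the Julia version and machine.

```julia
using BenchmarkTools

s = "abc"

# Literal argument: the compiler may evaluate the whole call at compile time.
t_literal = @belapsed sizeof("abc")

# Value hidden behind a Ref and dereferenced inside the benchmark, so the
# argument is not visible as a compile-time constant.
t_ref = @belapsed sizeof($(Ref(s))[])

println("literal: ", t_literal * 1e9, " ns")
println("via Ref: ", t_ref * 1e9, " ns")
```

If the two numbers differ by orders of magnitude, the fast one is most likely measuring a constant rather than the load itself.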

ScottPJones · December 19, 2017, 11:28am

That’s why I like to look at raw numbers from time_ns() for benchmarking very small things!
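A minimal sketch of that kind of hand-rolled measurement, using only Base, might look like the following. The helper name ns_per_call and the iteration count are made up for this example, and the compiler may still hoist or fold the call out of the loop, so the result is a rough figure at best.

```julia
# Hypothetical helper: average wall-clock nanoseconds per call of f(x).
function ns_per_call(f, x; n::Int = 1_000_000)
    f(x)                      # warm-up call so compilation is not timed
    acc = zero(UInt)          # accumulate results so the calls are not dead code
    t0 = time_ns()
    for _ in 1:n
        acc += f(x) % UInt    # normalise Int/UInt results to UInt
    end
    t1 = time_ns()
    return (t1 - t0) / n, acc
end

ns, acc = ns_per_call(sizeof, "abc"; n = 1000)
println(ns, " ns per call (checksum ", acc, ")")
```

Timing many calls per sample matters here because a single time_ns() pair costs far more than the operation being measured.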

maleadt · December 19, 2017, 11:39am

In this case time_ns does indeed show consistent results (of ~19ns, but that’s to be expected since BenchmarkTools does multiple evals/sample). However, I’d advise against recommending it, because BenchmarkTools protects against so many other pitfalls that are common with newcomers. @btime is a vastly better tool.

I’ve bisected the issue to 1669d532de7434108f1092f34361166737706ba5 from #24362, confirming @kristoffer.carlsson’s hunch.

ScottPJones · December 19, 2017, 12:27pm

I wasn’t intending to recommend it for novice users. In my case, though, I’ve had 30+ years of extensive benchmarking experience, and for that reason I like to get all of the raw data and munge it myself (which Julia makes much nicer / easier than in any other language I’ve worked on before!)

Good hunch!!!
