I'm trying to write something with the best possible performance (for a library). I keep wishing I was writing C! Is that normal? Should I just write C?


Thanks for that. I sort of knew I should be sending something other than NULL to the char **endptr parameter and checking it when I was writing the function, but I didn’t do anything about it. I’ll fix it. Thanks again for spotting it and mentioning it. (OK, I implemented it, and man is it awkward to deal with char** using ccall, but at least it works.)
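For anyone else fighting with char** in ccall, here is a minimal sketch of one way to do it. It is not the code from my library: `c_strtod` is a made-up name, and it assumes the C library's `strtod` can be resolved by symbol name on your platform (its parsing also depends on the C locale).

```julia
# Hypothetical helper: parse a leading Float64 from `s` with C's strtod and
# report how many bytes were consumed, using the char **endptr argument.
function c_strtod(s::String)
    # A Ref holding a pointer plays the role of `char *end; ... &end` in C.
    endptr = Ref{Ptr{UInt8}}(C_NULL)
    GC.@preserve s begin
        start = pointer(s)  # Julia Strings are NUL-terminated in memory
        val = ccall(:strtod, Cdouble, (Ptr{UInt8}, Ptr{Ptr{UInt8}}), start, endptr)
        consumed = Int(endptr[] - start)
    end
    return val, consumed
end

val, consumed = c_strtod("3.5abc")
```

If `consumed` comes back as 0, nothing was parsed, which is exactly the case that passing NULL for endptr cannot detect.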

I’ll look at seeing what I can do with the SIMD in Julia first (great idea, by the way). Writing ASM or LLVM IR or ASM-like C is currently above my paygrade.


See also this thread on writing fast string processing code by operating in chunks, including both SIMD and non-SIMD examples: https://github.com/JuliaLang/julia/pull/30400 … using these techniques is sometimes harder if you don’t assume valid UTF-8 (isvalid strings), however.

(You can work on 64-bit chunks without SIMD and get quite a speedup over byte-by-byte processing, but not as much as with AVX instructions on 512-bit chunks.)
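As a rough illustration of the non-SIMD chunking idea (a sketch, not code taken from the linked thread), here is an ASCII check that tests 8 bytes at a time by loading them as a single UInt64. The function name `isascii_chunked` is made up for this example; Base already provides `isascii`.

```julia
# Sketch: check that every byte of `s` is < 0x80, 8 bytes per iteration.
function isascii_chunked(s::String)
    n = ncodeunits(s)
    mask = 0x8080808080808080  # high bit of each of the 8 bytes
    i = 1
    GC.@preserve s begin
        p = pointer(s)
        while i + 7 <= n
            # Load 8 bytes as one 64-bit word; any high bit set means non-ASCII.
            chunk = unsafe_load(convert(Ptr{UInt64}, p + (i - 1)))
            (chunk & mask) == 0 || return false
            i += 8
        end
    end
    # Handle the remaining (fewer than 8) bytes one at a time.
    while i <= n
        codeunit(s, i) < 0x80 || return false
        i += 1
    end
    return true
end

isascii_chunked("plain old ascii text")
```

Because the mask test only looks at the high bit of each byte, the byte order within the word does not matter here.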


Didn’t we have an “unsafe”, allocation-free, pointer-based string type in some package? I forgot where I saw it, but maybe something like that would enable more Julianic code here (via AbstractString and substrings instead of low-level buffer operations), without high-frequency memory allocation?


Very interesting reading. I’ll have to dig into this more over the weekend.

Interesting thought. I don’t see how you’re going to get out of allocations, though, if you’re creating a lot of new substring instances (even if they just contain pointers), but maybe the JIT can make that go away in the right circumstances.

I can comprehend how to write C-like Julia now that is fast, but I still have no idea what LLVM and the compiler are doing behind the scenes, so I’m not sure what optimizations I can depend on. (On the other hand, I have no clue how GCC optimizes things either, so maybe I’m in a similar place with C anyway).


Because for a pointer-based (“unsafe”) string type (has to be used with care!), the instances would be immutable structs that are stack-allocated. So GC will never know about them. I wrote a package that does this for arrays (UnsafeArrays.jl), and I think I saw the same for Strings somewhere, but I forgot in which package.
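To make the idea concrete, here is a minimal sketch of such a pointer-based view. It is not the API of UnsafeArrays.jl or of any string package; `UnsafeBytes`, `subview`, `count_byte` and `count_in_range` are names invented for this example, and there is no bounds checking at all.

```julia
# Hypothetical pointer-based byte view: an immutable struct holding only
# a raw pointer and a length, so it is an isbits type.
struct UnsafeBytes
    ptr::Ptr{UInt8}
    len::Int
end

Base.length(v::UnsafeBytes) = v.len
Base.getindex(v::UnsafeBytes, i::Int) = unsafe_load(v.ptr, i)  # no bounds check!

# Creating a sub-view copies nothing; it only does pointer arithmetic.
subview(v::UnsafeBytes, r::UnitRange{Int}) =
    UnsafeBytes(v.ptr + (first(r) - 1), length(r))

function count_byte(v::UnsafeBytes, b::UInt8)
    n = 0
    for i in 1:length(v)
        n += (v[i] == b)
    end
    return n
end

# The view is only valid while the parent string is kept alive,
# hence the GC.@preserve around every use.
function count_in_range(s::String, r::UnitRange{Int}, b::UInt8)
    GC.@preserve s begin
        whole = UnsafeBytes(pointer(s), ncodeunits(s))
        return count_byte(subview(whole, r), b)
    end
end

count_in_range("a b c d", 3:7, UInt8(' '))
```

The "use with care" part is that nothing stops a view from outliving the string it points into, so views should never escape the `GC.@preserve` block.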


Side track: How do I know when Julia is allocating something on the stack or the heap? This topic is extremely relevant to my interests.


In general, mutable structs and instances of Array are heap-allocated. Primitive data types and immutable structs that are free from references to heap-allocated values are stack-allocated.

You can use isbitstype, isbits and Base.isbitsunion to check if instances of types will be stack-allocated.
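A quick sketch of those checks on a few made-up types (`Point2D`, `MPoint` and `Named` are just for illustration). Note that these functions only tell you whether a type is plain bits; they are a hint about allocation behavior, not a guarantee.

```julia
struct Point2D          # immutable, only primitive fields
    x::Float64
    y::Float64
end

mutable struct MPoint   # mutable
    x::Float64
    y::Float64
end

struct Named            # immutable, but holds a reference to a String
    name::String
    x::Float64
end

isbitstype(Point2D)                     # true
isbitstype(MPoint)                      # false
isbitstype(Named)                       # false
isbits(Point2D(1.0, 2.0))               # true (isbits takes a value, not a type)
Base.isbitsunion(Union{Int, Float64})   # true
```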

Disclaimer: This may be a bit simplified, I seem to remember that @yuyichao once wrote here that isbits and stack-allocated isn’t strictly the same thing, technically.


They are not the same practically.


When @time or @btime or julia --track-allocation reports “allocations”, those refer to heap-allocations.

Constructing an instance of an isbitstype struct can be done without any heap allocation, while non-isbitstype instances may require heap allocation. This is why, for example, you can construct SVector{3, Int} or similar from StaticArrays.jl all day long without incurring any heap allocation.
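A small sketch of how one might check this with `@allocated`, using made-up types rather than StaticArrays.jl. The exact byte counts depend on the Julia version and on what the compiler manages to optimize, so treat the numbers as something to measure, not to assume. The `@noinline` constructor is there on purpose: it forces the mutable object to escape, so it has to live on the heap.

```julia
struct Vec3             # immutable and isbits
    x::Int
    y::Int
    z::Int
end

mutable struct MVec3    # mutable counterpart
    x::Int
    y::Int
    z::Int
end

function sum_immutable(n)
    total = 0
    for i in 1:n
        v = Vec3(i, i + 1, i + 2)   # no heap allocation needed
        total += v.x + v.y + v.z
    end
    return total
end

# Returning a freshly built mutable object from a non-inlined function
# makes it escape, so each call heap-allocates.
@noinline make_mutable(i) = MVec3(i, i + 1, i + 2)

function sum_mutable(n)
    total = 0
    for i in 1:n
        v = make_mutable(i)
        total += v.x + v.y + v.z
    end
    return total
end

# Warm up first so compilation is not counted.
sum_immutable(10); sum_mutable(10)

bytes_immutable = @allocated sum_immutable(1000)
bytes_mutable = @allocated sum_mutable(1000)
```

On the versions I would expect this to be run on, `bytes_immutable` should be zero and `bytes_mutable` clearly non-zero, but as discussed in this thread that is compiler behavior worth re-checking after upgrades.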


@yuyichao, I feared you might say something like that … :wink:

That’s why I mentioned you, I had a feeling that I wasn’t entirely correct. But in regard to @ninjaaron’s question: If isbitstype returns true, then instances of the type will at least usually be stack-allocated, correct?

I’d also like to learn a bit more about the deeper issues here (e.g. why and when isbits and stack-allocation may not be the same, even practically). Is there any source you can recommend to read more on this?


No. But if it’s type stable then mostly yes.

This (stack allocation) is completely in the land of compiler optimization so it’s mostly about what information the compiler has access to and what information it is able to use. It’s a moving target so there won’t be a stable document about it. I doubt there will be a very detailed implementation document other than code comments either, since given what the compiler knows now, it’s actually easier to understand what the compiler should do based on the intention of the code…


What I think I’m hearing is that objects which can be statically analysed to have compile-time guarantees about size should generally be allocated on the stack, and that I should check to make sure in places where it counts (and check again between updates).


No. Basically all objects have static size at allocation site. Array and String are the only exceptions other than other extremely special types like Task, Module etc. Escape (and in general usage) analysis is what matters.

Not sure what this means…


Oh, sure, I had assumed type stability - sorry, @ninjaaron, should have mentioned that. But I guess your code was already type stable, right?


I try my best! (in this case, yes)


I meant check between releases to make sure you’re not getting more allocations than before (cf. moving target).