Can Julia achieve fine grained control of performance without sacrificing ease of use?

Tarny_GG_Channie · June 16, 2023, 8:12am

Julia was designed for good overall performance, being LLVM-compiled with good type inference.
However, Julia has some subtle issues when fine-grained control over the inner workings is needed, for example:
- What if you use data that can be either a pointer or not a pointer? (Using some bits to indicate whether it’s a pointer or not.)
- What if runtime dynamic dispatch really is needed?
- What about fixed-size mutable arrays?
- What about mutable objects living on the stack?
- What about the memory model needed to use atomic ops for lock-free data structures?
- Etc.
Some of them are being worked on.
The point is that Julia is fast in general, but there are still subtle features needed to implement certain things with maximum performance.
This implies that even if, hypothetically, Julia could be as fast as C++ for code with the same semantics, Julia would still lose to C++ because it misses out on opportunities to optimize the code.
Do you think this will be solved in the future?
Or would it be too much to ask? Even C/C++ code sometimes needs to resort to assembly. Julia can do that too if needed, but it would be better if Julia had features for fine-grained control over performance. Can it do that without sacrificing ease of use?

sgaure · June 16, 2023, 8:44am

I’ve come across some of these. Atomics work reasonably well now, I think. Dynamic dispatch could be improved, or at least its documentation could. The canonical example is a vector of geometric objects, e.g. circle, triangle, square, etc. Or, in the case I came across, a vector of transformations, each of them a subtype of an AbstractTransformation.
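To illustrate the atomics remark, here is a minimal sketch assuming Julia 1.7 or newer, where struct fields can be declared `@atomic`. The `AtomicCounter` type and `count_to` function are hypothetical names made up for this example, not something from the thread.

```julia
# Hypothetical example type: a counter whose field is accessed atomically.
mutable struct AtomicCounter
    @atomic n::Int
end

# Increment the counter from (possibly) several threads.
function count_to(c::AtomicCounter, iters::Int)
    Threads.@threads for i in 1:iters
        @atomic c.n += 1          # atomic read-modify-write
    end
    return @atomic c.n            # atomic load
end

c = AtomicCounter(0)
total = count_to(c, 1000)

# Compare-and-swap: set the field to 0 only if it currently holds 1000.
result = @atomicreplace c.n 1000 => 0
```

`@atomicreplace` returns the old value together with a success flag, which is the usual building block for lock-free retry loops.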

I’m not sure about the internals now, but at the time of compiling a function taking such an argument, there exists a fixed number of subtypes, so it is possible to enumerate them and have a simple and fast lookup for dispatch on the elements of the vector. If there are more containers of abstract type it could be complicated. Likewise with complicated parametric types. Perhaps it’s possible to annotate arguments that should be treated in this way? It could of course lead to problems if further subtypes are created later, with invalidations of such functions?

As far as I know, something like this is already being done, but it is not documented in the performance tips.
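A rough sketch of the idea, under the assumption that the set of subtypes is small and known: the types `Scale` and `Shift` and the functions below are hypothetical, invented for illustration. The first loop relies on runtime dispatch over an abstractly typed vector; the second enumerates the known subtypes by hand so each branch can be statically dispatched.

```julia
abstract type AbstractTransformation end

struct Scale <: AbstractTransformation
    factor::Float64
end

struct Shift <: AbstractTransformation
    offset::Float64
end

apply(t::Scale, x::Float64) = t.factor * x
apply(t::Shift, x::Float64) = t.offset + x

# Element type is abstract, so each `apply` call is resolved at run time.
function apply_all(ts::Vector{AbstractTransformation}, x::Float64)
    for t in ts
        x = apply(t, x)
    end
    return x
end

# Manual enumeration of the known subtypes: inside each branch the
# concrete type of `t` is known, so the call can be dispatched statically.
function apply_all_split(ts, x::Float64)
    for t in ts
        if t isa Scale
            x = apply(t, x)
        elseif t isa Shift
            x = apply(t, x)
        else
            x = apply(t, x)   # fallback: ordinary dynamic dispatch
        end
    end
    return x
end

ts = AbstractTransformation[Scale(2.0), Shift(1.0)]
y_dynamic = apply_all(ts, 3.0)
y_split = apply_all_split(ts, 3.0)

# A small Union element type is another option; the compiler can often
# generate similar branches itself for such containers.
ts_union = Union{Scale,Shift}[Scale(2.0), Shift(1.0)]
y_union = apply_all_split(ts_union, 3.0)
```

The manual version has the drawback mentioned above: a subtype added later only hits the slow fallback branch until the function is updated.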

Benny · June 16, 2023, 8:44am

This is a bit too broad and has too many different examples, but in broad strokes, a high-level language like Julia will never have the same level of control as C++. Ease of use and safety come from leaving things to a compiler and garbage collector; there are far fewer user mistakes that can get in the way.

Mutable objects on the stack is a good example of something that is very hard to do safely, and the compiler does it as well as most people. Bear in mind that mutable here means data shared by multiple variables, not just changing data at a location in the stackframe, which can be done with variable reassignments to immutable instances.
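A small sketch of "changing data by reassigning immutable instances", using a tuple. `bump_all` is a hypothetical name; `Base.setindex` is the non-exported Base function that returns a new tuple with one element replaced.

```julia
# "Mutating" a fixed-size collection by rebinding a variable to new
# immutable values. No other variable can observe the change, which is
# why the compiler is free to keep the data out of the heap.
function bump_all(t::NTuple{4,Int})
    for i in 1:4
        # Builds a new tuple with element i replaced, then rebinds `t`.
        t = Base.setindex(t, t[i] + 1, i)
    end
    return t
end

original = (1, 2, 3, 4)
bumped = bump_all(original)
```

Here `original` is unchanged after the call, which is the difference from a truly mutable object shared between variables.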

As for the features you listed, you don’t really need close-to-the-metal control to pull some of these off:

- Not really sure what data that “can either be a pointer or not a pointer” means exactly; I’ve never heard of a data structure like that. But if you mean holding data inline in a struct at one field versus a Ref at another, yes, you can make structs like that in Julia. I’ve seen it done to implement sometimes-inline arrays.
- You can cause runtime dispatch pretty easily with type instability. If you can do static dispatch, I see no reason to opt into runtime dispatch on purpose.
- StaticArrays.MArray is a fixed-size mutable array.

Sukera · June 16, 2023, 9:57am

That is not true - you can do pointer shenanigans with Ptr and bit fiddling in Julia just as well as in any other “low level” language. You just don’t get any support from the Julia runtime for managing that and you’re on your own - but then again, you don’t in C either, and some perceive the added functionality/safety that C++ provides as “bloat” (which I disagree with).
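As a minimal sketch of that kind of bit fiddling, here is a tagged word that holds either a pointer (low bit 0) or a small inline integer (low bit 1). The `Tagged` type and its helper functions are hypothetical names for this example. It assumes the pointer comes from `Libc.malloc`, so its low bit is free, and that inline values are non-negative and fit in one bit less than a machine word. The memory is managed by hand, outside the garbage collector.

```julia
# One machine word: low bit 1 means "inline integer", 0 means "pointer".
struct Tagged
    bits::UInt
end

# Assumes 0 <= x and x fits in (word size - 1) bits.
inline(x::Integer) = Tagged((UInt(x) << 1) | 0x1)

# Assumes p is at least 2-byte aligned (true for malloc'd memory).
boxed(p::Ptr{Int}) = Tagged(reinterpret(UInt, p))

isinline(t::Tagged) = (t.bits & 0x1) == 0x1

function value(t::Tagged)
    if isinline(t)
        return Int(t.bits >> 1)
    else
        return unsafe_load(reinterpret(Ptr{Int}, t.bits))
    end
end

# Manual allocation: nothing here is tracked by the Julia runtime.
p = convert(Ptr{Int}, Libc.malloc(sizeof(Int)))
unsafe_store!(p, 42)

a = inline(7)
b = boxed(p)
va = value(a)
vb = value(b)

Libc.free(p)   # after this, `b` must not be read again
```

Both variants occupy a single word, and the caller is fully responsible for the lifetime of the pointed-to memory.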

I do that often, and I don’t find it more cumbersome than inline assembly in C.

Do you have some concrete example you’re thinking of?

Tarny_GG_Channie · June 16, 2023, 10:48am

What I meant is that it would be better if one could implement some low-level stuff without resorting to assembly every time. One example would be an array whose size is known at run time, not compile time, but does not change after it has been constructed. Having this primitive array type would provide a basis for many data structures without needing mechanisms to expand the array and so on.

Benny · June 16, 2023, 11:10am

There are some low-level features for interfacing with C, but not all of them are covered, e.g. mutables on the stack or static variables.

Every Array except a Vector is like this.
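A quick sketch of that point, using only Base: a `Matrix` gets its size at run time, its elements can be modified, but the in-place growing functions that exist for `Vector` have no method for it. The variable names are just for illustration.

```julia
n = 3                               # size only known at run time
A = Matrix{Float64}(undef, n, 2)    # a 3x2 array
fill!(A, 0.0)
A[1, 2] = 5.0                       # elements are mutable

v = [1.0, 2.0, 3.0]
push!(v, 4.0)                       # a Vector can grow in place

# `applicable` checks whether a method exists for these arguments.
vector_can_grow = applicable(push!, v, 1.0)
matrix_can_grow = applicable(push!, A, 1.0)
```

Functions such as `reshape` or `vcat` still work on a `Matrix`, but they return another array object instead of resizing `A` itself.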

CameronBieganek · June 16, 2023, 11:48am

How do you use assembly in Julia? (Honest question.)

Sukera · June 16, 2023, 11:56am

Frequently enough that it’s very convenient to abstract away and not have to think about it. While the inline assembly in inline LLVM-IR that I use is a bit rarer than just inline LLVM-IR, if you use SIMD.jl, pretty much every use of that uses inline LLVM-IR.

The general pattern is something like

    Base.llvmcall("""
        call void asm sideeffect "<asm goes here>", ""()
        ret void
        """,
        Nothing,              # return type
        Tuple{argtypes...},   # argument types, given as a Tuple type
        args...)              # argument values

i.e., using llvmcall to have LLVM-IR, which then does the actual assembly call inline in its IR.

I really wouldn’t recommend doing that outside of very narrow circumstances (say, writing a “disable interrupts” function for specific use on AVR microcontrollers) because the assembly is (obviously) architecture specific and not portable at all. Still, it’s very possible to do.

gbaraldi · June 16, 2023, 11:56am

llvmcall is the easiest way. For most cases llvm ir is enough, but from llvm ir you can do inline assembly as well.
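For the plain LLVM IR case, a minimal runnable sketch looks like the following. `add_i32` is a hypothetical name; the IR refers to the two arguments as `%0` and `%1`, and the first unnamed result is `%3` because `%2` is taken by the implicit entry block.

```julia
# Add two 32-bit integers using a hand-written LLVM IR body.
function add_i32(x::Int32, y::Int32)
    return Base.llvmcall("""
        %3 = add i32 %1, %0
        ret i32 %3
        """,
        Int32,                 # return type
        Tuple{Int32, Int32},   # argument types
        x, y)                  # argument values
end

s = add_i32(Int32(2), Int32(3))
```

Unlike inline assembly, this stays portable across the architectures LLVM targets, since only IR is involved.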

Sukera · June 16, 2023, 11:59am

I’m still not quite sure I follow - nothing you’ve mentioned requires writing assembly at all. Do you mean specializing the resulting code on the size of that array…?

lmiq · June 16, 2023, 12:00pm

This will always be true to some level, not only in Julia. You can find many (exhaustive) performance comparisons of Julia with other languages (and among other languages) in which the differences in performance end up being at the level of specific compiler flags, which intrinsic math function is being called, and so on. Generally these things are only required in very localized portions of the code, and the overall ease of implementing good algorithms is much more important for performance as a whole.

Where I think Julia seems to be somewhat slower than C++ (specifically) is when dynamic dispatch is required. From what I’ve seen here trying to match C++ performance in this case can be cumbersome, and I don’t remember having a standard go-to solution.

Large stack allocated arrays, for example, would be (will be?) a nice addition to the language, but one can get over that rather easily with preallocation.

CameronBieganek · June 16, 2023, 12:10pm

There is a recent proposal from Jeff to add a memory buffer type which can be used for defining array types in Julia. I don’t have a link handy.

AMJ · June 16, 2023, 12:39pm

https://gist.github.com/JeffBezanson/a25dde3bebb5a734af87bb5ddcf31fb0

buffer.md

# Julep: Redesigning `Array` using a new lower-level container type

## History and motivation

From its inception, Julia has had an `Array` type intended to be used any time you need
"a bunch of things together in row". This is one of the most important and most unusual
types in the language. When `Array` was designed, it was believed (possibly by multiple people,
but Jeff is willing to take the blame) that the easiest-to-use interface would be a
single array type that does most of what users need. Many Julia programmers now feel that
the language and community have outgrown this trade-off, and problems are emerging:

This file has been truncated.

frylock · June 16, 2023, 12:43pm

You use Julia on AVR micros!? (or was that a hypothetical example?) I’m impressed, Julia seems sort of large to run on something like that - but then again my only real experience with AVR is Arduino.

Sukera · June 16, 2023, 1:31pm

Well, not all of Julia - the whole compiler, runtime & task scheduling obviously doesn’t fit on microcontrollers. It’s just a small subset and there are lots of difficulties, which will be explored later this year in a JuliaCon talk.
