Blog post: How to optimise Julia code: A practical guide

jakobnissen · June 9, 2022, 7:34am

Hello all!

I made a new blog post on how to optimise Julia code. Any comments are welcome!

https://viralinstruction.com/posts/optimise/

maxkapur · June 9, 2022, 12:15pm

Thank you for this interesting and informative post. FYI, there is a very small typo in the data locality section, where you define the struct

struct FooArray
 as::Vector{Int32}
 bs::Vector{Int16}
end

In context, it should be bs::Vector{UInt16}.

I have a question about data locality principles but I will make a separate thread and @ you rather than clutter up this one.

patrick · June 9, 2022, 4:43pm

This is fantastic! Thank you for writing it.

haberdashPI · June 9, 2022, 5:12pm

Great to see this topic getting some love. Thanks for your time and energy.

I’m surprised you recommend reviewing algorithms as 3rd on your list. It can easily contain the lowest hanging fruit. If you can easily go from O(N^3) to O(N log N) with a change to one top-level function, making your code type stable might be unnecessary. Of course, YMMV.

fortunewalla · June 10, 2022, 2:27am

I am wondering if you would consider posting a copy of your article on forem.julialang.org. It is an official Julia forum meant for Julia-specific long-form content such as this, and it has built-in SEO for original links. Posting there would give the article more traction, not to mention be of great benefit to the community. Thanks!

maxkapur · June 10, 2022, 3:33am

I wonder if this editorial choice can be explained by the desire to avoid scope creep. Improving your algorithm is a great way to optimize code in any language, but the subject at hand is how to optimize Julia code. For people like me who have a high-level understanding of algorithms and computational complexity but are just bad at coding, this kind of tutorial is a wonderful resource.

jakobnissen · June 10, 2022, 8:01pm

The reason choosing the right algorithms is number 3 is that the first two are more important.

The first one is type stability. I consider this as much an aspect of code quality as of performance. If possible, one should make one’s code type stable. And if one isn’t sure whether the code is as type stable as it ought to be, one has no business optimising it.
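As a minimal sketch of what "type stable" means here (the two function names are made up for illustration and are not from the blog post), compare a function whose return type depends on a runtime value with one whose return type follows from the argument type alone:

```julia
# Type-unstable: returns a Float64 on one branch and an Int on the other.
unstable_relu(x::Float64) = x > 0 ? x : 0

# Type-stable: both branches return a value of the same type as the input.
stable_relu(x::Float64) = x > 0 ? x : zero(x)

# Base.return_types reports what inference concludes for the given argument types.
unstable_ret = only(Base.return_types(unstable_relu, (Float64,)))
stable_ret = only(Base.return_types(stable_relu, (Float64,)))

println("unstable_relu infers: ", unstable_ret)
println("stable_relu infers:   ", stable_ret)
println("unstable result is concrete? ", isconcretetype(unstable_ret))
```

In interactive use, `@code_warntype unstable_relu(1.0)` shows the same information with the non-concrete return type highlighted.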

The second one is profiling. It doesn’t matter if some part of your code is O(n^3), if your function call spends 0.01% runtime there. Making it O(n log(n)) or whatever will make no practical difference, only potentially cause issues. Only optimise what matters.
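A rough sketch of that workflow with the `Profile` standard library follows. The workload functions are hypothetical stand-ins; the point is only that the sample counts show where time is actually spent before anything gets rewritten:

```julia
using Profile  # stdlib sampling profiler

# Hypothetical workload: one expensive step and one cheap step.
function expensive_step(n)
    s = 0.0
    for i in 1:n
        s += sin(i) * cos(i)
    end
    return s
end

cheap_step(n) = sum(1:n)  # summing a range is constant time

workload(n) = expensive_step(n) + cheap_step(n)

workload(10)       # run once first so compilation is not profiled
Profile.clear()
@profile workload(10_000_000)

# Print the call tree with sample counts; in this sketch nearly all samples
# are expected to fall under expensive_step.
Profile.print(maxdepth = 12)
```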

Elrod · June 10, 2022, 11:48pm

A good place to start is to look for vectorisation. If you believe the code should vectorise, scan the assembly for the presence of vector instructions, which can be identified in x86 assembly by usually beginning with “vp”.

You only do integer operations?

Also, Cthulhu.jl is excellent. It is much better than @code_warntype, @code_typed, @code_native, and @code_llvm. You are grossly mischaracterizing it IMO.
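For context on the question above: on x86 with AVX, the "vp" prefix mostly marks packed integer instructions, while floating-point vector arithmetic typically shows up as instructions such as `vaddpd` or `vmulps`. A hedged sketch of scanning the generated assembly for both kinds (the function name is made up, and the result depends on the CPU the code is compiled for):

```julia
using InteractiveUtils  # stdlib; provides code_native

# Hypothetical floating-point reduction that the compiler is allowed to vectorise.
function sum_floats(xs::Vector{Float64})
    s = 0.0
    @inbounds @simd for i in eachindex(xs)
        s += xs[i]
    end
    return s
end

# Capture the generated assembly as a string instead of printing it.
asm = sprint(io -> code_native(io, sum_floats, (Vector{Float64},)))

# Assumption: x86 with AVX. Other architectures use different mnemonics.
println("packed double add (vaddpd) present: ", occursin("vaddpd", asm))
println("instructions starting with vp present: ", occursin(r"\bvp", asm))
```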

haberdashPI · June 11, 2022, 12:53am

Though I believe I agree with the spirit of your point, I think this goes a bit far.

Ways I agree:

1.) I’m open to the possibility that for a newcomer to Julia, the emphasis should be more on type stability.

2.) There are absolutely cases where it makes more sense to start with type stability, but I think that should be judged on a case-by-case basis.

Rather than having a fixed order, I’d argue one should be looking for the lowest hanging fruit and track the effects of said changes empirically. If it would take 5 minutes to switch algorithms, and 60 minutes to improve type stability, why not try the 5-minute fix and see if it helps enough that you don’t have to bother optimizing further?

It doesn’t matter if some part of your code is O(n^3), if your function call spends 0.01% runtime there.

Agreed, but sometimes formal profiling of the code would take longer than just switching algorithms and seeing if it helps or not.
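To illustrate the "just switch and measure" approach, here is a small sketch with a made-up task (duplicate detection), where a quadratic version is swapped for a sort-based one and both are timed with `@elapsed`:

```julia
# Hypothetical task: does a vector contain any duplicate values?

# Quadratic version: compares every pair of elements.
function has_duplicates_quadratic(xs)
    for i in eachindex(xs), j in eachindex(xs)
        i < j && xs[i] == xs[j] && return true
    end
    return false
end

# O(n log n) version: sort a copy, then compare neighbours.
function has_duplicates_sorted(xs)
    ys = sort(xs)
    for i in 2:length(ys)
        ys[i - 1] == ys[i] && return true
    end
    return false
end

xs = collect(1:5_000)  # no duplicates, so neither version can exit early

# Call both once on a small input so compilation is not included in the timing.
has_duplicates_quadratic(xs[1:10])
has_duplicates_sorted(xs[1:10])

t_quadratic = @elapsed has_duplicates_quadratic(xs)
t_sorted = @elapsed has_duplicates_sorted(xs)
println("quadratic: ", t_quadratic, " s, sorted: ", t_sorted, " s")
```

A single `@elapsed` call is noisy, so for anything close a benchmarking package is the safer choice.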

brenhinkeller · June 11, 2022, 2:07am

Great post!

As an anecdote, I will say that understanding type stability (and consequences of multiple dispatch more generally) was for me the key thing I was missing to make all the pieces fall into place when I was new to the language, and the main difference maker between loving the language and leaving in frustration. At this point it’s in the first week of my “intro to computation for Earth sciences” course I think. So happy to see it emphasized here!

I’ll also echo Chris that Cthulhu.jl is actually not nearly as scary as the name might suggest – I only started using it in the past month or so, but I now default to it pretty much every time over @code_warntype / @code_typed / @code_llvm / @code_native

jules · June 11, 2022, 6:47am

When debugging performance issues in Makie I often have the problem that there are many type instabilities but those can neither be removed nor do they have to matter for performance. But they do make using tools like JET or Cthulhu harder because at every dynamic dispatch they give up. I’d actually need a dynamic debugger in conjunction with these tools, but so far the debuggers we have were either much too slow or crashed or were difficult to understand / erratic in the way they jumped around places in the code when stepping.

Tamas_Papp · June 11, 2022, 8:57am

I am not sure about this — with tooling like

I find profiling very convenient.

In any case, I agree both with you and @jakobnissen to some extent: algorithmic improvements are great if you can obtain them, but that’s not always possible and sometimes requires a bit of creativity. And, of course, it is difficult to write a concise general guide about doing this.

OTOH, fixing up type stability problems and profiling is a reasonably mechanical process that is worth learning about.

A small comment about the post: I find the built in memory-allocation profiling impractical in Julia, and always end up resorting to allocation analysis with

carstenbauer · June 11, 2022, 2:36pm

Ditto, but I haven’t tried the new memory profiler yet. Have you?

DNF · June 11, 2022, 4:56pm

What’s the new profiler?

ericphanson · June 11, 2022, 5:41pm

https://docs.julialang.org/en/v1.9-dev/manual/profile/#Allocation-Profiler

It’s in Julia 1.8
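A minimal sketch of using the allocation profiler from the linked manual section, assuming Julia 1.8 or newer. The profiled function is a made-up example that allocates on every iteration:

```julia
using Profile  # stdlib; Profile.Allocs requires Julia 1.8 or newer

# Hypothetical function that allocates a fresh vector on every iteration.
function allocating_work(n)
    total = 0
    for i in 1:n
        v = collect(1:i)  # heap allocation
        total += sum(v)
    end
    return total
end

allocating_work(10)  # compile first so compiler allocations are not recorded
Profile.Allocs.clear()

# sample_rate=1 records every allocation; lower it for long-running code.
Profile.Allocs.@profile sample_rate=1 allocating_work(100)

results = Profile.Allocs.fetch()
println("recorded allocations: ", length(results.allocs))
for a in first(results.allocs, 3)
    println(a.type, "  ", a.size, " bytes")
end
```

Each recorded allocation also carries a `stacktrace` field, which is what visualisation tools use to attribute allocations to source lines.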

brenhinkeller · June 11, 2022, 5:42pm

TIL, awesome! Sounds like a big upgrade over the old version that wrote a zillion files you had to find and read and clean up.

ericphanson · June 11, 2022, 5:46pm

My understanding is it was an awesome contribution by RelationalAI: https://github.com/JuliaLang/julia/pull/42768. They’ve also got other really interesting PRs like https://github.com/JuliaLang/julia/pull/42286.

brenhinkeller · June 11, 2022, 6:05pm

Woah, awesome! Hope that gets merged soon too!

rafael.guerra · June 11, 2022, 6:48pm

May I ask why in the Use multiple threads section of the blog post, the package Folds.jl and the broader JuliaFolds ecosystem are recommended, but not also the popular LoopVectorization.jl package and the JuliaSIMD organization? Thank you, from a newbie.

brenhinkeller · June 11, 2022, 8:05pm

Huh, I hadn’t noticed that at first. I can certainly recommend LoopVectorization.jl from my own experience! I guess at first LoopVectorization was only single-threaded, but @tturbo (which multithreads via Polyester.jl) is IMHO one of the easiest ways to get really performant multithreading on (reorderable) loops.
