Why does breaking a function down into multiple functions speed up runtime?

Hi,

1/ Can someone explain why, in Julia, when I break down various parts of a main function into multiple functions, that main function becomes faster?

2/ Also, the number of allocations is (sometimes massively) reduced. Why is that? There are still allocations being performed inside these small functions, which are used to build the bigger function…

In the Julia documentation I only found the following: “Any code that is performance critical should be inside a function. Code inside functions tends to run much faster than top level code, due to how Julia’s compiler works.”

3/ Can someone explain how the Julia compiler works when it comes to optimising functions?
Maybe that could be added to the Julia documentation (in case it is not already there and I missed it).

Thank you


This mainly happens when your code isn’t type stable. Splitting your code up can help because, if there is a type instability early on and everything is in one function, anything that uses that variable will have an unknown type. If you instead pass the variable to a separate function, there is a single dynamic dispatch at the call site, and from that point on the compiler knows the types again (this is often called a “function barrier”). Using @code_warntype will show you where the type instability is.
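
Here is a minimal sketch of that “function barrier” pattern (the function names and the rand(Bool) trick to force a runtime-dependent element type are just illustrative, not from the original post):

```julia
# Type-unstable: the element type of `v` is only decided at run time,
# so the compiler cannot infer it, and the loop over `v` is slow.
function build_and_fill(n)
    v = Vector{rand(Bool) ? Int64 : Float64}(undef, n)  # element type chosen at run time
    for i in eachindex(v)
        v[i] = 2 * i     # the compiler doesn't know the element type here
    end
    return v
end

# Function barrier: the hot loop lives in its own function. One dynamic
# dispatch happens at the call site; inside `fill_kernel!` the type of `v`
# is concrete, so the loop compiles to tight, specialized code.
function fill_kernel!(v)
    for i in eachindex(v)
        v[i] = 2 * i
    end
    return v
end

function build_and_fill_barrier(n)
    v = Vector{rand(Bool) ? Int64 : Float64}(undef, n)
    return fill_kernel!(v)
end
```

Running `@code_warntype build_and_fill(10)` flags `v` as not concretely inferred, whereas `@code_warntype fill_kernel!(zeros(10))` is fully typed.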


This is the corresponding section: Performance Tips · The Julia Language


Thank you both for the link and explanations. What about the allocations? Why are they reduced? Does Julia only measure the allocations in the main function?

For the same reason: you’re reducing type instabilities (which can introduce extra memory allocations).

No.


Allocations were reduced for the same reason: type-unstable code needs to box and unbox values, which allocates.
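
One quick way to see the boxing cost (this uses a non-constant global as the source of type instability; the names and sizes are made up for illustration, and exact allocation counts depend on the Julia version):

```julia
# A non-constant global is a classic source of type instability: its type
# can change at any time, so values computed from it get boxed.
coeff = 2.0

function scale_sum_unstable(xs)
    s = 0.0
    for x in xs
        s += coeff * x     # `coeff` has unknown type -> boxed intermediates -> allocations
    end
    return s
end

# Passing the value as an argument makes its type concrete inside the
# function, so the loop runs without boxing.
function scale_sum_stable(xs, c)
    s = 0.0
    for x in xs
        s += c * x
    end
    return s
end

xs = rand(10^5)
@time scale_sum_unstable(xs)        # reports many allocations
@time scale_sum_stable(xs, coeff)   # essentially no allocations after compilation
```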


In other words: if you have a vector of things whose type you don’t know, the vector will be a vector of references to objects that have to be allocated independently in memory. If all the objects are of the same concrete type, then the allocation will be one big chunk of memory with the objects stored packed together in that chunk.

In the first case, many independent allocations may occur, one for each object in the vector. In the second, only the chunk has to be allocated, all at once. But for that, the exact size (memory-wise) of the objects must be known, which means knowing their types, and those types must be concrete.
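
A small sketch of the two layouts (the numbers are approximate and depend on the machine and Julia version):

```julia
# Concrete element type: one contiguous chunk holding 10^5 Float64 values.
xs_concrete = rand(10^5)              # Vector{Float64}

# Abstract element type: a chunk of pointers, each pointing to a separately
# allocated (boxed) Float64 object.
xs_any = Any[x for x in xs_concrete]  # Vector{Any}

Base.summarysize(xs_concrete)   # roughly 10^5 * 8 bytes plus a small header
Base.summarysize(xs_any)        # noticeably larger: the pointers plus every boxed value

# The indirection also shows up when iterating:
function sumloop(v)
    s = 0.0
    for x in v
        s += x
    end
    return s
end

@time sumloop(xs_concrete)   # no allocations inside the loop
@time sumloop(xs_any)        # allocates, because the values are boxed
```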


Thank you