Operating on many arrays of the same size: one loop, many loops, or broadcasting?

In general, writing a single loop will be the fastest and most flexible approach, and it is probably also the easiest to code if you are doing lots of operations like this. (Put the loop in a function; don’t write nontrivial code in global scope!)

(The problem with trying to do array operations like A .= B .+ D followed by C .= A .* D is that each array operation will be a separate pass over the arrays, even though Julia fuses the loops within a single operation like A .= B .+ D. Doing separate loops, either manually or via broadcasting, is suboptimal both because of poor cache utilization and because you repeat the loop-index computations multiple times.)
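To make the comparison concrete, here is a minimal sketch of the two approaches for the example above. The function names `two_passes!` and `one_pass!` are made up for illustration, and the arrays are assumed to have identical shapes.

```julia
# Two broadcast statements: each one is its own pass over the arrays.
function two_passes!(A, C, B, D)
    A .= B .+ D   # pass 1: reads B and D, writes A
    C .= A .* D   # pass 2: reads A and D again, writes C
    return nothing
end

# One hand-written loop: a single pass that computes both results per element.
function one_pass!(A, C, B, D)
    # eachindex with several arrays throws if their axes differ,
    # which is what makes @inbounds safe here.
    @inbounds for i in eachindex(A, C, B, D)
        a = B[i] + D[i]
        A[i] = a
        C[i] = a * D[i]
    end
    return nothing
end

# Small usage example with random data.
B = rand(1000); D = rand(1000)
A1 = similar(B); C1 = similar(B)
A2 = similar(B); C2 = similar(B)
two_passes!(A1, C1, B, D)
one_pass!(A2, C2, B, D)
```

Both functions fill `A` and `C` with the same values; the difference is only in how many times memory is traversed. Whether the single loop is measurably faster depends on array size and hardware, so it is worth timing on your own data.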

Even for multiple arrays, you should try to loop through data sequentially in memory (i.e. column-major order for Julia), because sequential memory access is better for your computer’s cache. (This has nothing to do with compilers.) For the same reason, you may get even better performance if you use a single array of data structures that each contain (a,b,c,d,e) rather than 5 separate arrays, since in general if you compute on (a,b,c,d,e) together you want to store them nearby in memory.
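As a rough sketch of both points, the following uses a hypothetical `Item` struct holding the five values and a hypothetical `total` function that visits a matrix of them in column-major order. Because `Item` is immutable and has only `Float64` fields, Julia stores the elements inline in the array, so the five values of one element sit next to each other in memory.

```julia
# Hypothetical record type bundling the five values for one element.
struct Item
    a::Float64
    b::Float64
    c::Float64
    d::Float64
    e::Float64
end

# Sum a+b+c+d+e over a matrix of records. The first index varies in the
# inner loop, so memory is traversed sequentially (column-major order).
function total(items::Matrix{Item})
    s = 0.0
    for j in axes(items, 2)      # columns: outer loop
        for i in axes(items, 1)  # rows: inner loop, contiguous in memory
            x = items[i, j]
            s += x.a + x.b + x.c + x.d + x.e
        end
    end
    return s
end

# Small usage example: a 3x4 matrix of records.
items = [Item(i, j, 1.0, 2.0, 3.0) for i in 1:3, j in 1:4]
result = total(items)
```

Swapping the two loops would give the same answer but stride through memory one row at a time, which is the access pattern to avoid for large arrays.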
