Array views becoming dominant source of memory allocation


I would be really interested in understanding how you implemented that getindex!


A very simple approach would be to split the SubArray type into a bitstype part (call it semi_view) and a reference part (called parent).

Then you change the signature of all functions including views to take one argument more: A parent and a semi_view. Because semi_view is bitstype there are no allocations.

This would be kinda similar to the token/semi-token API used for the trees in datastructures.jl.


Thanks for all the replies.

In my case, because the slices are often small, the allocations really slow down the performance. I might try something like the Unsafe VectorView type.

I had a look for open issues and I didn’t find this topic. Did I miss it or should I file something?


Don’t do it!!!

If you need the performance now, just pass the two parts separately.

There are already enough issues about allocation optimization.


Thanks yuyichao. Are you referring to foobar_lv’s approach?


Roughly split the SubArray type into a bitstype part (call it semi_view) and a reference part (called parent). yes. Definately not the other parts.


I completely agree with yoyichao. Using pointers to save subarray allocations is an amusing and educational hack, but a terrible idea in practice. I just did the pointer thing to get a non-allocating view in 10 lines of code, and learn more about julia on the way.

Whether split array views are worth the effort is up to you (the resulting API will suck and you need to spend the time to write a split_view module).

Depending on your specific usecase, it might be easier to just write off array views as unusable in performance-critical sections, at least until the language improves. You can always just write the loops verbatim; this is what I am doing at the moment.

The resulting code doesn’t look as nice and is longer, but you will quickly find additional optimization opportunities once array accesses are source-code inlined (move a multiply for index calculations one loop-level up, improve memory access patterns, sprinkle @simd and @fastmath, the resulting code_native becomes more recognizable and it becomes easier to spot performance problems in it).

Edit: If you want a one-word fix (literally!), you could try making views mutable and then reuse them. This prevents compiler-optimizations based on the immutability assumption, but I am not sure whether these are actually important for array views. This is something you would need to benchmark. I think that this should not change the data layout, i.e. cache-friendlyness and level-of-indirection should not be affected (but this needs to be tested).


Simply defining your own mutable array view so you can increment or otherwise change the offset instead of defining new views, as in my example above, worked well and struck me as reasonable.


Is this going to be fixed by 1.0, or is it more likely to be 1.x?


I certainly hope so!


I had a quick try with this my original post on Julia 0.7.0-DEV nightly windows 64 bit. It seems the performance limitation from my original post still remains.

Is this to be expected? If not, are there open issues that would address this problem?


You can try

UnsafeArrays.jl provides allocation-free arrays views, with some limitations. Most importantly, as @yuyichao already noted, pointer-based array access has to be used with great care and deliberation. In my experience, even frequent allocation of views doesn’t usually have a serious impact on performance in single-threaded use. With heavy multi-threading, though, view allocation time can currently become a limiting factor.