Creating the same matrix in Julia as in Matlab takes longer

Perhaps my question was not clear: suppose I

garble!(dont_change_this_array)

then why should it matter if the argument is a SubArray or any other <: AbstractArray? Why is it more of a problem for @views?
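
To make the two cases concrete, here is a minimal sketch (garble! is just a made-up mutating function):

```julia
# Hypothetical mutating function, purely for illustration
garble!(x::AbstractArray) = (x .= 0; x)

A = collect(1:5)

garble!(A[1:3])        # slice is a copy: A is untouched
@show A                # [1, 2, 3, 4, 5]

@views garble!(A[1:3]) # slice is a SubArray: writes go through to A
@show A                # [0, 0, 0, 4, 5]
```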

No. Depending on how you want to implement it, it can either be as expensive as @view, pretty useless for unsafe operations, and make literally everything else much slower (if you implement it at a higher level), or it can be very heavyweight on creation, not much cheaper than copying except for really large arrays (if you implement it at a lower level).

I didn’t say that. AFAICT, it is the most lightweight option (for creation at least).

Or in other words, as long as the user knows what they are doing, COW is never the best solution no matter how you implement it. It is only useful if we expect the opposite to be the case, and even then it will be slower than copying most of the time.

Here we generally prefer easier-to-understand/predict semantics and letting the user do minimal work to gain maximum performance, rather than trying to make sure everyone gets non-worst-case performance while no one gets maximum performance.


I’m afraid you have lost me completely… I don’t think I am saying it’s more of a problem for views, or am I?

But what’s the relevance of this to the choice between copies, views, and copy-on-write?

I responded to the following:

If you perform an operation that overwrites data, then you can accidentally overwrite data. This is not specific to @view or SubArrays: it is an orthogonal problem, and so are its solutions.

OK, I assumed it would be as cheap or expensive to create as @views; I thought it would only differ when writing occurred. It would have to be able to detect when the original array is modified, I suppose, which complicates things.

So performance, really, kills it.

I didn’t say it was, nor do I see the relevance. I wasn’t suggesting it as a way to solve the problem of overwriting data in general. My point was that with copy-on-write you could get the advantages of slices-as-views without the pitfalls and the massive breakage.

Now, it seems that may not be practically achievable, but I like the idea in principle.

I think I’ve come up with a better way of expressing my view of the perceived advantage of COW:

You would keep the current semantics: slices are copies. But COW would be an under-the-hood optimization that avoids allocations where possible.

If this “optimization” doesn’t lead to better performance, then there’s no point, of course.

I was fascinated by the idea of COW until I started to think about how to actually implement it. Basically, COW means that every write operation has to be preceded by a few instructions that check whether the copy has already been created, i.e. every call to e.g. setindex! becomes slower. That might be OK in a single-threaded environment, but the array may be used from several threads, so COW will also invalidate CPU caching and other optimizations. So most likely COW would introduce an average performance loss for everyone while trying to optimize the worst case for some people.
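
A rough sketch of what a high-level COW wrapper could look like (COWArray is a hypothetical type, shown only to make the per-write branch visible):

```julia
# Minimal copy-on-write wrapper: every setindex! pays for a branch that checks
# whether the data is still shared with the original and copies it on first write.
mutable struct COWArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    shared::Bool          # true while `data` still aliases the original array
end

COWArray(A::Array) = COWArray(A, true)

Base.size(A::COWArray) = size(A.data)
Base.getindex(A::COWArray, i::Int...) = A.data[i...]

function Base.setindex!(A::COWArray, v, i::Int...)
    if A.shared            # the extra check on every single write
        A.data = copy(A.data)
        A.shared = false
    end
    A.data[i...] = v
end

# Not thread-safe: concurrent writers would race on `shared`, which is where the
# caching/synchronization cost mentioned above comes in.
original = [1, 2, 3, 4]
B = COWArray(original)
B[1] = 99                  # triggers the copy; `original` stays [1, 2, 3, 4]
```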

Another option might be to statically analyze functions during compilation and automatically rewrite copies into views when the compiler can prove that no write operations are performed. This is closely related to the topic of pure functions: highly desirable, but again a very non-trivial feature to implement.
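
Done by hand, that rewrite would amount to something like the following (the compiler would have to prove that the callee never writes to the slice):

```julia
A = rand(1000, 1000)

col_sum_copy(A) = sum(A[:, 1])        # allocates a temporary column
col_sum_view(A) = sum(@view A[:, 1])  # no allocation, same result

col_sum_copy(A) ≈ col_sum_view(A)     # true
```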
