I have to deal with a vector (really a list) of data points.
Often, this can be written as Matrix{Float64}
of size (D,N)
, where each data-point lives in D-dimensional space and I have N of these guys (D much smaller than N). Now, the code I’m writing does not really need to look into the data points.
Nevertheless, code that operates on data-points needs to always loop / broadcast / view / etc over all the D dimensions, which makes code hard to read and hard to write (do views get stack-allocated? Are some closures causing stuff to be boxed? do I write @simd for d=1:D
everywhere in hand-inline everything? Do I play nasty pointer-games?)
Likewise, not all collections of data-points want to be a dense matrices; who knows what kind of data-points will be useful.
So, really, my code should operate on Vector{point_T} where {point_T}
. Then, all this pain goes away: Each data-point is simply an object, stored in a vector, and code that does not need to look inside the objects becomes very clear.
And if point_T = SVector{D, Float64}
then this is really the same as the matrix (it is literally the same collection of bits). Just that I get specialization on D
, which is awesome! And if future-me or future users (post packaging) want to play with more fancy objects, well, the code is now generic.
Unfortunately, there is a problem (there always is a problem): If D
is large, then SVector{D, Float64}
emits the most disgusting code in the world, because it has no loops. If D=1000
, well, there goes my instruction cache.
What do you people do in these cases? And no, Vector{Vector{Float64}}
is not the answer: Firstly, I want immutable stack-allocated semantics (but could work around this), and secondly, I spent effort to make the memory access patterns cache-friendly (spatial locality). Vomiting my data points all over the heap does not sound attractive.
One extraordinarily dirty variant is to have a way of viewing a matrix as fake_vector_of_vector
, such that V[k]
returns a pointer into the matrix, together with length information in a nice immutable isbits
that is an AbstractVector
. Basically move the creation of unsafe pointer-based views out of my generic code, and into my fake_vector_of_vector
type. Is this the recommended way? Any better ideas?
Obviously I would then have to manage gc visibility by hand, and hope that type based aliasing analysis does not kill my code in the future (I think Keno wanted to make it too smart for such dirty tricks? As far as I understood, this lead to the bemoaned reinterpret
-rewrite for arrays and accompanying performance regressions, with hoped speed-up in non-dirty code).
Or is there a plan to fix large SVector
s to emit the code that should be emitted, i.e. stop to fully unroll loops?
edit: To clarify: For small D
, Vector{SVector}
is perfect. For very large D
, Vector{Vector}
is perfect. I don’t know how to write my code for intermediate D
.