Matrix storage

How are matrices stored?

Let me give an example:

I initialize a 10×10 Int32 matrix.

I first fill the 5×5 top-left block.

I then extend the matrix to 10×10, adding the new elements with the usual [ , ] indexing syntax.

Is the matrix allocated as a single array, which would require rearranging the data, or as 10 separate contiguous arrays, which would not?

What I mean is: if it's all one array, then in the first step the 6th element is the 1st element of the 2nd column, while in the second step it would still be in the 1st column.
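
To see what I mean concretely, here is a minimal sketch (using CartesianIndices only to map linear positions to rows and columns):

A5  = zeros(Int32, 5, 5)
A10 = zeros(Int32, 10, 10)

CartesianIndices(A5)[6]    # CartesianIndex(1, 2): in the 5×5, element 6 starts the 2nd column
CartesianIndices(A10)[6]   # CartesianIndex(6, 1): in the 10×10, element 6 is still in the 1st column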

More precisely, each Array is backed by a contiguous piece of memory. And Matrix <: Array.
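
For instance, a quick check (a minimal sketch):

M = zeros(Int32, 10, 10)

Matrix{Int32} <: Array                     # true: Matrix{T} is just Array{T,2}
strides(M)                                 # (1, 10): consecutive rows of a column are adjacent in memory
Int(pointer(M, 2)) - Int(pointer(M, 1))    # 4: linear element 2 sits one Int32 (4 bytes) after element 1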

I don’t think you can extend a matrix like that. You can create a new one though. And there’s resize!, but it doesn’t seem to support non-Vector Arrays. EDIT: reshape may also be relevant in some cases.
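
For example (a minimal sketch of those three points):

v = Int32[1, 2, 3]
resize!(v, 5)            # works: v now has length 5 (the two new slots are uninitialized)

M = zeros(Int32, 2, 2)
# resize!(M, 3, 3)       # MethodError: resize! is only defined for Vectors

r = reshape(1:12, 3, 4)  # presents the same 12 elements as a 3×4 matrix, without copying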

Sorry, I asked about a non-existent issue (it's late here). Reading it again, it's not a real issue.

The elements are already allocated, so there is no issue.

Julia Arrays are column-major; in other words, iterating over contiguous elements means iterating along the leftmost dimension first.
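
For example:

A = [1 2 3;
     4 5 6]    # 2×3 Matrix

vec(A)         # [1, 4, 2, 5, 3, 6]: the columns are laid out back to back in memory
A[2]           # 4: the 2nd element in memory is row 2 of column 1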

Enlil50 initialized a 10×10 to begin with; they really meant filling in the rest. And yes, the only Array you can extend beyond its initialized size is the 1D Vector. This is mainly a pragmatic limitation; ElasticArrays.jl provides a >1D array type that can be extended along its rightmost dimension.
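
Roughly like this (a sketch assuming the ElasticArray constructor and append! usage shown in that package's documentation; check the package itself for the exact interface):

using ElasticArrays

A = ElasticArray{Float64}(undef, 10, 0)   # 10×0 to start; only the last dimension can grow

for _ in 1:5
    append!(A, rand(10))                  # appends one 10-element column each time
end

size(A)                                    # (10, 5)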

Regarding this topic: is it possible to define 3 different arrays placed next to each other in memory (RAM, cache, or wherever)?

For example:
Array (row): ▢ ▢ ▢ ▢…
These elements are consecutive.

Matrices are just arrays.

I want 3 arrays, but I always want to use them together (for example, a sparse matrix in compressed format is 3 arrays; see the sketch at the end of this post).

So:
▢ ▢ ▢…
▢ ▢ ▢…
▢ ▢ ▢…

But not a matrix, as the third array is much smaller than the other two.

I could build a matrix and allocate it as 3 × length(longest array), but that would allocate a lot of unused space.

I could just write 3 arrays.

But since whenever I use one of them I want to use the other two as well, it seems like there should be something to optimize(?)

P.S.: I could build one array whose length is the sum of the three, but I don't think that's really what I'm looking for, as these arrays are really long and corresponding elements of the different arrays would still end up very far away from one another.
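
As a concrete reference for the sparse-matrix example, Julia's SparseMatrixCSC really is three separate vectors of different lengths (a sketch using the standard library):

using SparseArrays

S = sparse([1, 3, 2], [1, 1, 2], [10.0, 20.0, 30.0], 3, 2)

S.colptr   # column pointers, length ncols + 1
S.rowval   # row index of each stored entry (also via rowvals(S))
S.nzval    # the stored values themselves (also via nonzeros(S))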

Not in general, for separate arrays that can have different element types and shapes. Heap allocation anywhere handles separate instances separately; otherwise, in your case, a Vector could never be cheaply extended unless it was the last one in the multi-array block. This also wouldn't optimize anything; in fact, bigger blocks miss the cache more easily.

What does this mean, and how is it distinct from just using 3 arrays, possibly referenced by one container or struct instance?

Elements within the same array are already going to be far away from each other; at best you can "put elements from different arrays together" by making one array with a composite element type holding one element from each array. But you can't do that here, because you already said the 3rd array is much smaller, so it would run out partway through initialization.
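
For illustration, interleaving one element from each of two equally long arrays into a single array with a composite element type would look like this (a hypothetical sketch):

a = [1, 2, 3]
b = [1.0, 2.0, 3.0]

interleaved = collect(zip(a, b))   # Vector{Tuple{Int64, Float64}}
interleaved[2]                     # (2, 2.0): both "second elements" now sit next to each other in memory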

Maybe have one Array and three views into it?

If a struct places the arrays you define in it next to each other as best it can (I've never looked into this in depth), that's OK. But I wonder whether that's the best it can do for really long arrays. I can't even split them.

I don't know whether it's faster to index into the second array, or to compute i+N and index into the first (I need to benchmark them).

Structs don't do that the way you're saying; they just store the arrays' identifying addresses next to each other, which is what "referenced" meant above. The arrays' elements are still allocated separately.

You'll have to explain the optimizations you have in mind that would be helped by putting these 3 arrays together in some way. Say you store the 3 arrays in a tuple as an example of a container: D = (A, B, C); then D[1][indices] is already fast. If they all have the same element type and don't need to change shape, then going by njasko's idea, D = zeros(element_type, 12); A = view(D, 1:5); B = view(D, 6:10); C = view(D, 11:12), and A[indices] is already fast. You can use reshape if you want those views to behave like anything other than vectors.
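
Spelled out as runnable code, the one-backing-array idea looks roughly like this (the sizes here are arbitrary):

element_type = Float64
D = zeros(element_type, 12)        # one contiguous backing array

A = view(D, 1:5)
B = view(D, 6:10)
C = view(D, 11:12)

A[3] = 1.0
D[3] == 1.0                        # true: A, B and C alias parts of D, no copies

M = reshape(view(D, 1:10), 5, 2)   # reuse part of D as a 5×2 matrix, still without copying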

Just to be clear (not sure if you know already): Julia already comes with functionality such as view, @view and @views. See Views (SubArrays and other view types), and search for "view" or "subarray" here: Single- and multi-dimensional Arrays · The Julia Language

Basically, if you have an array a and want to index it with, say, 5:7: doing a[5:7] creates a copy, while @view a[5:7] creates a view into a, which may be of a different type (i.e., not an Array).
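
For example (a minimal sketch):

a = collect(1:10)

b = a[5:7]         # copy: a 3-element Vector with its own memory
c = @view a[5:7]   # view: a SubArray sharing a's memory

c[1] = 99
a[5]               # 99: the view writes through to a
b[1]               # 5: the copy is unaffected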

I will try it in the coming days, as soon as possible. Thanks for the ideas. I need to come up with something concrete.
