If you’re familiar with the dist class from R, this is very similar.
It represents a matrix, or rather only the lower half of one, below the diagonal. The matrix is assumed to be symmetric with zeros on the diagonal, so there is no point in storing both halves, and each row therefore has a different number of cells.
The first stored row has one cell, the second has two, the third has three, and so on:
> as.dist(matrix(rexp(100, rate=.1), ncol=10))
             1           2           3           4           5           6
2  30.96809401
3   4.29517367 25.35477096
4  11.05042388 47.12602515  9.05635823
5  24.80794737 15.96819158  5.57638651 10.38488874
6   3.73206725 26.75391206  8.32492987  8.59412784  0.35015252
7  33.00415806  0.20160700  2.03088283  0.45323016 13.87089156 22.28731310
8   9.10296273  7.98441481  1.90013604  8.93347893  7.18713469  7.02904816
9   3.74577306  3.19449860  5.11027106  1.63343495 14.50446960  1.38334442
10 24.93290636  2.64181538  1.29009316  0.04882485  3.75045984 19.01684707
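To make the row shape concrete, here is a minimal Python sketch that pulls the cells below the diagonal out of a full square matrix. The function name `lower_triangle_rows` and the sample matrix are made up for illustration and are not part of R or of the structure described here.

```python
def lower_triangle_rows(matrix):
    """Return the cells strictly below the diagonal, one list per row."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    # Row i (0-based) contributes its first i cells, so row 0 contributes
    # nothing and is skipped entirely.
    return [list(matrix[i][:i]) for i in range(1, n)]


# Illustrative symmetric matrix with zeros on the diagonal.
full = [
    [0, 3, 4, 5],
    [3, 0, 6, 7],
    [4, 6, 0, 8],
    [5, 7, 8, 0],
]

rows = lower_triangle_rows(full)
row_lengths = [len(r) for r in rows]
total_cells = sum(row_lengths)
```

For an n x n matrix this keeps n * (n - 1) / 2 cells, which is where the 45 values for a 10 x 10 matrix come from.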
Looking at the structure in R, you can see the data is laid out linearly:
> str(as.dist(matrix(rexp(100, rate=.1), ncol=10)))
Class 'dist' atomic [1:45] 0.441 4.219 7.834 0.657 0.112 ...
..- attr(*, "Size")= int 10
..- attr(*, "call")= language as.dist.default(m = matrix(rexp(100, rate = 0.1), ncol = 10))
..- attr(*, "Diag")= logi FALSE
..- attr(*, "Upper")= logi FALSE
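As a rough illustration of that flat layout, the sketch below computes where a given pair lands in the vector. It follows the formula given in R's documentation for dist, which stores the lower triangle column by column with 1-based indices. The helper names `dist_length` and `dist_offset` are hypothetical.

```python
def dist_length(n):
    """Number of values stored for an n x n distance matrix."""
    return n * (n - 1) // 2


def dist_offset(i, j, n):
    """1-based position of the pair (i, j) in R's flat dist vector.

    Uses the formula from R's dist documentation for i < j:
    n*(i-1) - i*(i-1)/2 + j - i. The pair is reordered first, since
    the matrix is symmetric.
    """
    if i == j:
        raise ValueError("the diagonal is not stored")
    if i > j:
        i, j = j, i
    if i < 1 or j > n:
        raise IndexError("indices must be in 1..n")
    # i * (i - 1) is always even, so integer division is exact here.
    return n * (i - 1) - i * (i - 1) // 2 + j - i
```

With Size 10 this gives 45 stored values, matching the `[1:45]` in the str() output, and every pair of distinct indices maps to exactly one position.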
The array also supports indexing linearly, e.g. m[43], as well as by row/column, e.g. m[4, 5].
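One way both indexing styles could sit on top of the same flat storage is sketched below in Python. This is only an illustration under several assumptions that the text does not confirm: the class name `TriangularMatrix` is hypothetical, indices are 0-based, cells are stored row by row, and the diagonal reads back as 0.

```python
class TriangularMatrix:
    """Hypothetical symmetric matrix with a zero diagonal, stored flat."""

    def __init__(self, size, values):
        values = list(values)
        expected = size * (size - 1) // 2
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        self.size = size
        self._cells = values  # row-major cells below the diagonal

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # Row/column access, e.g. m[4, 5].
            row, col = key
            if not (0 <= row < self.size and 0 <= col < self.size):
                raise IndexError("row/column out of range")
            if row == col:
                return 0.0  # the diagonal is implied, not stored
            if row < col:
                row, col = col, row  # symmetric: read the mirrored cell
            # Rows 1..row-1 hold 1 + 2 + ... + (row - 1) cells before this row.
            return self._cells[row * (row - 1) // 2 + col]
        # Linear access, e.g. m[43].
        return self._cells[key]


m = TriangularMatrix(4, [10.0, 20.0, 21.0, 30.0, 31.0, 32.0])
```

Here `m[4]`, `m[3, 1]` and `m[1, 3]` all read the same stored cell, so the row/column form is just a translation into a linear offset.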