How to best iterate over a 2-dimensional collection, e.g. an (n,k) matrix?

I have a dataset with n rows and k columns I want to iterate over. What is the recommended way?

Currently I would first create an n x k Matrix and then use two nested for-loops. I heard that column-wise iteration is best:

for col in 1:k
    for row in 1:n
        # ... access the element at [row, col] ...
    end
end

I have also heard that functions like eachindex() and objects like CartesianIndex exist, and I'd love to understand how to use them better. eachindex() loses the row and column information, and I was not able to use CartesianIndex properly.

(1) Which collection should I choose? Is Matrix okay?
(2) Are there simpler ways to iterate over a collection while keeping (a) the row and column indices, (b) the matrix (linear) index, (c) both?
(3) What are Cartesian indices, and how/when should I use them?

These are the most idiomatic ways, I think:

julia> a = [ 1 2 3; 4 5 6 ]
2×3 Matrix{Int64}:
 1  2  3
 4  5  6

julia> for i in axes(a,2), j in axes(a,1)
           @show j, i, a[j,i]
       end
(j, i, a[j, i]) = (1, 1, 1)
(j, i, a[j, i]) = (2, 1, 4)
(j, i, a[j, i]) = (1, 2, 2)
(j, i, a[j, i]) = (2, 2, 5)
(j, i, a[j, i]) = (1, 3, 3)
(j, i, a[j, i]) = (2, 3, 6)

julia> for c in CartesianIndices(a)
           @show c[1], c[2], a[c]
       end
(c[1], c[2], a[c]) = (1, 1, 1)
(c[1], c[2], a[c]) = (2, 1, 4)
(c[1], c[2], a[c]) = (1, 2, 2)
(c[1], c[2], a[c]) = (2, 2, 5)
(c[1], c[2], a[c]) = (1, 3, 3)
(c[1], c[2], a[c]) = (2, 3, 6)
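
For question (2c), keeping both the row/column pair and the linear index, one possible approach is to combine CartesianIndices with LinearIndices. The following is only a minimal sketch; the names `lin` and `seen` are made up for illustration:

```julia
a = [1 2 3; 4 5 6]

# LinearIndices(a) translates a Cartesian index into the matching linear index.
lin = LinearIndices(a)

# Collect (row, col, linear index, value) for every element.
seen = Tuple{Int,Int,Int,Int}[]
for c in CartesianIndices(a)
    row, col = Tuple(c)   # unpack the CartesianIndex into plain integers
    k = lin[c]            # linear (column-major) position of the same element
    push!(seen, (row, col, k, a[k]))
end

# pairs with IndexCartesian() yields (CartesianIndex, value) pairs directly.
for (c, v) in pairs(IndexCartesian(), a)
    @assert a[c] == v
end
```

A CartesianIndex is essentially a wrapper around a tuple of integers, one per dimension, so `Tuple(c)` or `c[1]`, `c[2]` gives back the individual coordinates, and `a[c]` indexes with all of them at once.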

There is nothing wrong with your double loop; just make sure that the indices are in bounds (something that these alternatives guarantee).
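
To illustrate the in-bounds point: when the loop ranges come from the array itself (via axes or eachindex), they cannot run past its size, whereas hand-written ranges like 1:n and 1:k are only correct if n and k really match the array. A small sketch, with hypothetical function names `sum_by_axes` and `sum_by_eachindex`:

```julia
# Sum all entries using ranges taken from the array itself.
function sum_by_axes(a::AbstractMatrix)
    total = zero(eltype(a))
    for col in axes(a, 2), row in axes(a, 1)   # row is the inner (fastest) loop
        total += a[row, col]
    end
    return total
end

# eachindex returns an index iterator suited to the array type,
# useful when the row/column position itself is not needed.
function sum_by_eachindex(a::AbstractArray)
    total = zero(eltype(a))
    for i in eachindex(a)
        total += a[i]
    end
    return total
end

a = [1 2 3; 4 5 6]
s1 = sum_by_axes(a)
s2 = sum_by_eachindex(a)
s3 = sum_by_eachindex(view(a, :, 2:3))   # also works on views
```

Both functions accept any matrix-like input without assuming that its indices start at 1.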

PS: a Matrix is definitely OK. Alternatively, you may want to look at the DataFrames package and its companions if you want something more sophisticated in terms of data manipulation.
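
If the work is naturally done one column at a time (for example, one variable per column of the dataset), Base also provides eachcol and eachrow (available since Julia 1.1, as far as I know). A minimal sketch with a made-up 3×2 dataset; `data` and `col_means` are illustrative names only:

```julia
data = [1.0 10.0; 2.0 20.0; 3.0 30.0]   # hypothetical dataset: 3 rows, 2 columns

col_means = Float64[]
for (j, col) in enumerate(eachcol(data))
    # col is a view of column j, so no copy of the data is made here
    m = sum(col) / length(col)
    println("column ", j, " mean = ", m)
    push!(col_means, m)
end
```

Note that enumerate counts from 1, which matches the column index for an ordinary Matrix but not necessarily for arrays with custom indices.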


Other possible options can be found here (with bonus benchmarks, although these are probably dated):

See also: OrdinalIndexing.jl