Best way to store subsets of a sample

I think the problem is that, in principle, you want to operate on both the rows (subsamples) and the columns (computations).

So the best data format would be one that gives easy access to both. If you cannot use other packages, how about a simple matrix for all your data?
Here, the first column stores the id. This allows you to easily carry the id with you throughout the computations.

X = [5 13; 16 12; 16 26; 17 15; 18 14; 23 6; 25 10; 27 22; 37 14; 42 25; 5 17];
Y = [12; 14; 25; 26; 8; 9; 27; 30; 31; 26; 12];
n = size(X, 1)   # number of observations
data = [1:n X Y] # columns: id, the two X columns, Y

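# Split into two subsamples: odd-numbered and even-numbered rows
A = data[1:2:end,:]
B = data[2:2:end,:]

# Example computation: double columns 2 and 3 (the X values), keeping the id in column 1
result_A = [A[:,1] A[:,2:3] .* 2]
result_B = [B[:,1] B[:,2:3] .* 2]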

result = vcat(result_A,result_B)
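# Sort whole rows by id; sort!(result, dims=1) would sort each column independently and break the id pairing
result = sortslices(result, dims=1, by = row -> row[1])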
result
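
If your subsets are not a simple odd/even split, the same idea can be wrapped in a small helper. This is only a minimal sketch: `apply_to_subset` is a hypothetical name I made up (it is not from Base or any package), and it assumes the id sits in column 1 and every other column is a value column.

```julia
# Hypothetical helper: apply `f` to the value columns of the selected rows,
# keeping the id column (column 1) attached to the result.
function apply_to_subset(f, data::AbstractMatrix, rows)
    sub = data[rows, :]
    return hcat(sub[:, 1], f(sub[:, 2:end]))
end

X = [5 13; 16 12; 16 26; 17 15; 18 14; 23 6; 25 10; 27 22; 37 14; 42 25; 5 17]
Y = [12, 14, 25, 26, 8, 9, 27, 30, 31, 26, 12]
n = size(X, 1)
data = hcat(1:n, X, Y)

# Any collection of row indices works here; odd/even is just the example from above.
subsets = [1:2:n, 2:2:n]
parts = [apply_to_subset(v -> v .* 2, data, rows) for rows in subsets]

# Stack the pieces and restore the original order by sorting rows on the id column.
combined = reduce(vcat, parts)
combined = combined[sortperm(combined[:, 1]), :]
```

Because the id travels with each row, it does not matter in which order the subsets are processed or stacked; sorting on column 1 at the end always recovers the original row order.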
