Fastest way to turn table into Array?

Hi all.

One of the main parts of my daily workflow is turning large tables into matrices and/or n-dimensional arrays.

For example:

For an X Y Z Property table with 1 million+ lines, I do:


Zi=[410 471 660 760 821 921 1021 1121 1221 1321 1421 1521 1621 1721 1821 1921 2021 2121 2221 2321 2421 2521 2621 2740 2771 2889];

using DelimitedFiles # provides readdlm on Julia 1.x

dRho=readdlm("ALL_dRho",Float64,comments=true,comment_char='#'); #input file


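# Long, Lat and dRhoM are defined earlier (not shown in this post).
for a=1:size(Long,1)
    for b=1:size(Lat,1)
        for c=1:size(Zi,2)
            println("$a $b $c")
            # find the table row whose coordinates match this grid cell
            indx=findall( (dRho[:,2].==Lat[b]) .& (dRho[:,1].==Long[a]) .& (dRho[:,3].==Zi[c]) );
            @inbounds dRhoM[a,b,c]=dRho[indx[1],4];
        end
    end
end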

Is there any way to make this allocation faster? It takes hours as it is!

If anyone wants to try, the file is here:

GitHub - marianoarnaiz/JULIA

Have you read the performance tips section of the manual? There are 2 things you could do that will make this much faster.

Luckily, the data is sorted the right way for this to work…

dRhoM = reshape(dRho[:,4], length(Long), length(Lat), length(Zi));
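Since the reshape only gives the right answer when the rows are ordered with the first column varying fastest, then the second, then the third, it may be worth checking that before relying on it. Here is a minimal sketch of such a check. The helper name `reshape_is_safe`, the function `val`, and the small made-up table are my own assumptions for illustration, not part of the original post or data file.

```julia
# Hypothetical helper: returns true when the rows of `tbl` are ordered so that
# column 1 varies fastest, then column 2, then column 3 (what reshape assumes).
function reshape_is_safe(tbl::AbstractMatrix, Long, Lat, Zi)
    size(tbl, 1) == length(Long) * length(Lat) * length(Zi) || return false
    r = 0
    # In a multi-range `for`, the last range (x) is the innermost loop.
    for z in Zi, y in Lat, x in Long
        r += 1
        (tbl[r, 1] == x && tbl[r, 2] == y && tbl[r, 3] == z) || return false
    end
    return true
end

# Made-up stand-in for the real table (columns: Long, Lat, Z, Property).
Long = [10.0, 20.0]
Lat  = [1.0, 2.0, 3.0]
Zi   = [410.0, 471.0]
val(x, y, z) = 100x + 10y + z   # arbitrary property value for the demo
tbl = reduce(vcat, [[x y z val(x, y, z)] for z in Zi for y in Lat for x in Long])

# Only reshape when the ordering check passes.
if reshape_is_safe(tbl, Long, Lat, Zi)
    M = reshape(tbl[:, 4], length(Long), length(Lat), length(Zi))
end
```

If the check fails, sorting the rows first or filling the array by lookup would be the safer route.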

Other than that, the print statement is a major source of the slowdown, since it runs once per output cell. The loops are in the wrong order too: Julia lays out arrays in column-major order, so the first index (a) should vary in the innermost loop.
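If the rows were not sorted, one alternative to calling findall for every cell would be a single pass over the table, looking up each row's position with dictionaries. This is only a sketch under assumptions: the function name `table_to_array`, the helper `prop`, and the small table are made up for illustration, and it relies on exact `==` matching of coordinates just like the original loop does.

```julia
# Hypothetical helper: fill a 3-D array from a 4-column table in one pass.
function table_to_array(tbl::AbstractMatrix, Long, Lat, Zi)
    # Map each coordinate value to its index along the corresponding axis.
    ia = Dict(v => i for (i, v) in enumerate(Long))
    ib = Dict(v => i for (i, v) in enumerate(Lat))
    ic = Dict(v => i for (i, v) in enumerate(Zi))
    # NaN marks cells that never receive a value from the table.
    out = fill(NaN, length(Long), length(Lat), length(Zi))
    # One dictionary lookup per row instead of one full-table scan per cell.
    for r in axes(tbl, 1)
        out[ia[tbl[r, 1]], ib[tbl[r, 2]], ic[tbl[r, 3]]] = tbl[r, 4]
    end
    return out
end

# Made-up stand-in for the real table (columns: Long, Lat, Z, Property).
Long = [10.0, 20.0]
Lat  = [1.0, 2.0, 3.0]
Zi   = [410.0, 471.0]
prop(x, y, z) = 100x + 10y + z   # arbitrary property value for the demo
tbl = reduce(vcat, [[x y z prop(x, y, z)] for z in Zi for y in Lat for x in Long])
tbl = tbl[end:-1:1, :]           # reverse the rows: sorting is not required

M = table_to_array(tbl, Long, Lat, Zi)
```

Wrapping the work in a function also keeps it out of global scope, which is one of the things the performance tips recommend.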

It looks like it does not work:

dRhoM = reshape(dRho[:,4], 1:length(Long), eachindex(Lat), eachindex(Zi));
ERROR: MethodError: no method matching reshape(::Vector{Float64}, ::Tuple{UnitRange{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}})

Ah, you are right. I think I was mentally in the wrong thread :smile:. I have edited the statement.


It looks like it does not work

no trouble here…

julia> dRhoM = reshape(dRho[:,4], 1:length(Long), eachindex(Lat), eachindex(Zi));


Then I realized that dRhoM is an OffsetArray, though I never imported OffsetArrays myself. A package I loaded that depends on it must have extended Base.reshape to accept ranges. That’s logical but a bit nasty.
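For anyone wanting to see the difference without the data file, here is a small sketch. It assumes a session where nothing has extended reshape for ranges; the vector `v` is made up. Passing integer sizes goes through Base and gives back an ordinary Array, while passing a UnitRange is only accepted if some loaded package added a method for it, so the code records what happened rather than assuming it.

```julia
v = collect(1.0:6.0)   # made-up data standing in for dRho[:,4]

# Integer sizes: handled by Base, and the result is a plain Array.
A = reshape(v, 2, 3)

# A UnitRange as a dimension: record whether the call was accepted or not.
outcome = try
    reshape(v, 1:2, Base.OneTo(3))
    :accepted          # reached only if a loaded package added such a method
catch err
    err isa MethodError ? :method_error : rethrow()
end

println("plain Array from integer sizes: ", A isa Array)
println("reshape with a UnitRange: ", outcome)
```

Using `length(...)` for every dimension, as in the edited statement, avoids depending on which packages happen to be loaded.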

Ouch, it looks like OffsetArrays is indeed a dangerous beast.