Hi all.
One of the main parts of my daily workflow involves turning large tables into matrices and/or n-dimensional arrays.
For example:
For an X Y Z Property table with 1 million+ lines, I do:
Lat=-90:1:90
Long=-180:1:180
Zi=[410 471 660 760 821 921 1021 1121 1221 1321 1421 1521 1621 1721 1821 1921 2021 2121 2221 2321 2421 2521 2621 2740 2771 2889];
dRho=readdlm("ALL_dRho",Float64,comments=true,comment_char='#'); #input file
dRhoM=zeros(size(Long,1),size(Lat,1),size(Zi,2));
for a=1:size(Long,1)
    for b=1:size(Lat,1)
        for c=1:size(Zi,2)
            println("$a $b $c")
            indx=findall( (dRho[:,2].==Lat[b]) .& (dRho[:,1].==Long[a]) .& (dRho[:,3].==Zi[c]) );
            @inbounds dRhoM[a,b,c]=dRho[indx[1],4];
        end
    end
end
Is there any way to make this allocation faster? It takes hours as it is!
If anyone wants to try, the file ALL_dRho.zip is here:
GitHub - marianoarnaiz/JULIA
Have you read the performance tips section of the manual? There are 2 things you could do that will make this much faster.
Luckily, the data is sorted the right way for this to work…
dRhoM = reshape(dRho[:,4], length(Long), length(Lat), length(Zi));
Other than that, the print statement is a major source of slowdown. The loops are in the wrong order too: remember, Julia lays out arrays in column-major order, so the first index should vary in the innermost loop.
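If the table were not sorted so conveniently, a single pass over the rows would still avoid the findall per cell. Here is a minimal sketch on a small synthetic table rather than ALL_dRho, assuming every coordinate lies exactly on the grid. The names fill_grid and zindex are made up for the illustration.

```julia
# Small synthetic grids so the sketch runs instantly (stand-ins for the real ones).
Long = -180:90:180
Lat  = -90:45:90
Zi   = [410, 471, 660]

# Synthetic table with columns Long, Lat, Z, value.
# Long varies fastest, then Lat, then Z (the row order assumed for the reshape).
rows = [(Float64(lo), Float64(la), Float64(z), rand())
        for lo in Long, la in Lat, z in Zi]
dRho = [r[k] for r in vec(rows), k in 1:4]

# Depths are irregular, so map each depth to its index once.
zindex = Dict(Float64(z) => c for (c, z) in enumerate(Zi))

# One pass over the table: compute each row's grid indices directly
# instead of searching the whole table for every cell.
function fill_grid(dRho, Long, Lat, zindex)
    A = fill(NaN, length(Long), length(Lat), length(zindex))
    for r in axes(dRho, 1)
        a = round(Int, (dRho[r, 1] - first(Long)) / step(Long)) + 1
        b = round(Int, (dRho[r, 2] - first(Lat)) / step(Lat)) + 1
        c = zindex[dRho[r, 3]]
        A[a, b, c] = dRho[r, 4]
    end
    return A
end

dRhoM = fill_grid(dRho, Long, Lat, zindex)

# With rows sorted as above, this agrees with the plain reshape.
dRhoM == reshape(dRho[:, 4], length(Long), length(Lat), length(Zi))
```

Cells that never appear in the table stay NaN, which makes gaps in the input easy to spot.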
It looks like it doesn't:
dRhoM = reshape(dRho[:,4], 1:length(Long), eachindex(Lat), eachindex(Zi));
ERROR: MethodError: no method matching reshape(::Vector{Float64}, ::Tuple{UnitRange{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}})
Ah, you are right. I think I was mentally in the wrong thread. I have edited the statement.
Yet I had no trouble here with the original statement…
julia> dRhoM = reshape(dRho[:,4], 1:length(Long), eachindex(Lat), eachindex(Zi));
julia>
Then I realized that dRhoM is an OffsetArray, though I never imported that package. Some package I loaded that depends on it must have extended Base.reshape. That's logical but a bit nasty.
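A small sketch of how one might guard against this, using a toy vector in place of dRho[:,4]. As far as I can tell, passing plain integer sizes to reshape gives an ordinary 1-based Array, and the axes can be checked explicitly. The variable names are only illustrative.

```julia
v = collect(1.0:24.0)        # toy data standing in for dRho[:,4]
dims = (2, 3, 4)             # plain integer sizes, e.g. length.((Long, Lat, Zi))

A = reshape(v, dims)

# Integer sizes on a Vector give an ordinary Array.
is_plain = A isa Array

# Explicit check that every axis starts at 1.
one_based = all(ax -> first(ax) == 1, axes(A))
```

Running the same two checks on a result built from ranges shows right away whether an offset array slipped in.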
Ouch, looks like OffsetArray is indeed a dangerous beast.