This works well with small arrays, but if I have a larger array, e.g. 10000x10000, this could take many hours. Is there a way to do this faster?
Thanks in advance, Bjoern
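(For context, a hypothetical sketch of the kind of per-element counting that scales this badly; the code actually posted earlier in the thread is not quoted in this excerpt. Counting occurrences separately for every element does a full scan of the array per element, i.e. O(n^2) work, which is why a 10000x10000 array can take hours.)

# Hypothetical illustration only, not the original poster's code:
# count the occurrences of each element by scanning the whole array
# once per element, i.e. O(n^2) work overall.
function elcount_slow(blobs)
    out = similar(blobs)
    for i in eachindex(blobs)
        out[i] = count(==(blobs[i]), blobs)   # full scan for every element
    end
    return out
end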
Using @btime to time functions over large arrays is not difficult. Remember to prefix the arguments in the function call with $; this tells BenchmarkTools not to treat evaluating them as part of what is being timed (only the time required by however they are used within the function matters).
using BenchmarkTools
# to see the timing info and the returned values
@btime elcount6!($blobs, $blobs_occ)
# to see the timing info only (suppressing the values)
@btime elcount6!($blobs, $blobs_occ);
@JeffreySarnoff, thanks for your time and answer, but it throws an error:
function elcount6!(blobs, blobs_occ)
@inbounds for j in axes(blobs,2), i in axes(blobs,1)
blobs[i,j] = blobs_occ[blobs[i,j]];
end
return blobs
end
using StatsBase   # provides countmap
blobs = rand(0:9,10_000,10_000);
blobs_occ = countmap(vec(blobs));
@btime elcount6!($blobs, $blobs_occ)
julia> @btime elcount6!($blobs, $blobs_occ)
ERROR: KeyError: key 10002175 not found
Stacktrace:
[1] getindex at .\dict.jl:467 [inlined]
...
Some setup seems to be required to @btime an in-place modifying function, but I have not yet figured out how to do it for the function above.
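(A minimal sketch of why the KeyError appears, not taken from the original posts: @btime evaluates the call many times, and the first evaluation already overwrites blobs in place with counts, so later evaluations look up counts such as 10002175, which are not keys of blobs_occ. The names small and small_occ below are just for illustration.)

using StatsBase
small = rand(0:9, 100, 100);
small_occ = countmap(vec(small));
elcount6!(small, small_occ)   # first call works: the values 0:9 are keys
elcount6!(small, small_occ)   # second call throws KeyError: the values are now counts, not keys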
Indeed, a fresh blobs array is required for every evaluation; you can get one with setup= (run before each sample) together with evals=1 (so each sample uses that fresh copy exactly once), like this:
function elcount!(blobs, blobs_occ)
@inbounds for i in eachindex(blobs)
blobs[i] = blobs_occ[blobs[i]];
end
return blobs
end
blobs = rand(0:9,10_000,10_000);
blobs_occ = countmap(vec(blobs));
@btime elcount!(b, $blobs_occ) setup=(b=copy(blobs)) evals=1;
Isn’t that beautiful? It’s expected, since the dictionary of counts is created outside of the benchmarked code. The loop really has nothing to do except look up and assign existing values.
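As a side note, the dictionary construction can be timed on its own to see how it compares with the replacement loop; a minimal sketch, again assuming countmap from StatsBase as above:

using BenchmarkTools, StatsBase
blobs = rand(0:9, 10_000, 10_000);
# time only the construction of the counts dictionary
@btime countmap(vec($blobs));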