Tried to play with map, reduce, mapreduce, list comprehension, dictionary creation, but to no avail…
Please note that I want to do it in one efficient pass rather than iterating by each possible value in arr and going over array many times.
Thanks!
julia> d = Dict{Int, Vector{Int}}()
Dict{Int64, Vector{Int64}}()
julia> for (k, v) in p
c = get!(d, v, Int[]) # get array for key, or create if absent
push!(c, k) # push new index to array
end
julia> d
Dict{Int64, Vector{Int64}} with 8 entries:
4 => [4]
6 => [1]
7 => [8]
2 => [7]
9 => [2, 3]
8 => [9]
3 => [10]
1 => [5, 6]
The implementation looks very similar to the solution from above:
function group(groups, values)
I = eltype(groups) # TODO EltypeUnknown
T = eltype(values) # TODO EltypeUnknown
out = Dictionary{I, Vector{T}}()
@inbounds for (group, value) in zip(groups, values)
tmp = get!(() -> T[value], out, group)
last(tmp) == value || push!(tmp, value)
end
return out
end
Why do you think it would be faster for large array?
First note that you have shared the code of an incorrect method for group, but this is a minor issue.
The implementation is similar but not identical and it is faster, here is an example:
julia> using SplitApplyCombine
julia> using BenchmarkTools
julia> arr = rand(1:1000, 10^6);
julia> function sol(arr)
d = Dict{Int, Vector{Int}}()
for (k, v) in pairs(arr)
c = get!(d, v, Int[])
push!(c, k)
end
return d
end
sol (generic function with 1 method)
julia> @benchmark sol($arr)
BenchmarkTools.Trial: 84 samples with 1 evaluation.
Range (min … max): 51.139 ms … 79.473 ms ┊ GC (min … max): 5.99% … 5.12%
Time (median): 59.537 ms ┊ GC (median): 6.03%
Time (mean ± σ): 59.924 ms ± 4.510 ms ┊ GC (mean ± σ): 5.82% ± 1.20%
▁ █
▃▁▁▁▃▃▃▃▁▃▆▅▃▆▃▃▃▃▃█▆▅▆█▆██▆▃▅▃▃▅▆▁▃▅▅▅▁▃▃▁▃▁▃▃▁▁▁▁▁▁▁▃▅▁▁▃ ▁
51.1 ms Histogram: frequency by time 70.8 ms <
Memory estimate: 82.41 MiB, allocs estimate: 1005018.
julia> @benchmark group($arr, eachindex($arr))
BenchmarkTools.Trial: 158 samples with 1 evaluation.
Range (min … max): 23.603 ms … 42.635 ms ┊ GC (min … max): 0.00% … 6.79%
Time (median): 31.774 ms ┊ GC (median): 0.00%
Time (mean ± σ): 31.785 ms ± 3.384 ms ┊ GC (mean ± σ): 4.44% ± 4.96%
▂ ▂ ▂▁ ▁█▂▄ ▅▂ ▄▂▄▁▁▂ ▁
▃▁▁▁▅▁▃▃▅▃▅▅▃███▃█▃██▆████▅█████████▆█▅▆▃▃▁▁▁▃▃▅▃▁▁▁▁▁▁▁▁▃▅ ▃
23.6 ms Histogram: frequency by time 41.5 ms <
Memory estimate: 21.45 MiB, allocs estimate: 6024.
To fix the problem with sol one needs to make creation of initial Int[] vector lazy and only done when needed:
julia> function sol2(arr)
d = Dict{Int, Vector{Int}}()
for (k, v) in pairs(arr)
c = get!(() -> Int[], d, v)
push!(c, k)
end
return d
end
sol2 (generic function with 1 method)
julia> @benchmark sol2($arr)
BenchmarkTools.Trial: 188 samples with 1 evaluation.
Range (min … max): 22.640 ms … 37.099 ms ┊ GC (min … max): 0.00% … 3.39%
Time (median): 26.503 ms ┊ GC (median): 0.00%
Time (mean ± σ): 26.695 ms ± 2.338 ms ┊ GC (mean ± σ): 1.84% ± 1.88%
▃▂ ▂ ▇ ▂ ▃▃▃█▂ █
▃▃▁▁▄▅█▇██▆█▅█▆▅█▅█████▅██▆▇█▆▅▁▁▃▁▃▄▅▅▁▁▁▁▄▃▃▁▁▄▃▁▁▁▄▁▁▃▃▃ ▃
22.6 ms Histogram: frequency by time 33.8 ms <
Memory estimate: 21.44 MiB, allocs estimate: 6018.
Also I guess eachindex(arr) is lazy and possibly faster than pairs(arr) which I used, right?
I copied one of the implementations, the others with some more explicit types are similar enough in structure.
Did I miss any important implementation tricks hidden there besides what you mentioned with lazy initialization?
This should be equivalent - compiler should optimize this out.
No, this is the most important thing. Everything else, should be similar (that is why I have written that the fact that you shared wrong method code does not matter much). In some cases it might matter that SplitApplyCombine.jl uses Dictionary (which is a custom dictionary) and not Dict (which is standard), but in this test this does not affect the performance.