Hi, I'm trying to normalize (in the DB sense) a Dict{Int8, Vector{Float64}}, but the code never finishes:
d = Dict()
for i in 1:100
    c = rand(Int8)
    n = rand()
    a = get!(d, c, [])
    push!(a, n)
end
map(collect, zip(Iterators.flatten([[(k, i) for i in v] for (k, v) in d])...))
I want to convert something like 1 => [1, 2, 3], 2 => [3] into something like [(1, 1), (1, 2), (1, 3), (2, 3)] and then into [1, 1, 1, 2], [1, 2, 3, 3].
You are passing 100 collections to the zip function, so this is the bottleneck of the code.
You can use this function to accomplish what you want:
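function fill_key_val(dict)
    # Total number of (key, value) pairs after flattening
    n = sum(length, values(dict))
    # Preallocate both output vectors with the dictionary's own types
    key_vector = Array{keytype(dict)}(undef, n)
    val_vector = Array{eltype(valtype(dict))}(undef, n)
    ind = 1
    @inbounds for (k, v) = dict
        for i = v
            # Repeat the key once for every element of its vector
            key_vector[ind] = k
            val_vector[ind] = i
            ind += 1
        end
    end
    key_vector, val_vector
end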
Note that you can specify concrete types for the keys and values of the dictionary with Dict{Int8, Vector{Float64}}(), so the compiler can generate specialized code.
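To illustrate the point about concrete types, here is a minimal sketch that builds the dictionary as Dict{Int8, Vector{Float64}} and flattens it with two comprehensions instead of an explicit loop. The names keys_flat and vals_flat are placeholders of mine, and this is just one possible way to write it, not necessarily the fastest.

```julia
# Typed dictionary: keys are Int8, values are vectors of Float64
d = Dict{Int8, Vector{Float64}}()
for _ in 1:100
    # Fetch the vector stored under a random key (creating it if needed) and append a value
    push!(get!(d, rand(Int8), Float64[]), rand())
end

# Repeat each key once per element of its vector
keys_flat = [k for (k, v) in d for _ in v]
# Concatenate all values in the same iteration order
vals_flat = [x for (_, v) in d for x in v]
```

Both comprehensions iterate over the same unmodified dictionary, so keys_flat[i] and vals_flat[i] belong together.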
Actually, what I meant to say is that the bottleneck is the argument of the zip function itself. You are creating a Tuple with 100 elements, so the compiler struggles with type inference. This problem occurs specifically when you try to iterate the following Generator:
iter = (collect(k) for k in zip(Iterators.flatten([[(k, i) for i in v] for (k, v) in d])...))
iterate(iter)
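If you want to keep the intermediate list of tuples, one way to sidestep the large splat is to build the vector of pairs once and then broadcast first and last over it, so zip never receives 100 arguments. This is only a sketch on the small example from the question; pairs_flat, ks and xs are names I made up for illustration.

```julia
# Small example from the question: 1 => [1, 2, 3], 2 => [3]
d = Dict{Int8, Vector{Float64}}(1 => [1.0, 2.0, 3.0], 2 => [3.0])

# One (key, value) tuple per element, e.g. (1, 1.0), (1, 2.0), ...
pairs_flat = [(k, x) for (k, v) in d for x in v]

# Split the tuples into two vectors without splatting into zip
ks = first.(pairs_flat)
xs = last.(pairs_flat)
```

The order of the result follows the iteration order of the Dict, so sort pairs_flat first if you need the keys in ascending order.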
Are you sure? If you replace for i in 1:100 in your original code with for i in 1:5, it should run in an acceptable time. Of course it is not optimized, but the code should run.
map dispatches on a specialized method for the case of an AbstractArray, which is the result of collect.