Is something like reversed Dict? findall(x -> x=="house", cc) is to slow:/


#1
Dict can find one val by key
> julia> D=Dict("a"=>1,"b"=>1,"c"=>2)
> Dict{String,Int64} with 3 entries:
>   "c" => 2
>   "b" => 1
>   "a" => 1
> 
> julia> get(D,"a",false)
> 1

I need something fast in oposite side:
somethink like findforme(1,D)
output is [“a”,“b”]
findall is to slow
findall(x -> x==“house”, cc)
my collect cc has more then 10^ values and any serch during over 2 sek :confused:
Some hints?
Paul


#2

I seem to recall that a while ago some of us discussed adding a bidirectional dict to DataStructures.jl but no one ever did so. Might still be nice to have.

Fortunately, in Julia it’s extremely easy to create the reverse dictionary, behold!

dict = Dict(rand(Int, 10^5) .=> rand(Int, 10^5))
rdict = Dict(values(dict) .=> keys(dict))

Once it is created you can look up in constant time to your heart’s content.

I just realized that you have repeat values. This is more complicated, and the way of dealing with it will depend on what you are trying to achieve.


#3

Like ExpandingMan says above, the way to solve this is to create a separate dictionary with keys and values reversed. In your case, since you have repeated values, it sounds like you want to map ints to either an array or a set of strings (depending on if you need them ordered and how you’ll access them).


#4

Maybe Dict of tuples? This way can be fast ?

> julia> D=Dict(1=>("a", "b",), 2=> ("c",), 3=>("a","b","d",))
> Dict{Int64,Tuple{String,Vararg{String,N} where N}} with 3 entries:
>   2 => ("c",)
>   3 => ("a", "b", "d")
>   1 => ("a", "b")
> julia> get(D,1,"")
> ("a", "b")
> 
> 
> julia> for i in get(D,2,"")
>        println(i)
>        end
> c
> 
> julia> for i in get(D,1,"")
>        println(i)
>        end
> a
> b

Paul


#5

For the millionth time, can you PLEASE quote your code?


Position of plots on the screen, Plots.jl ,gr()
#6
julia> D=Dict(1=>(“a”, “b”,), 2=> (“c”,), 3=>(“a”,“b”,“d”,))
Dict{Int64,Tuple{String,Vararg{String,N} where N}} with 3 entries:
2 => (“c”,)
3 => (“a”, “b”, “d”)
1 => (“a”, “b”)

julia> get(D,1,"")
(“a”, “b”)

julia> for i in get(D,2,"")
               println(i)
         end
c

julia> for i in get(D,1,"")
              println(i)
         end
a
b

#7

Tuples seem like a poor choice since they’re immutable and IMO not a good fit for this problem. How would you construct such a dictionary programmatically? As I said, I think an array or a set is a better choice.

Read up on data structures and read the sample code in this link: https://docs.julialang.org/en/v1.0/base/collections/


#8

I can remove tuples , Arrays seems more slower …
Paul

W dniu 2018-10-17 o 18:26, Max Bennedich pisze:


#9

Do you have a benchmark showing this?

By all means use tuples if they work for you, but I suspect that you’ll find them harder to work with.


#10

I did it, is very usefull.
serchng time ± 0.000015 seconds in Array was 2 sec. !!

@time get(replslow,"ptaszek","")
   0.000015 seconds (4 allocations: 160 bytes)
("PTASZKOWI", "PTASZKOWI", "ptaszka", "Ptaszek")

replslow
Dict{String,Tuple} with 371432 entries: more then 13*10^6 words

#11

But you created the reversed dictionary by hand.
How would you you make it programatically from the original dictionary?


#12

No, you didn’t understand what I wrote. A dictionary mapping strings to an array or a set of strings.

If what you’ve implemented solves your problem, then by all means go for it, but as you can see in your example, and the examples you removed with your edit, your tuples are full of duplicates, which should not be possible given the problem description in your OP (since you can’t have duplicate keys in a dictionary). It suggests that either your problem is not as described in your OP, your implementation is incorrect, and/or a set would be a better choice than tuples (which removes duplicates).


#13

I merely quoted Paul @programista’s code (surrounded it by ``` characters) to make it easier to read.
One possible answer to your question was given by @ExpandingMan

dict = Dict(rand(Int, 10^5) .=> rand(Int, 10^5))
rdict = Dict(values(dict) .=> keys(dict))

#14

THX , it was only , of course no duplicateson righ side . Thank for Your hint about maping !
Paul


#15

Here’s a not particularly performant example of a way to deal with the non-injectiveness problem:

function revdict(dict::AbstractDict{K,V}) where {K,V}
    o = Dict{V,Vector{K}}()
    for (k, v) ∈ dict
        v ∈ keys(o) ? push!(o[v], k) : (o[v] = [k])
    end
    o
end

(I originally used sizehint! but that’s probably stupid since it’s using dynamically allocated arrays anyway.)


#16

Using get! (foreach is orthogonal, for is fine):

function revdict2(dict::AbstractDict{K,V}) where {K,V}
    o = Dict{V,Vector{K}}()
    foreach(((k, v),) -> push!(get!(() -> Vector{K}(), o, v), k), dict)
    o
end

#17

Julia gives you so many ways to write things in as few lines as possible, it’s always a fun challenge to see how few lines you can write your function in! :smile:

Come to think of it, this is a reason why broadcasting is one of my favorite features of Julia. It gives you so many more options for writing succinct but also perfectly comprehensible expressions (granted it’s not really applicable here because we’re not dealing with arrays).

Although, no offense @Tamas_Papp but your second line here confuses the hell out of me :laughing:.


#18

Too much time spent working in Common Lisp, I guess :wink:

The loop version would be

for (k, v) in dict
    push!(get!(() -> Vector{K}(), o, v), k)
end

get! is a particularly handy accessor for accumulating in Dicts, inserting the result of the function (first argument, creates a Vector{K}) if the key is not found.


#19

Incidentally, it was this part that I had to stare at for a few minutes to understand what it was doing., not the foreach (though the combination of course takes a bit longer to parse).

On a related note, is there some reason that the function argument to get! doesn’t take the key as an argument? That seems like a missed opportunity.


#20

On the contrary, I think the purpose of get! is to deal with the same key twice if necessary, but save the additional lookup.