Is something like reversed Dict? findall(x -> x=="house", cc) is to slow:/

programista · October 17, 2018, 2:20pm

Dict can find one val by key
> julia> D=Dict("a"=>1,"b"=>1,"c"=>2)
> Dict{String,Int64} with 3 entries:
>   "c" => 2
>   "b" => 1
>   "a" => 1
> 
> julia> get(D,"a",false)
> 1

I need something fast in oposite side:
somethink like findforme(1,D)
output is [“a”,“b”]
findall is to slow
findall(x → x==“house”, cc)
my collect cc has more then 10^ values and any serch during over 2 sek
Some hints?
Paul

ExpandingMan · October 17, 2018, 2:27pm

I seem to recall that a while ago some of us discussed adding a bidirectional dict to DataStructures.jl but no one ever did so. Might still be nice to have.

Fortunately, in Julia it’s extremely easy to create the reverse dictionary, behold!

dict = Dict(rand(Int, 10^5) .=> rand(Int, 10^5))
rdict = Dict(values(dict) .=> keys(dict))

Once it is created you can look up in constant time to your heart’s content.

I just realized that you have repeat values. This is more complicated, and the way of dealing with it will depend on what you are trying to achieve.

bennedich · October 17, 2018, 2:42pm

Like ExpandingMan says above, the way to solve this is to create a separate dictionary with keys and values reversed. In your case, since you have repeated values, it sounds like you want to map ints to either an array or a set of strings (depending on if you need them ordered and how you’ll access them).

programista · October 17, 2018, 3:14pm

Maybe Dict of tuples? This way can be fast ?

> julia> D=Dict(1=>("a", "b",), 2=> ("c",), 3=>("a","b","d",))
> Dict{Int64,Tuple{String,Vararg{String,N} where N}} with 3 entries:
>   2 => ("c",)
>   3 => ("a", "b", "d")
>   1 => ("a", "b")
> julia> get(D,1,"")
> ("a", "b")
> 
> 
> julia> for i in get(D,2,"")
>        println(i)
>        end
> c
> 
> julia> for i in get(D,1,"")
>        println(i)
>        end
> a
> b

Paul

StefanKarpinski · October 17, 2018, 3:24pm

For the millionth time, can you PLEASE quote your code?

jandehaan · October 17, 2018, 4:04pm

julia> D=Dict(1=>(“a”, “b”,), 2=> (“c”,), 3=>(“a”,“b”,“d”,))
Dict{Int64,Tuple{String,Vararg{String,N} where N}} with 3 entries:
2 => (“c”,)
3 => (“a”, “b”, “d”)
1 => (“a”, “b”)

julia> get(D,1,"")
(“a”, “b”)

julia> for i in get(D,2,"")
               println(i)
         end
c

julia> for i in get(D,1,"")
              println(i)
         end
a
b

bennedich · October 17, 2018, 4:21pm

Tuples seem like a poor choice since they’re immutable and IMO not a good fit for this problem. How would you construct such a dictionary programmatically? As I said, I think an array or a set is a better choice.

Read up on data structures and read the sample code in this link: Collections and Data Structures · The Julia Language

programista · October 17, 2018, 4:29pm

I can remove tuples , Arrays seems more slower …
Paul

W dniu 2018-10-17 o 18:26, Max Bennedich pisze:

bennedich · October 18, 2018, 6:53am

Do you have a benchmark showing this?

By all means use tuples if they work for you, but I suspect that you’ll find them harder to work with.

programista · October 18, 2018, 2:40pm

I did it, is very usefull.
serchng time ± 0.000015 seconds in Array was 2 sec. !!

@time get(replslow,"ptaszek","")
   0.000015 seconds (4 allocations: 160 bytes)
("PTASZKOWI", "PTASZKOWI", "ptaszka", "Ptaszek")

replslow
Dict{String,Tuple} with 371432 entries: more then 13*10^6 words

Juan · October 18, 2018, 3:41pm

But you created the reversed dictionary by hand.
How would you you make it programatically from the original dictionary?

bennedich · October 18, 2018, 3:43pm

No, you didn’t understand what I wrote. A dictionary mapping strings to an array or a set of strings.

If what you’ve implemented solves your problem, then by all means go for it, but as you can see in your example, and the examples you removed with your edit, your tuples are full of duplicates, which should not be possible given the problem description in your OP (since you can’t have duplicate keys in a dictionary). It suggests that either your problem is not as described in your OP, your implementation is incorrect, and/or a set would be a better choice than tuples (which removes duplicates).

jandehaan · October 18, 2018, 7:20pm

I merely quoted Paul @programista’s code (surrounded it by ``` characters) to make it easier to read.
One possible answer to your question was given by @ExpandingMan

dict = Dict(rand(Int, 10^5) .=> rand(Int, 10^5))
rdict = Dict(values(dict) .=> keys(dict))

programista · October 18, 2018, 7:40pm

THX , it was only , of course no duplicateson righ side . Thank for Your hint about maping !
Paul

ExpandingMan · October 18, 2018, 7:49pm

Here’s a not particularly performant example of a way to deal with the non-injectiveness problem:

function revdict(dict::AbstractDict{K,V}) where {K,V}
    o = Dict{V,Vector{K}}()
    for (k, v) ∈ dict
        v ∈ keys(o) ? push!(o[v], k) : (o[v] = [k])
    end
    o
end

(I originally used sizehint! but that’s probably stupid since it’s using dynamically allocated arrays anyway.)

Tamas_Papp · October 19, 2018, 6:15am

Using get! (foreach is orthogonal, for is fine):

function revdict2(dict::AbstractDict{K,V}) where {K,V}
    o = Dict{V,Vector{K}}()
    foreach(((k, v),) -> push!(get!(() -> Vector{K}(), o, v), k), dict)
    o
end

ExpandingMan · October 19, 2018, 1:24pm

Julia gives you so many ways to write things in as few lines as possible, it’s always a fun challenge to see how few lines you can write your function in!

Come to think of it, this is a reason why broadcasting is one of my favorite features of Julia. It gives you so many more options for writing succinct but also perfectly comprehensible expressions (granted it’s not really applicable here because we’re not dealing with arrays).

Although, no offense @Tamas_Papp but your second line here confuses the hell out of me .

Tamas_Papp · October 19, 2018, 1:36pm

Too much time spent working in Common Lisp, I guess

The loop version would be

for (k, v) in dict
    push!(get!(() -> Vector{K}(), o, v), k)
end

get! is a particularly handy accessor for accumulating in Dicts, inserting the result of the function (first argument, creates a Vector{K}) if the key is not found.

ExpandingMan · October 19, 2018, 1:45pm

Incidentally, it was this part that I had to stare at for a few minutes to understand what it was doing., not the foreach (though the combination of course takes a bit longer to parse).

On a related note, is there some reason that the function argument to get! doesn’t take the key as an argument? That seems like a missed opportunity.

Tamas_Papp · October 20, 2018, 5:43am

On the contrary, I think the purpose of get! is to deal with the same key twice if necessary, but save the additional lookup.

Topic		Replies	Views
Dictionary - inverse and consolidating New to Julia question , dictionary	9	1429	August 11, 2022
Performance of Dictionaries.jl vs Base.Dict Internals & Design package , performance , dictionary , dictionaries	32	3226	December 13, 2024
Collect values in a dict New to Julia	10	599	July 16, 2021
Invertable Dict structure? General Usage	5	1194	November 21, 2017
Fast Dict value modification by accessing Dict.vals Performance	16	3651	March 14, 2019

Is something like reversed Dict? findall(x -> x=="house", cc) is to slow:/

Related topics