What is the fastest data type out of these?

Which is the faster data type out of:

  • Dict
  • Set

Is there any data type there that is faster than both?
I would really like to know, thank you.

As Julia is mostly written in Julia, it’s pretty easy to understand what they do:
https://github.com/JuliaLang/julia/blob/master/base/dict.jl and https://github.com/JuliaLang/julia/blob/master/base/set.jl.

As for performance, it is best to benchmark in a setting that mimics your application, using BenchmarkTools.jl.
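A minimal sketch of such a benchmark, assuming BenchmarkTools.jl is installed. The key format, the collection size, and the names `keys_vec`, `d`, `s`, and `probe` are made up for illustration; substitute data that resembles your own.

```julia
using BenchmarkTools  # external package: import Pkg; Pkg.add("BenchmarkTools")

# Hypothetical workload: 100_000 string keys (an assumption, not from the thread)
keys_vec = ["key$i" for i in 1:100_000]
d = Dict(k => i for (i, k) in enumerate(keys_vec))  # key => position
s = Set(keys_vec)                                   # keys only
probe = "key50000"

# `$` interpolates the global variables so that only the lookup is timed
@btime $d[$probe]          # Dict lookup returning the value
@btime haskey($d, $probe)  # Dict membership test
@btime $probe in $s        # Set membership test
```

Timings depend on the key type, the collection size, and the machine, so treat the numbers as specific to your own setup.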


The dataset is dynamic, and I worry about performance with large-scale datasets.
So I mostly want to know which is faster query-wise.

Internally, Sets in Julia are implemented using Dicts (keys only, no values). Both use hashing for lookups.
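A small sketch that illustrates this. Note that `dict` is an internal field of `Set` as defined in the `base/set.jl` file linked above, not public API, so it may change between Julia versions.

```julia
s = Set(["a", "b"])

# Internal detail, shown for illustration only: the Set wraps a Dict
# whose values are all `nothing`
backing = s.dict
backing isa Dict{String,Nothing}

# Membership on the Set and on its backing Dict give the same answer
"a" in s
haskey(backing, "a")
```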


So in theory Dicts should complete a query faster?

I would assume that they offer the same performance. What do you actually mean by query?


For example:

a = Dict("key" => "value")
a["key"]  # look up the value stored under "key"

With a["key"] on a Dict you get the value stored for the specified key. A Set stores only keys, so all you can ask is whether a key is present. They are not the same data structure.
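To make the difference concrete, here is a short sketch of the queries each type supports, using only functions from Base:

```julia
d = Dict("key" => "value")
v = d["key"]                         # lookup by key returns the stored value
present = haskey(d, "key")           # membership test on the keys
fallback = get(d, "other", nothing)  # default instead of a KeyError when absent

s = Set(["key"])
in_set = "key" in s                  # a Set can only answer membership questions
```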


Oh okay, I understand, thank you. I guess a Dict is the way to go.

What about named tuples, though?

I wouldn’t worry about performance in the beginning. Do what you want to do, and if it’s too slow, people in this forum will certainly help you.


Thank you.
I will have large datasets, and I need to decide whether to store them and which data type to use.
All I need is to be able to change values by key and add values by key.
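Those two operations map directly onto a Dict. A minimal sketch, where the name `scores` and the `String`/`Int` types are placeholders for whatever the real data looks like:

```julia
scores = Dict{String,Int}()   # hypothetical name and key/value types

scores["alice"] = 1           # add a value by key
scores["alice"] = 2           # change the value of an existing key

# Update with a default when the key may not exist yet
scores["bob"] = get(scores, "bob", 0) + 1

# Insert only if the key is absent, and return the stored value
carol = get!(scores, "carol", 0)

# Optional: reserve capacity up front if the final size is roughly known
sizehint!(scores, 1_000_000)
```

Declaring concrete key and value types, as in `Dict{String,Int}`, generally helps performance compared with a `Dict{Any,Any}`.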

That’s it. It looks like both Dict and NamedTuple are capable of this, so I was wondering which is faster. Looking at the post you linked, it looks like NamedTuples are way faster.

EDIT: I take back what I said, NamedTuples are immutable, I will go with Dict

No it’s not

But you can’t change their values, according to the post Lanorg linked.

Named tuples are a good replacement for Symbol-keyed Dicts if (1) the named tuple is tiny, and (2) the hashing can happen at compile-time.

In all other cases, they have terrible performance compared to a Dict. In other words, use them as syntactic sugar to avoid an enum that maps human-readable names to tuple-indices.
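A rough sketch of the tiny, fixed-key case where a named tuple fits. The field names `width` and `height` and the helper functions are invented for illustration:

```julia
# Tiny, fixed set of keys known when the code is written
nt = (width = 3, height = 4)
area_nt(p) = p.width * p.height     # field names are literals the compiler can resolve

# The same data in a Symbol-keyed Dict
d = Dict(:width => 3, :height => 4)
area_d(p) = p[:width] * p[:height]  # each access goes through a hash lookup at run time

area_nt(nt)
area_d(d)
```

With many keys, or keys only known at run time, the Dict is the better fit.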

Alright, thank you for the clarification

Tuples and NamedTuples are of course immutable, e.g.

julia> isimmutable((a = 1, b = 2))
true
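A short sketch of what that means in practice. It uses `ismutable`, which assumes Julia 1.5 or later (where it replaces `isimmutable`), and `merge` to build a modified copy:

```julia
nt = (a = 1, b = 2)

mutable_flag = ismutable(nt)   # false: the fields cannot be reassigned
# `nt.a = 10` would throw an error here

# "Changing" a value means constructing a new NamedTuple
nt2 = merge(nt, (a = 10,))     # new tuple with `a` replaced; `nt` is untouched
```

For data whose values change often, this copying is one more reason to prefer a Dict.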

Thanks for confirming that. So all my uses of them so far have, unknowingly, been cases where immutability didn't matter.

Just benchmark them in your particular application.
