When to use a Dictionary versus a Struct in Julia?

00krishna · May 5, 2020, 7:00pm

I have been doing my best to exorcise some bad habits from the Python world, like the habitual use of dictionaries. As I understand Julia, structs and NamedTuples seem to perform a lot better because of the Julia type system. These data structures also seem a bit more Julian, since there is less ambiguity about the types of the values they hold.

However, I was looking at the DrWatson project from JuliaDynamics, and noticed that they use dictionaries to store parameter values and such. So I was trying to figure out some rules of thumb for when to use dictionaries versus when to use structs. This question partly boils down to: when does using a dictionary not impair other optimizations in the code, and how would you know if those optimizations were impaired?

Any suggestions on when to use dictionaries in the “right” way, meaning no impairment in the performance of the rest of your code?

Oh yes, and I don’t mean to disparage DrWatson, which seems like a great package. It just made me think about using Dictionaries, but otherwise the developers of that package know a lot more about Julia programming than I do.

pixel27 · May 5, 2020, 7:07pm

I believe the recommendation is to use dictionaries when you don’t know the set of keys ahead of time. Basically, if you know what the keys are going to be when writing the code, then use a struct or a NamedTuple. If the key names are going to be based on some input, then a dictionary would be the way to go.
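A minimal sketch of that rule of thumb. The names `SimParams` and `count_words` are hypothetical and only serve as illustration; they do not come from DrWatson or any other package.

```julia
# Keys known when writing the code: a struct or a NamedTuple fits.
struct SimParams            # hypothetical parameter type
    alpha::Float64
    steps::Int
end

p_struct = SimParams(0.5, 100)
p_nt = (alpha = 0.5, steps = 100)   # NamedTuple with the same fields

# Keys only known at run time (here: words taken from input text): a Dict fits.
function count_words(text::AbstractString)
    counts = Dict{String,Int}()
    for word in split(text)
        key = String(word)                      # split yields SubStrings
        counts[key] = get(counts, key, 0) + 1
    end
    return counts
end

counts = count_words("a b a c a")
```

In the first case a misspelled field such as `p_struct.alhpa` fails with an error naming the missing field, while a Dict will happily accept any new key.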

mauro3 · May 5, 2020, 7:24pm

There is also a potential trade-off between compile time and runtime performance. If you have many sets of different parameters and store them in named tuples, all methods will be specialized on each different named tuple type (i.e. they will be re-compiled). For a Dict, this will not be the case. However, once compiled, named tuples are likely faster. Note also that there are different kinds of dicts with different performance characteristics, e.g. LittleDict from OrderedCollections.jl (JuliaCollections/OrderedCollections.jl on GitHub).
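A small sketch of why this happens, assuming a hypothetical helper `total`: the field names are part of a NamedTuple's type, so two parameter sets with different names are two different types, whereas two Dicts with the same key and value types share one type.

```julia
nt1 = (a = 1.0, b = 2.0)
nt2 = (a = 1.0, c = 2.0)            # different field names
d1 = Dict(:a => 1.0, :b => 2.0)
d2 = Dict(:a => 1.0, :c => 2.0)     # different keys, same Dict type

same_nt_type = typeof(nt1) == typeof(nt2)     # false: names are in the type
same_dict_type = typeof(d1) == typeof(d2)     # true: both Dict{Symbol, Float64}

# Hypothetical function taking a parameter set. Julia compiles one
# specialization per concrete argument type it is called with.
total(params) = sum(values(params))

total(nt1); total(nt2)   # two NamedTuple types, so two specializations
total(d1); total(d2)     # one Dict type, so one specialization
```

With a handful of parameter sets the extra compilation is negligible; it only starts to matter when many distinct NamedTuple types flow through a large amount of code.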

00krishna · May 6, 2020, 4:26pm

@jakobnissen also posted these tips on Slack. I asked him if he was okay with me reposting these comments, and he said he was fine with it.

There are a few issues at hand here.

First, Dicts are slow. “Slow” here may be 1 microsecond for read/write operations. For many user-facing applications this doesn’t matter, but you probably shouldn’t have any internals relying on Dicts, because they then become impossible to optimize (looking at you, Python!).
Second, they’re memory inefficient. Same story as the previous point.
Crucially, they are mutable. That makes it hard for both the programmer and the compiler to figure out exactly what kind of data they contain at a given point. With a named tuple, you know for sure which fields it contains at all times. I think this is the most important aspect: Dicts are hard to reason about.
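
On the question of how to tell whether the compiler has lost track of the types: one way is to ask what return type it infers. The sketch below uses the non-exported `Base.return_types`; at the REPL, `@code_warntype` gives the same information interactively. The function `scaled` and the parameter names are hypothetical.

```julia
params_dict = Dict{Symbol,Any}(:rate => 0.1, :n => 10)   # values typed as Any
params_nt = (rate = 0.1, n = 10)                         # field types are known

# Hypothetical function with one method per container kind.
scaled(p::Dict) = p[:rate] * p[:n]
scaled(p::NamedTuple) = p.rate * p.n

# Inferred return type for each argument type.
dict_ret = only(Base.return_types(scaled, (typeof(params_dict),)))
nt_ret = only(Base.return_types(scaled, (typeof(params_nt),)))

println("Dict{Symbol,Any}: ", dict_ret)
println("NamedTuple:       ", nt_ret)
```

An inferred `Any` means everything downstream of that call is dispatched at run time. A Dict with a concrete value type, such as `Dict{Symbol,Float64}`, does not have this particular inference problem, although the lookup cost remains.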
