Alternatives to Dictionaries in Julia

I’ve been slowly porting my Java and Python code to Julia and frequently come across statements like “dictionaries are less commonly used in Julia”.

I catch myself using dictionaries all the time in Julia, as a direct replacement for (many) Java maps and Python dictionaries. The keys give you the advantages of a set, the values are easy to filter, and each value stays linked to its key.

Am I doing this wrong? Should I be replacing maps and dicts with some other data structure?


DataFrames and NamedTuples are popular alternatives, depending on the use case. Dicts are also popular, as are Dictionaries.jl dicts.


What kind of Dict are you using? I would begin to worry if one or both of the type parameters to Dict is frequently Any or an abstract type.



It’s not wrong to use Dicts. They are a basic data structure used in all kinds of languages. It’s perfectly idiomatic to use Dicts in Julia where they make sense algorithmically.
That being said, Python does use dicts way more than we would in Julia.

  • In Python, all instances of custom classes contain a dict by default, not so in Julia
  • In Python, keyword arguments are stored in a dict, whereas in Julia, they are in a NamedTuple
  • In Julia, NamedTuples can often be used instead of Dicts with benefit, if the set of keys is both small and known at compile time.
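
To make the NamedTuple points above concrete, here is a minimal sketch. The record fields (`ticker`, `price`, `shares`) and the helper `collect_keys` are made up for illustration and do not come from the thread.

```julia
# Hypothetical record with a small, fixed set of keys.
quote_nt = (ticker = "ABC", price = 12.5, shares = 100)
quote_dict = Dict(:ticker => "ABC", :price => 12.5, :shares => 100)

# Each NamedTuple field keeps its own concrete type.
typeof(quote_nt.price)     # Float64

# The Dict has to widen its value type to hold String, Float64 and Int.
valtype(quote_dict)        # Any

# Keyword arguments captured with `kwargs...` are backed by a NamedTuple.
collect_keys(; kwargs...) = keys(kwargs)
kwargs_are_namedtuple(; kwargs...) = values(kwargs) isa NamedTuple

collect_keys(a = 1, b = 2)            # (:a, :b)
kwargs_are_namedtuple(a = 1, b = 2)   # true
```

The NamedTuple cannot gain or lose keys after creation, which is the price paid for the type information.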

There’s nothing wrong with dictionaries per se, but depending on how you use them other alternatives may be relatively more attractive in Julia.

  • Named tuples are conveniently available. They come with tradeoffs, though: they are immutable and heterogeneously typed. They are particularly useful when you want to pass around read-only data with a small number of fields.

  • Custom types (struct and mutable struct) are low overhead to create and more practical than dictionaries if you want to take advantage of dispatch.

  • Non-concretely typed dictionaries can cause inference failures, which can be a significant concern in Julia.
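
As a rough illustration of the custom-type bullet above, the sketch below uses hypothetical types (`Instrument`, `Stock`, `Bond`) and functions (`describe`, `market_value`) that are assumptions for this example only.

```julia
# Hypothetical type hierarchy, not taken from the thread.
abstract type Instrument end

struct Stock <: Instrument
    ticker::String
    price::Float64
    shares::Int
end

struct Bond <: Instrument
    issuer::String
    price::Float64
end

# One method per concrete type; Julia selects the method from the argument type.
describe(s::Stock) = "stock $(s.ticker)"
describe(b::Bond) = "bond from $(b.issuer)"

# Fields have concrete types, so this computation is fully inferred.
market_value(s::Stock) = s.price * s.shares

describe(Stock("ABC", 10.0, 3))       # "stock ABC"
market_value(Stock("ABC", 10.0, 3))   # 30.0
```

With a `Dict{Symbol, Any}` in place of `Stock`, the same data would offer nothing to dispatch on and the field types would be unknown to the compiler.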


My typical use case is to take a list of limited size, like the list of stocks owned in a portfolio, Vector{Security}, then for each one get the share price and shares outstanding. Now you have Dict{Security, Tuple{Float64, BigInt}}. Then you can filter, do additional processing, and ultimately return a dictionary of stocks and transactions, Dict{Security, Order}.
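
A minimal sketch of that workflow might look like the following. The `Security` struct, the tickers, and the numbers in `fake_quotes` are all placeholders assumed for the example, and `Int` is used for the share count instead of `BigInt`.

```julia
# Hypothetical stand-in for the poster's Security type.
struct Security
    ticker::String
end

portfolio = [Security("AAA"), Security("BBB"), Security("CCC")]

# Made-up numbers standing in for a real price / share-count lookup.
fake_quotes = Dict("AAA" => (10.0, 1_000), "BBB" => (250.0, 40), "CCC" => (3.5, 20_000))

# Security => (share price, shares outstanding), with concrete type parameters.
quotes = Dict{Security, Tuple{Float64, Int}}(s => fake_quotes[s.ticker] for s in portfolio)

# Keep only securities priced above 5.0; `filter` on a Dict receives key => value pairs.
expensive = filter(kv -> kv.second[1] > 5.0, quotes)
```

Here `expensive` keeps the entries for "AAA" and "BBB" and has the same concrete type as `quotes`.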

Does this make sense?

Offtopic: you need a BigInt for a count of shares? It seems like Int64 should always suffice.


It sounds like tabular data to me. As others have mentioned, a DataFrame might be suitable for that.

You would have a DataFrame of all stocks in all portfolios. You can then group, filter, aggregate, join, … This should sound familiar if you have experience with relational databases and SQL.

But anyway… there is nothing wrong with dictionaries!


If you need access by key, and this is the most common access pattern, then a dictionary is the natural structure to use!
Otherwise, you may want to look at simple arrays, like data = [(security=Security(...), price=..., shares=...), (security=Security(...), price=..., shares=...), ...]. It’s basically a lightweight table, and you can use all kinds of common tabular operations, as well as functions from the vast Julia array ecosystem.
Converting such an array to a dict for fast lookup is also easy: prices = Dict(r.security => r.price for r in data).
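
As a sketch of the array-of-NamedTuples approach, with ticker strings standing in for the `Security` objects and made-up prices and share counts:

```julia
# Hypothetical data; plain strings replace the Security objects from the post.
data = [
    (security = "AAA", price = 10.0, shares = 1_000),
    (security = "BBB", price = 250.0, shares = 40),
    (security = "CCC", price = 3.5, shares = 20_000),
]

# Ordinary array functions double as table operations.
cheap = filter(r -> r.price < 50, data)               # rows for "AAA" and "CCC"
total_value = sum(r.price * r.shares for r in data)   # 90000.0
by_price = sort(data; by = r -> r.price)              # cheapest first

# Build a lookup only when keyed access is actually needed.
prices = Dict(r.security => r.price for r in data)
prices["BBB"]                                         # 250.0
```

Every row has the same concrete NamedTuple type, so the vector's element type is concrete as well.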


Maybe this was already said, but I didn’t see it while skimming the other comments, so I’d just add that it’s pretty easy to end up with a Dict{Any, Any}, which really messes up type inference (and therefore performance). If it’s possible to annotate the types when instantiating the Dict, like d = Dict{Symbol, Int}(), it will help ward off some of the problems that the “don’t use Dict” advice is trying to prevent.
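
A short sketch of the difference, with variable names chosen only for illustration:

```julia
# An untyped empty Dict accepts anything, so both parameters are Any.
loose = Dict()
typeof(loose)                        # Dict{Any, Any}

# Annotating the parameters keeps key and value types concrete.
d = Dict{Symbol, Int}()
d[:a] = 1
d[:b] = 2

# Literal constructors infer concrete parameters when the pairs agree...
typeof(Dict(:a => 1, :b => 2))       # Dict{Symbol, Int64} on a 64-bit machine
# ...but the value type widens as soon as the values differ in type.
typeof(Dict(:a => 1, :b => "two"))   # Dict{Symbol, Any}
```

Assigning a value of the wrong type to `d`, for example `d[:c] = "three"`, throws an error instead of silently widening the container.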

Edit: oops sorry @GunnarFarneback


We can all dream :smile:
