Performance of array of arrays vs. dictionaries when accessing and manipulating data structures

I here have two different structs that represent the same data. The main objective is to achieve as fast as possible performance.

#Using array of arrays representation
struct Route_array

#Using dictionaries to represent the data
struct Route_dict

Which of these formulations would be optimal if you have to access and update the values of structs more than say ~10^10 times? The dictionary version would be the preferable one since it would make the following code easier to write, but I am concerned about losing performance.

EDIT: The data structures comes from the an optimization problem where the goal is to find the optimal way to deliver fuel. The route structs that you see is the foundation of a solution to this problem. (The structs here have been significantly reduced, to better understand the difference between the two formulations. In the original struct there are 8 members). A complete solution to my instance of the problem will contain around 3000 of these route structs. These routes are then nested inside another structs (Called Solution) which together with the routes also contain the objective value (given in km) and the current level of fuel in all tanks and all stations.

From here the goal is to improve the solution by, for example, creating or deleting route structs. It could also be by making smaller changes to an existence route struct, this could be done by changing the order in which the stations and tanks are visited or how much fuel (amount) are put in each of these. Again, there are 8 structs members in the original struct, so what you see have been significantly reduced. But essentially, I access the route struct many, many times in which a function adds something to the struct members or change some of the values in the struct. Calculating these changes can involve some operations the scope of which depends to how big the changes are. For example, when creating a new route struct I need to do “a lot” of calculations (Determine amount to each tank, determine if there is an vehicle available, making sure that constraints are not violated, etc…).

1 Like

Naively (without knowing how these structs are used, which is crucial), the former should be faster because there’s no hashing or rehashing involved compared to a Dict. Practically, it depends on how you use them - if the access is even somewhat rare, any real computation done on them will matter more than the time difference in accessing these.

EDIT: Though a slight performance upgrade due to caching effects may be achieved if you have a Matrix instead of a Vector{Vector{Int}} - the former is stored in contiguous memory, while the latter is disjointed and scattered, leading to more cache misses.


If you have a contiguous array of anything, I can’t think of any reason to ever prefer using a Dict{Int, X} over a Vector{X}. The Dict layout is more complicated, so lookup and modification will be slower.

If you don’t have a contiguous array, then a Dict might make sense, although there are other choices to consider like a sparse vector.

This seems like the most relevant part. A Vector seems like the obvious answer to the question you’ve posed, but I bet the actual answer will depend a lot on what you mean here.


Not without reason someone called his book data structures and algorithms = programs. Maybe you should show us something more about what you trying to achieve with your data structures…

However, the two are not equivalent, a Vector{Vector{Int}} may have each inner row/column of different length, the Matrix solution needs to support either Nothing/Missing to do the same and may waste a lot of memory.


Yeah, sparsity maybe a key element here (and even then there a probably alternate solutions).

Edit: sorry for inappropriate comment.

1 Like