Is Julia Array an array or linked list?

My understanding is that a data structure cannot be an array for random access and a linked list for sequential access at the same time. There must be two different data structures for dedicated purposes.

But the following result shows that Julia Array behaves both as an array and a linked list:

Array shows constant insertion time at front/end.

julia> using BenchmarkTools

julia> a = rand(10^3); @btime push!($a, 1);
  7.499 ns (0 allocations: 0 bytes)

julia> a = rand(10^6); @btime push!($a, 1);
  7.400 ns (0 allocations: 0 bytes)

julia> a = rand(10^9); @btime push!($a, 1);
  7.499 ns (0 allocations: 0 bytes)
  
julia> a = rand(10^9); @btime pushfirst!($a, 1);
  6.899 ns (0 allocations: 0 bytes)

julia> a = rand(10^6); @btime pushfirst!($a, 1);
  6.899 ns (0 allocations: 0 bytes)

julia> a = rand(10^5); @btime pushfirst!($a, 1);
  6.899 ns (0 allocations: 0 bytes)

Whereas below shows constant random access time:

julia> a = rand(10^3); @btime $a[Int(end/2)]
  4.499 ns (0 allocations: 0 bytes)
0.7178230501269474

julia> a = rand(10^6); @btime $a[Int(end/2)]
  4.499 ns (0 allocations: 0 bytes)
0.4156162488532731

How can this be possible?

Note that in Julia terminology, you are talking about Vector, which is Array{T,1} (some languages without multidimensional arrays do call vectors arrays, but not Julia).

Julia allows Vectors to be extended, and clever overallocation heuristics make it fast.

2 Likes

Thank you for the explanation. Yes, I was talking about Vector. Right, It is indeed a random access array but with extra efficiency and I just came to know that it takes a lot more time if I try inserting at a middle position, which shows it is basically an array (like in C and Java).

Note that you do not need to find out these things indirectly by benchmarking; all the data structures in Base are documented (even though for performance models you may have to dig a bit deeper).

If you need other data structures, I would recommend

2 Likes

Hi! I was also wondering about the data structure behind Julia’s Vector and found this discussion very helpful! I wasn’t able to find this info elsewhere.

I have a final question just to dissipate any shadow of doubt: as far as I know, in C++ the elements of a std::vector are allocated in a single chunk of memory and you can insert elements with constant time overhead only at the end of it (http://www.cplusplus.com/reference/vector/vector/). On the other hand, a std::deque allows insertions in constant time at the beginning and at the end but can have the elements scattered in different chunks of memory (http://www.cplusplus.com/reference/deque/deque/). Both allow for random access, and one could expect vectors to be slightly faster (I don’t know if this is noticeable though).

My question then is: is the data structure behind Vector in Julia more like a std::vector or like a std::deque?

Thanks a lot!!

8 posts were split to a new topic: 1d arrays vs 1-column matrices

I am sorry but I don’t understand the question. Generally for Vector{T} you don’t get to choose a storage model for a given T.

Vector{T} is not necessarily optimal for pushfirst! — again, if you want a deque, use a deque. AFAIK neither is “scattered in different chunks”.

FWIW, I don’t find it helpful to think about the data structure “behind” Vector when programming Julia. Arrays are the data structure per se.

FWIW, Julia’s Array type is smart about insertions at either end (the usual tactic of growing the array more than requested is applied to both push! and pushfirst!), but it’s still always a contiguous chunk of memory.

4 Likes

That’s exactly what I needed. Thanks!