How exactly are Julia Arrays Implemented?

I’m curious how exactly Julia arrays are implemented.

julia> t = Array{Float64, 2}
Matrix{Float64} (alias for Array{Float64, 2})

julia> isabstracttype(t)
false

julia> isprimitivetype(t)
false

julia> isstructtype(t)
true

julia> fieldnames(t)
()

Where is the data stored if there are no fields?

They are implemented in C. It would be nice to move more of the complexity to the Julia side eventually, but no one has gotten around to doing it yet.
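
As a rough illustration of the point that the element data lives outside any Julia-visible field, the sketch below reads an element straight from the array's memory. It uses only Base functions (`pointer`, `unsafe_load`, `GC.@preserve`); the variable names are just for this example.

```julia
A = [1.0, 2.0, 3.0]

# Record the pointer type for inspection; it points at the first element.
ptr_type = typeof(pointer(A))

# GC.@preserve keeps A alive while we work with a raw pointer into it.
second = GC.@preserve A begin
    p = pointer(A)        # raw address of the element data
    unsafe_load(p, 2)     # read the second element directly from memory
end
```

This is only meant to show that the data is reachable through a raw pointer; ordinary code should keep using normal indexing.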

In C:

See also: What is special about Array, String, and Symbol?

Would there be major benefits to be had? I imagine array dimensions would move into Julia’s type system—would this be valuable for dispatch or bounds checking?

There would be two main benefits.

  1. Faster empty array construction. Right now, constructing a Float64[] takes about 20ns when it should take more like 6ns (the difference is because we aren’t able to constant propagate some of the information across the language barrier).
  2. The implementation would involve adding a C type for a fixed-size mutable buffer, which would be a better fit for the lists used inside other data structures (e.g. Dict). Those currently carry some memory overhead as a result of the multi-dimensionality and resizing that Arrays support.
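
To make the second point a bit more concrete, here is a minimal sketch of what a fixed-size mutable buffer looks like from the user's side. `FixedBuffer` is a hypothetical name invented for this example, and it wraps a `Vector` only as a stand-in for a raw allocation, so it does not actually save any memory; it just shows the reduced interface (indexing, but no growing).

```julia
# Hypothetical type for illustration only; not part of Base.
struct FixedBuffer{T} <: AbstractVector{T}
    data::Vector{T}   # stand-in for a raw, fixed-size allocation
    FixedBuffer{T}(::UndefInitializer, n::Integer) where {T} =
        new{T}(Vector{T}(undef, n))
end

# The length is fixed at construction time.
Base.size(b::FixedBuffer) = (length(b.data),)

# Elements can be read and overwritten in place.
Base.getindex(b::FixedBuffer, i::Int) = b.data[i]
Base.setindex!(b::FixedBuffer, v, i::Int) = (b.data[i] = v; b)

# No push! or resize! methods are defined for this type here.
buf = FixedBuffer{Int}(undef, 4)
for i in 1:4
    buf[i] = 10 * i
end
```

A container such as a hash table that manages its own growth only needs this much from its backing storage, which is the motivation described above.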

This seems like something that could have been mildly more ergonomic if Base Arrays were implemented in Julia (docstring for StaticArrays.jl SMatrix):

SMatrix{S1, S2}(mat::Matrix)

  Construct a statically-sized matrix of dimensions S1 × S2 using the data from mat. The parameters S1 and S2 are
  mandatory since the size of mat is unknown to the compiler (the element type may optionally also be specified).

No, the distinction between arrays with a runtime size (e.g. the built-in Array type) and a static/compile-time size (StaticArrays) is a semantic choice that has nothing to do with whether it is implemented in pure Julia. We don’t want Array to have a static size (part of the type), because that severely limits what you can do with it. StaticArrays are great but are much more specialized.
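
A small sketch of the runtime-size semantics described here, using only Base: the number of dimensions is part of the type, while the size itself is not, so arrays of different shapes share one type and a vector can grow without changing type.

```julia
v = collect(1.0:6.0)        # Vector{Float64} with 6 elements

a = reshape(v, 2, 3)        # 2×3 Matrix{Float64}
b = reshape(v, 3, 2)        # 3×2 Matrix{Float64}

# Different sizes, same concrete type: the size is a runtime property.
same_type = typeof(a) == typeof(b)

# The dimensionality, by contrast, can be read off the type alone.
n = ndims(typeof(a))

# A separate vector can grow in place; its type never changes.
w = Float64[]
push!(w, 1.0)
push!(w, 2.0)
```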

Thanks, I see. I had given myself the impression that, as multidimensional arrays’ size is immutable, it might be something that’d be desirable to track in the type system or dispatch on. But I am a dilettante here.

It is stored in the type system, but we also store it in the datatype (because it is faster to get that way and the sizes work out to make it basically free).

If you are interested in viewing Julia’s C array structures, I recently wrote a wrapper in Undefs.jl:

Here’s a demonstration.

julia> using Undefs: JLArray

julia> A = Array{Int}(undef, 5, 6)
5×6 Matrix{Int64}:
 0  0  0  0  0  0
 0  0  0  0  0  0
 0  0  0  0  0  0
 0  0  0  0  0  0
 0  0  0  0  0  0

julia> jla = JLArray(A)
JLArray{Nothing}:
   data: Ptr{Nothing} @0x00007f8300821800
 length: 30
  flags: 1000100000001000
        how: 0 (data is inlined, or a foreign pointer we don't manage)
      ndims: 2
     pooled: true
   ptrarray: false
   isshared: false
  isaligned: true
 elsize: 8
 offset: 0
  nrows: 5
  ncols: 6
  other: nothing

julia> B = vec(A);

julia> jla_B = JLArray(B)
JLArray{Nothing}:
   data: Ptr{Nothing} @0x00007f8300821800
 length: 30
  flags: 1100100000000111
        how: 3 (has a pointer to the object that owns the data)
      ndims: 1
     pooled: true
   ptrarray: false
   isshared: true
  isaligned: true
 elsize: 8
 offset: 0
  nrows: 30
maxsize: 30
  other: nothing

julia> jla = JLArray(A)
JLArray{Nothing}:
   data: Ptr{Nothing} @0x00007f8300821800
 length: 30
  flags: 1100100000001000
        how: 0 (data is inlined, or a foreign pointer we don't manage)
      ndims: 2
     pooled: true
   ptrarray: false
   isshared: true
  isaligned: true
 elsize: 8
 offset: 0
  nrows: 5
  ncols: 6
  other: nothing
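
The sharing that the `isshared` flag reports above can also be observed with Base alone, without the wrapper. The sketch below assumes nothing beyond `vec` and `pointer`: `vec(A)` reuses the memory of `A`, so both arrays report the same data pointer and a write through one is visible through the other.

```julia
A = zeros(Int, 5, 6)
B = vec(A)                  # 30-element Vector{Int} over the same memory

# Both arrays point at the same underlying data.
shares_data = pointer(A) == pointer(B)

# Writing through B is visible through A.
B[1] = 42
first_of_A = A[1, 1]
```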
