Contiguous memory allocation for numerous instances of compound types


#1

I would like to allocate a large array of a particular type, such that all the data for the array is stored contiguously, i.e., the array stores no pointers.

E.g., suppose I have

type Foo
   a::Int64
   b::Int64
   c::Int8
end


v::Vector{Foo} = Array{Foo}(10000000)

I would like this to effectively allocate 10,000,000 inline Foo objects, such that the actual data for a Foo is stored at v[5] rather than a pointer to a separately allocated Foo. I.e., after the code above, I might just be able to write:

v[5].a=0
v[5].b=0
v[5].c=0

rather than

v[5] = Foo(0, 0, 0)

Is there some way I can achieve this effect? In the worst case, maybe there is an efficient hack I could use where I create my own primitive type and do some bit manipulation to extract a, b, and c. (Of course, that would not be ideal. I’d still be interested in the possibility.)

To be clear, there are 4 things I’m interested in, and would be interested in hearing about progress toward achieving any of them.

  1. Reducing memory consumption by not storing an array of pointers.
  2. Reducing memory consumption by allocating many compound types together rather than individually allocating, e.g., 10,000,000 Foos. (This may already happen at various levels by cool Julia internals, but I’m referring to the metadata that must be stored for allocated chunks of memory.)
  3. Improving efficiency by directly accessing data in the array rather than having to dereference another pointer.
  4. Maintaining the ability to use convenient syntax like v[5].a while doing the above (but I’d be interested in solutions that don’t allow that syntax too).

Thanks.


#2

Use immutable (struct on 0.6) and this will happen automatically.

You won’t be able to modify individual fields like this, but there is a PR to enable something equivalent.


#3

Immutables are value types which do exactly what you describe. The type parameter on the array tells it to compile with the values inlined. So as long as you make an array of immutables which is strictly typed, you’ll be a happy camper.
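
A minimal sketch of how one might check this. It assumes Julia 1.x syntax (`struct`, `Vector{Foo}(undef, n)`) rather than the 0.6 syntax used earlier in the thread, and the name `MutableFoo` is made up for the comparison:

```julia
# Immutable version of Foo from the first post.
struct Foo
    a::Int64
    b::Int64
    c::Int8
end

# Hypothetical mutable counterpart, for comparison only.
mutable struct MutableFoo
    a::Int64
    b::Int64
    c::Int8
end

n = 1_000
v = Vector{Foo}(undef, n)   # uninitialized, but storage for n Foos is reserved

# An immutable struct with only plain-data fields is a "bits type",
# so its values can be stored inline in the array.
println(isbitstype(Foo))          # true
println(isbitstype(MutableFoo))   # false

# The array's data buffer is exactly n inline elements (padding included).
println(sizeof(v) == n * sizeof(Foo))
```

The `sizeof` comparison only shows that the buffer holds whole `Foo` values; the element size includes whatever alignment padding the compiler adds.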


#4

A PR is a something request? Cool. (Yeah, it would be nice to have something mutable.)

Yeah, I guess I should start using “struct”. Weird that I don’t get any deprecation warnings (on 0.6).

The other thing that would be nice to avoid with immutables is that everything is copied when you pass them around, which is not good if they are very large. Maybe I want too much…

Thanks.


#5

(edit: I had an original question in this reply about structs with references to other structs of the same type. It turns out that if you have an array of self-referential, immutable structs, the array gets allocated as an array of references. In that case, it makes sense to just store an index to the item of interest. There could still be value in having objects that actually have recursive references though, because such objects might not all be in the same array.)

I’m still curious if there is any way to achieve any of what I want with mutable structs though.

Thanks.


#6

If you intend to put Foo objects in an array, you could store the index of the next Foo object instead of a pointer to the next object.

(edit: looks like you came to the same conclusion)
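
A rough sketch of the index idea, assuming Julia 1.x syntax. The `Node` type, its field names, the use of `0` as an "end of list" marker, and `sum_list` are all made up for illustration:

```julia
# Each node stores the index of its successor in the same vector
# instead of a reference to another Node, so Node stays a plain bits type.
struct Node
    value::Int64
    next::Int      # index into the containing vector; 0 means "no next node"
end

# Three nodes linked 1 -> 2 -> 3.
nodes = [Node(10, 2), Node(20, 3), Node(30, 0)]

# Walk the chain starting at index `start` and add up the values.
function sum_list(nodes::Vector{Node}, start::Int)
    total = 0
    i = start
    while i != 0
        total += nodes[i].value
        i = nodes[i].next
    end
    return total
end

println(sum_list(nodes, 1))   # 10 + 20 + 30 = 60
```

The obvious limitation is the one mentioned above: an index only makes sense relative to one particular array.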


#7

Yeah, good call. Thanks adamsic.


#8

I’m not an expert, but I don’t think it’s true that immutables are always copied when you pass them around. Making a type immutable gives the compiler the flexibility to copy or not copy, while a mutable can’t be copied and needs to be passed by reference. Immutables are usually pretty small though, so I think the compiler would typically copy them onto the stack.

Moreover, a Vector{Foo} is an array of immutables, but the array itself is mutable and wouldn’t be copied when you pass it to a function. It’s just like a Vector{Int64}, but each element has space for the three integer fields. You can replace any individual Foo with a new instance, but you have to replace the entire Foo, not an individual field of it.
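
Something like the following illustrates both points, assuming Julia 1.x syntax and an immutable `Foo`. The function name `reset!` is made up:

```julia
struct Foo
    a::Int64
    b::Int64
    c::Int8
end

# The vector is passed by reference, so the caller sees the change.
# The element itself is replaced as a whole with a new Foo.
function reset!(v::Vector{Foo}, i::Int)
    v[i] = Foo(0, 0, 0)
    return v
end

v = [Foo(1, 2, 3) for _ in 1:4]
reset!(v, 2)

println(v[2] == Foo(0, 0, 0))   # true: element 2 was replaced in the caller's array
println(v[1] == Foo(1, 2, 3))   # true: other elements are untouched
```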


#9

I wrote a macro package, ModifyField.jl (on my GitHub page), that macro-expands

   @modify_field! v[5].a = 0

into

   v[5] = Foo(0, v[5].b, v[5].c)

This way, you can use immutables and still have the cleaner non-immutable syntax for modifying fields. However, I have not updated this package since about Julia version 0.4.1 because I’m not sure anyone is actually using it, so it may not work any more.
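
For anyone who wants the same effect without a macro, here is a plain-function sketch in Julia 1.x syntax. It is not the ModifyField.jl implementation; the helper name `with_field` is made up, and it assumes the type has a default constructor that takes all fields in order:

```julia
struct Foo
    a::Int64
    b::Int64
    c::Int8
end

# Hypothetical helper: build a new value of the same type as `x`,
# with the field called `name` replaced by `value` and all others copied.
function with_field(x::T, name::Symbol, value) where {T}
    vals = ntuple(i -> fieldname(T, i) === name ? value : getfield(x, i),
                  fieldcount(T))
    return T(vals...)
end

v = [Foo(1, 2, 3) for _ in 1:10]

# Equivalent to v[5] = Foo(0, v[5].b, v[5].c)
v[5] = with_field(v[5], :a, 0)

println(v[5] == Foo(0, 2, 3))   # true
```

A macro can do the field lookup at expansion time, so this function version may well be slower unless the compiler manages to constant-fold the field name.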