In my research, I often find myself incrementally creating large graphs or trees where each node has some properties that might get updated later.
Semantically, it makes the most sense for each node to be represented by a mutable object. But, since mutables are heap-allocated, this is very slow. What I want to do is to set aside a big chunk of memory for these nodes (essentially another stack). Then, when I create a new one, it should be trivial to find space to allocate it at the end of the occupied portion of this chunk (also perhaps there should be special gc behavior).
Is this possible now? Is it a change that could be added? (As far as I can tell, it would not change any semantics, except right where the object is constructed)
Another way of saying this is that I want a single special source vector, where the mutable objects are actually stored contiguously in memory like immutables are, but any other reference to them is just a pointer.
(Currently I use an Int to represent each node and a vector for each of the properties (IIUC what MetaGraphs does), but this results in much harder-to-read code)
What you request is basically inline-stored mutable struct, which is currently not officially supported by Julia.
Before you try to design your own memory pool in Julia, note that Julia already has it’s own memory pool, so if you believe that having your own pool might make allocation (much) faster, you might be disappointed, unless your allocation has some special patterns.
But if you just want inline-stored mutable, you can twist pointers in Julia, just like in C++/C (Warning: this is as dangerous as in C++, before you copy the following codes, wall through them carefully). Let’s assume that you will to implement a linked list, which is essentially:
mutable struct Node
x::Int
next::Node
end
To store it in a contiguously, try following code:
Firstly we define a immutable struct Node. It will be stored inline in the array. Then we define a MNode, which is essentially a pointer to Node.
# wrap Node as a mutable struct
struct MNode
ptr::Ptr{Nothing}
end
struct Node
x::Int
next::MNode
end
then define getter and setter to make life easier. So when we mutate or get the field of MNode, we are mutating or get the field of the content (that is, Node) it points to.
# delegate the getproperty to underlying Node
function Base.getproperty(p::MNode,i::Symbol)
n = Base.unsafe_load(Base.unsafe_convert(Ptr{Node},getfield(p,:ptr)))
getproperty(n,i)
end
# delegate the setproperty to underlying Node
@generated function _setproperty!(p::MNode,::Val{N},x::V) where {N,V}
T = Node
for i = 1:fieldcount(T)
if fieldname(T,i) == N
if fieldtype(T,i) != V
return :(error("Type $T has a field named $N, but type of it is not $V"))
else
off = fieldoffset(T,i)
return :(Base.unsafe_store!(Base.unsafe_convert(Ptr{$V},getfield(p,:ptr)+$off),x,1))
end
end
end
:(error("Type $T has no field named $N"))
end
function Base.setproperty!(p::MNode,name::Symbol,i) where T
_setproperty!(p,Val(name),i)
end
Finally, build an allocator, which is essentially a stack of Node. When we allocate from the allocator, we return a MNode which points to the Node. Be careful of deallocate!, it does nothing more than substrate the stack pointer! (Note, if you deallocate element in arbitrary order, you need to implement your own memory allocator)
mutable struct NodeFactor
# don't resize v! since it will cause element reallocating
v::Vector{Node}
stackhead::Int
size::Int
ptr::Ptr{Node}
function NodeFactor(c::Int)
# create a large vector
v = Vector{Node}(undef,c)
# get the underlying pointer of vector
ptr = Base.unsafe_convert(Ptr{Node},v)
return new(v,0,c,ptr)
end
end
function allocate!(n::NodeFactor)
if n.stackhead > n.size
error("run out of space")
end
# return new pointer
newptr = n.ptr
# move stack pointer to next place
n.stackhead += 1
# update pointer to next place
n.ptr += sizeof(Node)
return MNode(newptr)
end
function deallocate!(n::NodeFactor)
if n.stackhead == 0
error("double free!")
end
n.stackhead -= 1
n.ptr -= sizeof(Node)
return
end
All that is very interesting and instructive, thanks. But maybe it is important to mention to the OP that in this example, an perhaps in his example as well, it is much easier to just let this Node to be immutable, store an array of those and just replace the entire node instead of mutating it.
I would suggest posting more details about the structure of the problem, to obtain more specific suggestions.
An immutable struct containing a mutable-static vector maybe, with overloaded getindex, or getter functions can be practical, maybe.
struct Node
index::Int
properties::MVector{2, Float64}
end
myfirstprop(n::Node) = n.properties[1]
etc
Then MVetor is again allocated on heap. This is not working.
The core problem here is that, each mutable struct has their own identity, so if their lifetime is undetermined, that is, longer than a function call, they must be managed by GC and allocated on heap individually, or you manage memory manually.
Trying to learn here. What would be the disadvantage of a vector of immutable structs? Only if the number of fields is very large I see a possible problem, otherwise just replacing the “nodes” in the vector by new ones seems good enough, isn’t it?
(There are some packages like StrideArrays.jl that appear to aid the management of large mutable arrays in the heap, but I don’t know much about them)
Syntactically, this is very close to what I am looking for - thank you! But I think overloading setproperty is somewhat misleading and there is a lot of overhead for creating new types. If I allow myself to overload setproperty, there are other solutions that don’t involve pointers. What I really want is something to work like this:
mutable struct MyNode
x::Int
next::MyNode
end
p = SpecialPool(MyNode, 1000)
n = @alloc_hint p MyNode(1)
Yeah, you may be right. It seems like a big block of objects of the same type that all get deleted at the same time would be a pattern that could benefit. I do not know enough about garbage collection to know if this would actually help though.
Dose this package actually work for mutable types? I don’t think so. Mutable types have different memory layouts. Use Ptr{MutableTypes} won’t work as you might expected…
Also, grow is implemented incorrectly. It doesn’t even try to free the memory. As long as you try to implement your own grow function, you can quickly see why mutable types should not be placed inline in array. If they are placed inline, you can’t grow the array by simply replacing the old one by allocating a new one, because you can’t move the mutable types to the new array (unless there is no other values holding the pointer to them, but how can you know that). You can copy immutable values but not mutable ones.
Instead, you should keep pages of memory. When you grow the allocators, you allocate a new page if all pages are occupied. However this will also cause problems, because you need to scan the pages to determine whether there are free pages. How to implement a fast and correct one allocators is not easy…