Hello,
I’m using `fetch` to communicate large arrays between nodes on a compute cluster. This results in more memory allocation than I would like. Is there an in-place version of `fetch` that uses pre-allocated memory? Here is an example to illustrate:
```julia
addprocs(2)

function main()
    f1 = @spawnat workers()[1] rand(1000)
    f2 = @spawnat workers()[2] rand(1000)
    x = zeros(1000)
    x[:] = fetch(f1)  # I would like to do something like fetch!(f1, x)
    y = sum(x)
    x[:] = fetch(f2)
    z = sum(x)
    @show y, z
end

main()
```
If I understand correctly, both calls to `fetch` will allocate memory for an array the same size as `x`, and that memory is then copied from the `Future` into `x`. This extra allocation and copy is what I want to avoid.
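For what it’s worth, the allocation is easy to see with `@time` (a rough illustration rather than a careful benchmark):

```julia
f = @spawnat workers()[1] rand(1000)
wait(f)          # make sure the remote work is done first
@time fetch(f)   # allocates a fresh 1000-element Float64 array (~8 KB)
```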
Thanks for taking the time to read this, and for any ideas towards a solution!
Sam
This would be a good feature to have.
Currently I think you will need to manage the buffers and write your own `serialize`/`deserialize` methods. It is a bit simpler if the type and dimensions of the arrays are fixed - something like:
```julia
# Fixed element type and length; substitute your own TYPE and SIZE.
global const buffers = Vector{TYPE}[]

type FooVector
    arr::Vector{TYPE}
end

function Base.serialize(s::AbstractSerializer, data::FooVector)
    Base.Serializer.serialize_type(s, typeof(data))
    write(s.io, data.arr)    # raw bits straight to the stream
end

function Base.deserialize(s::AbstractSerializer, t::Type{FooVector})
    # Reuse a pooled buffer if one is available, otherwise allocate.
    buffer = isempty(buffers) ? Vector{TYPE}(SIZE) : pop!(buffers)
    readbytes!(s.io, reinterpret(UInt8, buffer))
    return FooVector(buffer)
end
```
Your code should wrap and send vectors as `FooVector` objects.
Also, either install a finalizer that `push!`es the `FooVector.arr` back to `buffers` when it is no longer needed, or do it manually - see the sketch below.
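A simplified sketch of how the pieces might fit together (the `recycle!` helper is made up, and `TYPE`/`SIZE` are the fixed parameters from above):

```julia
# Return a wrapper's buffer to the pool for reuse by later deserializations.
recycle!(fv::FooVector) = push!(buffers, fv.arr)

function main()
    # Workers send FooVector so the custom serializer handles the wire format.
    f1 = @spawnat workers()[1] FooVector(rand(SIZE))
    fv = fetch(f1)    # deserialized into a pooled buffer
    y = sum(fv.arr)
    recycle!(fv)      # or: finalizer(fv, recycle!) to recycle on GC
    return y
end
```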
For a more generic implementation (any type/shape of bitstype arrays) you should serialize type/shape/size information and handle it appropriately. The functions for serializing and deserializing arrays in `base/serialize.jl` will give you an idea of the various cases to handle.
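As a rough sketch of that generic variant (the names `AnyArray`, `pool`, and `get_buffer` are made up, and only isbits element types are handled):

```julia
type AnyArray{T,N}
    arr::Array{T,N}
end

const pool = Dict{Any,Vector{Any}}()   # buffers keyed by (eltype, dims)

function get_buffer{T,N}(::Type{T}, dims::NTuple{N,Int})
    bufs = get!(pool, (T, dims), Any[])
    return (isempty(bufs) ? Array{T}(dims) : pop!(bufs))::Array{T,N}
end

function Base.serialize{T,N}(s::AbstractSerializer, data::AnyArray{T,N})
    Base.Serializer.serialize_type(s, typeof(data))
    serialize(s, size(data.arr))   # shape travels ahead of the raw bits
    write(s.io, data.arr)
end

function Base.deserialize{T,N}(s::AbstractSerializer, ::Type{AnyArray{T,N}})
    dims = deserialize(s)::NTuple{N,Int}
    buffer = get_buffer(T, dims)   # pooled buffer matching type and shape
    readbytes!(s.io, reinterpret(UInt8, vec(buffer)))
    return AnyArray{T,N}(buffer)
end
```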