How to avoid repeatedly shipping a large object to remote workers

I’ve got a large dataset object which is very slow and memory-intensive to serialize/deserialize when shipping to remote workers, so I’d like to ship it just once. However, this object gets used by a library function foo that I don’t control, which looks like:

function foo(pool, dataset)
    pmap(pool, 1:10) do i
        # do stuff with dataset
    end
    pmap(pool, 1:10) do i
        # do other stuff with dataset
    end
end

So even if I use a CachingPool, the dataset gets sent twice: the two pmap calls create different closures, and CachingPool only caches per closure. Is there any way to make this work?
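A minimal, runnable sketch of the behavior described above (the data and the closure bodies are placeholders):

```julia
using Distributed
addprocs(2)

data = rand(10^6)                 # stand-in for the expensive dataset
pool = CachingPool(workers())

# First pmap: this closure, together with the captured `data`, is
# serialized and cached once per worker.
r1 = pmap(pool, 1:10) do i
    sum(data) + i
end

# Second pmap: a *different* closure object that also captures `data`.
# CachingPool keys its cache on the closure, so `data` is shipped again.
r2 = pmap(pool, 1:10) do i
    sum(data) * i
end
```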

If this were happening at global scope, Julia’s automatic shipping of globals to workers would be exactly what I need, but unfortunately this happens inside a function.
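One possible direction, sketched below with entirely made-up names (DATASET_STORE, store_dataset!, get_dataset): ship the dataset once per worker into a worker-local global, and have closures capture only a small key. This only helps if foo can be handed something that accesses the data through such an accessor rather than the raw object, so it is a sketch of the idea, not a drop-in fix.

```julia
using Distributed
addprocs(2)

# Worker-local store; defined on every process. All names are hypothetical.
@everywhere begin
    const DATASET_STORE = Dict{Symbol,Any}()
    store_dataset!(key, value) = (DATASET_STORE[key] = value; nothing)
    get_dataset(key) = DATASET_STORE[key]
end

dataset = rand(10^6)              # stand-in for the expensive object

# One explicit serialization per worker, done up front:
for p in workers()
    remotecall_fetch(store_dataset!, p, :mydata, dataset)
end

# Closures now capture only the key :mydata, so repeated pmap calls
# no longer re-ship the dataset itself.
pool = WorkerPool(workers())
r = pmap(pool, 1:10) do i
    sum(get_dataset(:mydata)) + i
end
```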

Probably not a great solution, but I once solved this by writing a custom serialization, which was fast. I found this question so interesting that I wrote a small section about it in the notes for our lecture: Lecture · Scientific Programming in Julia
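For readers unfamiliar with the custom-serialization route mentioned here, the Serialization stdlib lets you overload its hooks for your own type. The type and wire format below are invented for illustration; the reply does not specify its actual implementation.

```julia
using Serialization

# Hypothetical dataset type with a compact, hand-written wire format.
struct BigDataset
    values::Vector{Float64}
end

# Write a raw length + raw bytes instead of the generic object-graph walk.
function Serialization.serialize(s::AbstractSerializer, d::BigDataset)
    Serialization.serialize_type(s, BigDataset)
    write(s.io, length(d.values))
    write(s.io, d.values)
end

function Serialization.deserialize(s::AbstractSerializer, ::Type{BigDataset})
    n = read(s.io, Int)
    buf = Vector{Float64}(undef, n)
    read!(s.io, buf)
    BigDataset(buf)
end

# Round trip through an in-memory buffer:
io = IOBuffer()
serialize(io, BigDataset(collect(1.0:5.0)))
seekstart(io)
d = deserialize(io)
```

The same hooks are what Distributed uses when shipping closures, so a fast custom format speeds up every transfer of the object, even if it cannot avoid repeated sends on its own.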