How to parallelize within a function

Uh… I don’t think this is actually true yet :sweat_smile:. Some things (e.g., Array) work great, but for many others (e.g., sparse matrices) it is hard to reason about safety. Importantly, there’s no documentation on what is safe when.
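To illustrate the "works great" case, here is a minimal sketch assuming the concern is thread safety: each iteration writes to a different element of a plain Array, so the iterations never touch the same memory. The function name `threaded_squares` is hypothetical, and the pattern should not be assumed to carry over to other container types such as sparse matrices.

```julia
using Base.Threads

# Hypothetical helper: fill a dense Vector in parallel.
# Each iteration writes only to its own index, so no two tasks
# ever write to the same element.
function threaded_squares(n)
    out = Vector{Int}(undef, n)
    @threads for i in 1:n
        out[i] = i^2
    end
    return out
end

squares = threaded_squares(10)
```

Start Julia with more than one thread (e.g., `julia -t 4`) for the loop to actually run in parallel; with a single thread it still runs correctly, just sequentially.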

Sorry to nitpick, but I’d be careful about such a claim. For some definitions of “true threads”, one can argue Python has “more true” threads than Julia. For example, Python has a more transparent OS thread API than Julia; Julia only has tasks (and that’s kinda the point). But unfortunately for Python programmers, it was designed in the 90s, when threads for parallelism were not a thing (at least not for everyone), so its threads are useful mainly for I/O or GIL-releasing external code. On the other hand, Julia has “more true” threads than Python in another sense, if one defines “threads” as a synonym for shared-memory parallelism.

If you have a rough idea of the applicability of process-based parallelism in the problem you have, I think starting from your comfort zone sounds like a good idea.

Going back to the problem in the OP, it’s not a good idea to use @everywhere inside a function. It’s mainly for “static things” like using Package and include("script.jl"). You’d probably want to use remotecall here (and maybe iterate over the worker ids returned from workers()).
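Something like the following is what I have in mind. It is only a sketch, not a drop-in fix for the code in the OP: `square_on_worker` and `run_on_all_workers` are made-up names, and it assumes local worker processes started with `addprocs`. The `@everywhere` stays at top level for the "static" definition, and the function itself uses only `remotecall` and `fetch`.

```julia
using Distributed

# Assumption: a single machine; start two local workers if none exist yet.
nprocs() == 1 && addprocs(2)

# "Static" setup at top level: define the function on every process.
# Hypothetical example function returning the process id and the result.
@everywhere square_on_worker(x) = (myid(), x^2)

function run_on_all_workers(x)
    # One remote call per worker id; each returns a Future immediately.
    futures = [remotecall(square_on_worker, w, x) for w in workers()]
    # fetch blocks until the corresponding result is available.
    return [fetch(f) for f in futures]
end

results = run_on_all_workers(3)
```

Each element of `results` is a tuple of the worker id that ran the call and the computed value, in the order of `workers()`.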

Also, please quote your code; see “Please read: make it easier to help you”.
