How to distribute the definitions of preexisting functions to workers?

I think I have more or less defined my problem in the title. After extensive searching I couldn’t find a solution, so maybe someone can help me here. In a nutshell: I have some functions that are already defined well before I start any parallel computation. I would like to evaluate these functions with Distributed on a number of workers, but it is quite painful to collect all the definitions manually and broadcast them again using @everywhere. This is what I am doing now, but I would be really glad if there were a proper solution, where one could distribute the definitions of functions the way one can distribute variables with @everywhere x=$x. If you know Mathematica, what I need here is the equivalent of DistributeDefinitions[]. Is there a user-friendly solution to this, or is this functionality at least on the roadmap? Thank you in advance for your help.

Put them in a Module/package.
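For example, the module workflow can look like the following minimal sketch (the module name `MyDefs` and the function `f` are made up for illustration; here the module is written to a temporary file just to keep the example self-contained, while in practice you would maintain the file, or a full package, by hand):

```julia
using Distributed
addprocs(2)                          # start two worker processes

# Hypothetical module holding the shared definitions.
path = joinpath(mktempdir(), "MyDefs.jl")
write(path, """
module MyDefs
export f
f(x) = x^2
end
""")

@everywhere include($path)           # define the module on every process
@everywhere using .MyDefs            # bring the exported names into scope

pmap(f, 1:4)                         # f now runs on the workers
```

Note that `@everywhere` supports `$`-interpolation, which is how the local `path` reaches the workers.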


Thank you, I thought of this, but then I still need to manually find all the definitions (some of which may change between two parallel calculations) and copy-paste them into a separate file that I can include on the workers. What I would consider a solution is something where I could just set up a list of the names of the functions I need; then, on demand, after changing some of these definitions, I could start the parallel workers and broadcast all the function definitions to them with a single command on that list of names. At the moment, I cannot even see how I could get the source code of the functions I define in a given session so that I could export them to a .jl file (code_typed() is not really useful here). Not that this would be a very elegant solution, but then one could at least write a script that collects all the source code and saves it in a file that gets included later on the workers. Now I either run the workers non-stop to be able to define all my functions with @everywhere, or I constantly keep tabs on which ones I change so I can update them manually every time I need to run a parallel calculation. I can live with this, but it is disappointing that there is no automated way to do this. It is basically a straightforward copy-paste, which is time-consuming, annoying and error-prone for a human but super easy for the machine.

I have just found this Sugar package, maybe that can help to extract the source code. I’ll have a look…

There is an idiomatic solution: use a module. What you are looking for does not exist because people use modules instead, so yes, by avoiding them you are making this error-prone and difficult for yourself.

You should not copy-paste; you should write the code in a module in the first place. And if some definitions change between two parallel calculations, then just use version control, and ideally tag versions of the package.

OK, fine, thanks, I see now what you mean. It’s just that I had been working with Mathematica for many years, and I still have to get used to some vastly different concepts. At the moment I cannot judge which one is better; probably both have advantages and drawbacks, but they certainly take very different approaches, and this needs some getting used to when switching from one to the other. Anyway, thanks again, I’ll see how I can manage with modules then.

Maybe the README of ParallelDataTransfer.jl (ChrisRackauckas/ParallelDataTransfer.jl on GitHub) can be of assistance. At least the code of that package could provide some guidance.
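For reference, that package builds on patterns like the following sketch, which uses only the Distributed standard library (the variable name is made up):

```julia
using Distributed
addprocs(2)

x = 42
@everywhere x = $x   # interpolate the master's value of x into every worker's Main

# Check that the value arrived on a worker.
remotecall_fetch(() -> x, first(workers()))
```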

Many thanks for this, I’ll have a look at that.

But otherwise, now that I see that the solution I was looking for is not the mainstream approach in this language and is not really supported, I would rather give up hacking it together and take the path of least resistance. I will just adopt the Julia way of thinking and switch to using modules, if that is how people do it here.

See also: Easy way to send custom function to distributed workers? - #2 by greg_plowman
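One way to do what that answer describes is to keep each definition as a quoted expression and evaluate it on every process; a minimal sketch (not necessarily the exact code from the linked post, and `double` is a made-up example function):

```julia
using Distributed
addprocs(2)

# Keep the definition as a quoted expression so it can be (re)sent on demand.
defn = quote
    double(x) = 2x
end

# Evaluate the expression in Main on every process, master included.
for p in procs()
    remotecall_fetch(Core.eval, p, Main, defn)
end

remotecall_fetch(double, first(workers()), 21)   # the workers can now call it
```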


Thank you, this actually works! Although after broadcasting the definition once, it doesn’t seem to update it any more. So if I change my function, the definition remains the same on the workers even if I execute those lines again. But thanks anyway.

This is expected: sending the definition establishes no link between the master and the workers. The same goes for @everywhere, which runs code on all workers; the code runs once, and no mechanism for propagating later updates is set up.
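To illustrate: re-running @everywhere with the edited body does redefine the function on the workers; it just has to be done by hand each time (a minimal sketch):

```julia
using Distributed
addprocs(2)

@everywhere f(x) = x + 1
remotecall_fetch(f, first(workers()), 1)   # returns 2

# Nothing is propagated automatically, so after an edit
# the definition simply has to be broadcast again:
@everywhere f(x) = x + 100
remotecall_fetch(f, first(workers()), 1)   # returns 101
```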


I wrote a short technical report for my coworkers to get started with distributed computing in Julia, you might find it useful: Parallel computing in Julia : Case study from Dept. Automatic Control, Lund University | Lund University Publications


Excellent! Thank you very much. I have saved your report for later reference.