How to distribute the definitions of preexisting functions to workers?

I think I have more or less defined my problem in the title. After extensive searching I couldn’t find a solution, so maybe someone can help me here. In a nutshell: I have some functions that are already defined well before I start any parallel computation. I would like to evaluate these functions with Distributed on a number of workers, but it is quite painful to collect all the definitions manually and broadcast them again using @everywhere. This is what I am doing now, but I would be really glad if there were a proper solution, where one could distribute the definitions of functions the way one can distribute variables with @everywhere x=$x. If you know Mathematica, what I need here is the equivalent of DistributeDefinitions[]. Is there a user-friendly solution to this, or is this functionality at least on the roadmap? Thank you in advance for your help.

Put them in a Module/package.
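For example, the module workflow can look like the following minimal sketch (the module name `MyDefs` and the function `f` are made up for illustration; here the module is written to a temporary file just to keep the example self-contained, while in practice you would maintain the file, or a full package, by hand):

```julia
using Distributed
addprocs(2)                          # start two worker processes

# Hypothetical module holding the shared definitions.
path = joinpath(mktempdir(), "MyDefs.jl")
write(path, """
module MyDefs
export f
f(x) = x^2
end
""")

@everywhere include($path)           # define the module on every process
@everywhere using .MyDefs            # bring the exported names into scope

pmap(f, 1:4)                         # f now runs on the workers
```

Note that `@everywhere` supports `$`-interpolation, which is how the local `path` reaches the workers.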


Thank you, I thought of this, but then I still need to manually find all the definitions (some of which may change between two parallel calculations) and copy-paste them into a separate file that I can include on the workers. What I would consider a solution is something where I could just set up a list of the names of the functions I need; then, on demand, after changing some of these definitions, I could start the parallel workers and broadcast all the function definitions to them with a single command on that list of names. At the moment, I cannot even see how I could get the source code of the functions I define in a given session so that I could export them to a .jl file (code_typed() is not really useful here). Not that this would be a very elegant solution, but then one could at least write a script that collects all the source code and saves it in a file that gets included later on the workers. Now I either run the workers non-stop to be able to define all my functions with @everywhere, or I constantly keep tabs on which ones I change so I can update them manually every time I need to run a parallel calculation. I can live with this, but it is disappointing that there is no automated way to do this. It is basically a straightforward copy-paste, which is time-consuming, annoying and error-prone for a human but super easy for the machine.

I have just found this Sugar package, maybe that can help to extract the source code. I’ll have a look…

There is an idiomatic solution: use a module. What you are looking for does not exist because people use modules instead, so yes, by avoiding them you are making this error-prone and difficult for yourself.

You should not copy-paste; you should write the code in a module in the first place. And if some definitions change between two parallel calculations, then just use version control, and ideally tag versions of the package.

OK, fine, thanks, I see now what you mean. It’s just that I had been working with Mathematica for many years, and I still have to get used to some vastly different concepts. At the moment I cannot judge which one is better; probably both have advantages and drawbacks, but they certainly take very different approaches, and this needs some getting used to when switching from one to the other. Anyway, thanks again, I’ll see how I can manage with modules then.

Maybe the README of ParallelDataTransfer.jl (ChrisRackauckas/ParallelDataTransfer.jl on GitHub) can be of assistance. At least the code of that package could provide some guidance.
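For reference, that package builds on patterns like the following sketch, which uses only the Distributed standard library (the variable name is made up):

```julia
using Distributed
addprocs(2)

x = 42
@everywhere x = $x   # interpolate the master's value of x into every worker's Main

# Check that the value arrived on a worker.
remotecall_fetch(() -> x, first(workers()))
```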

Many thanks for this, I’ll have a look at that.

But otherwise, now that I see that the solution I was looking for is not the mainstream approach in this language and is not really supported, I would rather give up hacking it together and take the path of least resistance. I will just adopt the Julia way of thinking and switch to using modules, if that is how people do it here.

See also: Easy way to send custom function to distributed workers? - #2 by greg_plowman
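One way to do what that answer describes is to keep each definition as a quoted expression and evaluate it on every process; a minimal sketch (not necessarily the exact code from the linked post, and `double` is a made-up example function):

```julia
using Distributed
addprocs(2)

# Keep the definition as a quoted expression so it can be (re)sent on demand.
defn = quote
    double(x) = 2x
end

# Evaluate the expression in Main on every process, master included.
for p in procs()
    remotecall_fetch(Core.eval, p, Main, defn)
end

remotecall_fetch(double, first(workers()), 21)   # the workers can now call it
```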


Thank you, this actually works! Although after broadcasting the definition once, it doesn’t seem to update it any more. So if I change my function, the definition remains the same on the workers even if I execute those lines again. But thanks anyway.

This is expected: sending the definition establishes no link between the master and the workers. The same goes for @everywhere, which runs code on all workers; the code runs once, and no mechanism for propagating later updates is set up.
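To illustrate: re-running @everywhere with the edited body does redefine the function on the workers; it just has to be done by hand each time (a minimal sketch):

```julia
using Distributed
addprocs(2)

@everywhere f(x) = x + 1
remotecall_fetch(f, first(workers()), 1)   # returns 2

# Nothing is propagated automatically, so after an edit
# the definition simply has to be broadcast again:
@everywhere f(x) = x + 100
remotecall_fetch(f, first(workers()), 1)   # returns 101
```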


I wrote a short technical report for my coworkers to get started with distributed computing in Julia, you might find it useful: Parallel computing in Julia : Case study from Dept. Automatic Control, Lund University | Lund University Publications


Excellent! Thank you very much. I have saved your report for later reference.