Understanding distributed computing taxonomy

Hi,

I’m taking my first steps in distributed computing and have been going through the distributed computing page in the Julia documentation. I’ve also done a bit of research into parallel computing in general.

The main barrier I have in understanding the Julia section on multi-processing is the taxonomy in the text. I get the central point, expressed in the second paragraph: if you have N calculations and X “computers” (here “computer” might mean a CPU, for example), you might be able to decrease runtime by assigning N/X calculations to each “computer”.

The trouble starts when words like “process”, “worker”, and “task” start popping up. I have no idea what the difference is between a “worker” and a “process”, nor how each of these relates to the hardware (CPU, CPU core, or logical processor). I have read intros to parallel computing more generally, but apparently the definitions there can be fluid and depend on which language you are discussing.

A more concrete example of my predicament is: what exactly does the addprocs function do? My computer has 6 CPU cores and 12 logical processors. When I do addprocs(5), am I telling Julia to set up a network of 6 CPU cores? What is the difference between the info given by workers() and nprocs()?

Thanks


Workers are the processes that do the work. They are a subset of all processes, which are listed by procs() (nprocs() returns only how many there are):

julia> using Distributed

julia> procs()
1-element Array{Int64,1}:
 1

julia> addprocs(2)
2-element Array{Int64,1}:
 2
 3

julia> procs()
3-element Array{Int64,1}:
 1
 2
 3

julia> workers()
2-element Array{Int64,1}:
 2
 3

So here process 1 is the “main” process that distributes the work to the worker processes 2 and 3.
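To tie this back to the hardware part of the question, here is a minimal sketch, assuming a local machine and the default local cluster manager. addprocs(n) starts n additional Julia processes; it does not reserve or pin CPU cores, and the operating system decides which core each process runs on. The choice of 2 workers and the squaring function are only for illustration.

```julia
using Distributed

# Start two local worker processes, unless some were already added.
nprocs() == 1 && addprocs(2)

# nprocs() counts every process (main + workers); nworkers() counts only workers.
println("nprocs()   = ", nprocs())
println("nworkers() = ", nworkers())
println("procs()    = ", procs())
println("workers()  = ", workers())

# Number of logical processors Julia sees on this machine.
# This is independent of how many worker processes were started.
println("Sys.CPU_THREADS = ", Sys.CPU_THREADS)

# Each worker is a separate operating-system process with its own pid.
for w in workers()
    id  = remotecall_fetch(myid, w)    # Julia process id, as listed by workers()
    pid = remotecall_fetch(getpid, w)  # OS-level process id
    println("worker ", id, " runs as OS process ", pid)
end

# pmap hands the individual calls out to the worker processes.
squares = pmap(x -> x^2, 1:6)
println(squares)  # [1, 4, 9, 16, 25, 36]
```

So with 6 cores, addprocs(5) gives you 6 Julia processes in total (1 main + 5 workers), which the OS can spread over the cores, but nothing in Julia ties a given worker to a given core.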
