I am trying to understand the definitions of “multiprocessors” and “multicores” and it seems different sources have different definitions. I am mainly confused by whether the terms are defined by the numbers of physical CPUs, physical cores, or logical cores. For instance, in Julia’s multi-processing document, it says “Most modern computers possess more than one CPU,…” I assumed it means one physical CPU (one socket) with multiple cores, but I am not sure.
Let me use an example to illustrate. Suppose there are two personal computers, each of which has only one CPU socket.
Computer A: A quad-core i9 with hyper-threading (HT) technology; 4 physical cores in the die, 8 logical cores in total.
Computer B: A hypothetical single-core Intel CPU with HT; 1 physical core in the die, 2 logical cores in total.
Question 1: Which one is a multiprocessor computer?
(1) Only A, because “processor” here is defined on physical cores.
(2) Both A and B, because “processor” could be defined on logical cores.
(3) None of them, because “processor” here is defined on physical CPU.
Question 2: Which one is a multicore computer?
(1) Only A, because here the “core” is defined on physical cores.
(2) Both A and B, because the “core” is defined on logical cores.
If both answers are (2), are there meaningful differences between the terms “multiprocessor” and “multicore”?
Even if you were supplied supposed definitions of these terms, it would do you no good, because people use them routinely to mean anything they want, without defining what they mean, as you’re discovering. You have to guess the intended meaning from the context…and then you’ll find that the usage shifts from sentence to sentence in any case.
Fortunately, it often doesn’t matter much when learning how to use Julia’s parallel processing utilities, as they work transparently over logical and physical cores and actual multiple processors.
Thanks for the reply. It seems there is no clear consensus on how the terms should be used. I guess with the rapid development in hardware and software, definitions of the terms are shifting and sometimes needing clarification. Still, the situation causes confusion to beginners who try to understand what people are talking about.
Multiprocessor is an older term, and applied to machines that had multiple single-core packages. I don’t know when it was first used, but here are documents from the 80s that use it:
Thanks for the information. It provides a useful angle to view these terms.
Though “multiprocessor” is old and outdated, “multiprocessing” is very much in trend. Making a false connection between the two confused me for a while.
This is further confused by “hyperthreading”, a technology in which a single CPU core can execute multiple concurrent(ish) threads of execution, thereby hopefully achieving better performance. Unfortunately, it’s not uncommon for compute-intensive workloads to get slower when hyperthreading is enabled (which it typically is by default). We’ve found that Julia does surprisingly well with hyperthreading (anecdotal; I would love better data), so you need to measure to see which works better for you.
Generally, “cpus”, “processors” or “physical cores” mean the same thing and refer to the number of non-hyperthreading CPU cores that are available. This number can be a little hard to discover programmatically, unfortunately. On the other hand, “hardware threads” and “logical cores” refer to the number of concurrent execution contexts with hyperthreading.
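As a small illustration of the point about discovering these numbers programmatically, here is a minimal sketch using only Julia's standard library. It reports the logical core count and the number of threads the current session was started with. As far as I know, Base does not report the physical core count directly; a hardware-topology package such as Hwloc.jl is one option for that, which I mention only as a pointer and do not use here.

```julia
# Logical cores (hardware threads) visible to Julia, hyperthreads included.
logical = Sys.CPU_THREADS

# Threads this Julia session was started with
# (controlled by --threads or the JULIA_NUM_THREADS environment variable).
julia_threads = Threads.nthreads()

println("logical cores: ", logical)
println("Julia threads: ", julia_threads)
```

On a machine like Computer A above, the first number would be expected to be 8 rather than 4, since hyperthreads are counted.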
Regarding “multiprocessing” versus “multicore”, sometimes those terms are used to distinguish whether multiple cores/threads are utilized by spawning multiple communicating processes—multiprocessing, typically one thread per process—or by using multiple threads in the same process—multicore, typically using all available cores in a single process.
Thank you; that’s helpful. It shows me that these terms are used not only to describe hardware (my original question), but also to describe how work is done with the hardware in parallel computing. In the latter context, multiprocessing is used to mean (more or less) the same thing as distributed computing, and multicore means multithreading (or multi-threaded) parallelism. Hope I got it right this time.
Multiprocessing may or may not be distributed, but my impression is that when people mean distributed they’ll usually say so. Absent additional context I would guess that “multiprocessing” means multiple processes on a single shared memory system.
Would you be willing to elaborate on this a bit? I guess that the second part of your sentence implies that “it depends”, however, maybe there are some generalizations that you could share based on your experience? Also I am wondering are there any particular differences worth mentioning related to multi-threading vs multi-processing w.r.t. hyperthreading?
Absent additional context I would guess that “multiprocessing” means multiple processes on a single shared memory system.
It seems that in the context of parallel computing, multiprocessing and distributed computing may be used interchangeably, as shown in Julia’s documentation. I guess multiprocessing here means starting multiple instances (i.e., processes, by addprocs(n)) of Julia to do the work. If so, each one would have its own memory (hence distributed memory) even on a single PC.
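To make the "each process has its own memory" point concrete, here is a minimal sketch, assuming the script runs in `Main` on a single machine. The function name `compare_memory_models` is made up for illustration. It increments a counter once on a worker process and once from a task in the main process, and reports what the main process sees after each step.

```julia
using Distributed

# Hypothetical helper, not part of any library.
function compare_memory_models()
    pids = addprocs(1)          # one extra Julia process on this machine
    counter = Ref(0)            # lives in the main process

    # The closure and a copy of `counter` are serialized and sent to the worker.
    # The worker increments its own copy and returns the result.
    on_worker = remotecall_fetch(pids[1]) do
        counter[] += 1
        counter[]
    end
    after_process = counter[]   # the main-process value was never touched

    # A task in the same process shares memory, so its update is visible here.
    task = Threads.@spawn (counter[] += 1)
    wait(task)
    after_thread = counter[]

    rmprocs(pids)
    return on_worker, after_process, after_thread
end

on_worker, after_process, after_thread = compare_memory_models()
println("value computed on the worker:      ", on_worker)
println("main value after the remote call:  ", after_process)
println("main value after the local task:   ", after_thread)
```

The worker's increment stays in the worker's copy, while the task's increment changes the main process's counter. That is the practical difference between the two models, whether or not the processes happen to run on the same PC.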
Also I am wondering are there any particular differences worth mentioning related to multi-threading vs multi-processing w.r.t. hyperthreading?
I’ll give my 2 cents here. On a single machine, my understanding is that there is no difference. Suppose my CPU chip has m physical cores; that means 2*m logical cores with Intel’s Hyper-Threading. To use multi-threading (such as via Folds.jl), you could start a Julia instance with n threads (e.g., julia --threads=4) where n is not constrained by 2*m (could be larger or smaller). Similarly, using multi-processing via Distributed.jl, you could have addprocs(n) where n is not constrained by 2*m.
The optimal choice of n is a separate issue. My experience with multi-threading on a single machine is that n ≈ m often gives the best results. I am not sure about multi-processing, though, and I guess the answer also depends on the type of work to be parallelized.
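Since the best n has to be measured, here is a rough sketch of how one might do that. The workload `work` and the helper `chunked_sum` are made-up stand-ins, not from any package. Note the assumption: the thread count is fixed when Julia starts, so this varies the number of concurrent tasks instead, and the achievable parallelism is capped by `Threads.nthreads()`. To compare thread counts properly, rerun the script with different `--threads` values.

```julia
# CPU-bound toy workload (hypothetical stand-in for real work).
work(r) = sum(sin, r)

# Split 1:len into about `ntasks` chunks and process each chunk in its own task.
function chunked_sum(len::Int, ntasks::Int)
    chunks = Iterators.partition(1:len, cld(len, ntasks))
    tasks = [Threads.@spawn work(c) for c in chunks]
    return sum(fetch, tasks)
end

len = 2_000_000
chunked_sum(len, 1)  # warm-up run so compilation is not timed

for ntasks in (1, 2, 4, 8)
    t = @elapsed chunked_sum(len, ntasks)
    println("ntasks = ", ntasks, "  time = ", round(t; digits = 4), " s")
end
```

For serious measurements a benchmarking package would give more stable numbers than a single `@elapsed` call; this is only meant to show the shape of the experiment.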
If you’re running code with a lot of allocations, multiprocessing will currently scale better than multithreading.
If you see 80%+ GC times in threaded code and can’t easily cut down on allocations, you can try switching to multiprocessing.
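One way to check the GC share of a run is the `@timed` macro, which returns the elapsed time and the time spent in garbage collection. Below is a minimal sketch. The function `allocating_work` is a made-up, deliberately allocation-heavy workload, and the iteration counts are arbitrary.

```julia
# Hypothetical allocation-heavy workload: builds many short-lived arrays.
function allocating_work(iters::Int)
    total = 0.0
    for _ in 1:iters
        v = rand(1_000)      # fresh allocation on every iteration
        total += sum(v)
    end
    return total
end

allocating_work(10)  # warm-up so compilation is excluded from the timing

stats = @timed allocating_work(100_000)
gc_fraction = stats.gctime / stats.time

println("total time: ", round(stats.time; digits = 3), " s")
println("GC time:    ", round(stats.gctime; digits = 3), " s")
println("GC share:   ", round(100 * gc_fraction; digits = 1), " %")
```

Running the same measurement around your own threaded code gives the percentage to compare against the 80% rule of thumb above.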
This is a common problem when running on 18 cores, but shouldn’t be much of an issue with only 4.