It has been very helpful to discuss Julia with all of you.
I noticed that when people here talk about avoiding global variables, many of them mention that avoiding globals also helps with thread safety.
So, if I understand correctly, is it easy in Julia to make code run the way it would with OpenMP, that is, with shared-memory parallelization?
My understanding is that OpenMP-style constructs are mainly helpful for parallelizing loops.
On the other hand, why not use MPI? If we are only using one CPU with multiple threads, perhaps MPI is about as fast as OpenMP.
I come from a quantum Monte Carlo background. Monte Carlo is called embarrassingly parallel because more than 99% of the run time can be parallelized: each worker works independently, and the workers only need to communicate occasionally so that data can be collected and block averages calculated.
So I am interested in how MPI works in Julia nowadays. Is it convenient to use MPI in Julia now? Are there projects written in Julia running on supercomputers? It seems the majority of them are still written in Fortran.
You can spawn threads that run on multiple cores without MPI. For example, if you’re on a quad-core CPU and running a CPU-bound task, you want Julia to use 4 threads, so that each thread uses one CPU core to its full capacity. (Although hyperthreading allows two threads per core, using it can be counterproductive when the task is CPU-bound.)
big_matrix = ...
Threads.@threads for r in regions
    # work on big_matrix[r] here; the regions must not overlap
end
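To make the non-overlapping idea concrete, here is a minimal sketch. The function name `scale_columns!`, the matrix size, and the column ranges are all made up for illustration; the only point is that each iteration writes to its own block of columns, so no locking is needed. Start Julia with something like `julia -t 4` to actually get several threads.

```julia
using Base.Threads

# Hypothetical helper: double every entry of `m`, one block of columns per iteration.
function scale_columns!(m::Matrix{Float64}, regions)
    @threads for r in regions
        # Each iteration only touches its own columns, so threads never collide.
        m[:, r] .*= 2.0
    end
    return m
end

big_matrix = ones(4, 8)            # illustrative data
regions = [1:2, 3:4, 5:6, 7:8]     # column ranges; no two ranges share a column
scale_columns!(big_matrix, regions)
```

Passing the matrix into a function instead of reading a global inside the loop also keeps the code type-stable, which ties back to the earlier discussion about avoiding globals.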
You can operate on non-overlapping parts of arrays like this, or, if you’re talking about embarrassingly parallel work:
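res = zeros(Float64, Threads.nthreads())  # one accumulator slot per thread
Threads.@threads :static for i in blah    # :static keeps each iteration on a single thread
    idx = Threads.threadid()
    # accumulate this iteration's result into res[idx]
end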
or something like mapreduce:
result = mapreduce(fetch, merge, [Threads.@spawn simulation(N) for i=1:10])
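For a Monte Carlo flavored version of the same pattern, here is a rough sketch. The `simulation` function below is a hypothetical stand-in (it estimates π from random points) and the per-task seed argument is my own addition so that every task gets an independent random stream; a real QMC code would look different. The reduction uses `+` rather than `merge`, because each task returns a single number here.

```julia
using Base.Threads: @spawn
using Random

# Hypothetical stand-in for `simulation(N)`: one independent block estimate of pi.
function simulation(N, seed)
    rng = MersenneTwister(seed)   # each task owns its RNG, so no shared state
    hits = 0
    for _ in 1:N
        x, y = rand(rng), rand(rng)
        hits += (x^2 + y^2 <= 1.0)
    end
    return 4 * hits / N
end

# Launch 10 independent blocks; the workers never talk to each other while running.
tasks = [@spawn simulation(100_000, seed) for seed in 1:10]

# The only "communication" step: collect the block results and average them.
estimate = mapreduce(fetch, +, tasks) / length(tasks)
```

This mirrors the block-average structure described in the question: the workers run independently and results are only combined at the end.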
Multithreading has a specific meaning in Julia that is different from multiprocessing; see Announcing composable multi-threaded parallelism in Julia. Since it allows you to share memory, I guess it’s similar to OpenMP.
I also strongly recommend this read: A quick introduction to data parallelism in Julia
I don’t think we’re going to stop using MPI anytime soon. There’s MPI.jl for running in a cluster environment that uses MPI, but there are also custom runners built on ClusterManagers.jl that interface with the cluster directly and rely only on Julia’s own multiprocessing capabilities (which work well and are less confusing to me than MPI). This also takes advantage of Julia native primitives like
For a small sample of the kinds of research done using Julia, I’d encourage you to check out Research as well as Case Study - Julia Computing (in particular I’d recommend checking out Celeste.jl, the project that landed Julia in the petaflop club).
In addition to all of the above, I’d also recommend checking out the sections on parallelism, threading, and distributed computing in the manual:
In summary, multithreading (OpenMP) and distributed computing (MPI) are different paradigms, each with different use cases, advantages, and disadvantages.
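To illustrate the distributed side of that contrast, here is a small sketch using the `Distributed` standard library rather than MPI.jl. The workload `block_average` and the choice of two local workers are assumptions for the example only; on a real cluster the workers would typically be added through a cluster manager instead of a plain `addprocs(2)`.

```julia
using Distributed

addprocs(2)   # assumption: two local worker processes, each with its own memory

# Hypothetical workload; @everywhere defines it on the main process and every worker.
@everywhere function block_average(block_id, n)
    s = 0.0
    for k in 1:n
        s += sin(block_id + k)^2
    end
    return s / n
end

# pmap ships each block to a worker process and collects the results.
averages = pmap(id -> block_average(id, 10_000), 1:8)
overall = sum(averages) / length(averages)

rmprocs(workers())   # shut the workers down again
```

Unlike the threaded examples, nothing is shared here: inputs and results are copied between processes, which is the same trade-off you accept with MPI.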