Mpirun and julia

question

#1

Hi All,
I am trying to start Julia v0.5.1-pre on a compute node with two 12-core Haswell CPUs and 128 GB of shared memory. The OS is Scientific Linux release 6.8. Julia was compiled with gnu/5.3.0.
If I start Julia with more than 15 MPI processes, I get the following errors:

$mpirun -n 24 julia  any_script.jl
error during init:
#<null>
.....
signal (15): Terminated

After some research I found out that the error occurs in the initialization phase of julia (_/src/init.c:void _julia_init(JL_IMAGE_SEARCH rel)_ ), namely when loading jl_options.image_file:
_724: jl_restore_system_image(jl_options.image_file)_;

It seems that one of the subsequent malloc/realloc calls returns a null pointer in some of the processes. How much memory does Julia need? Is it possible to control the “buffer/stack” sizes, or is the problem elsewhere?
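
One way to narrow this down might be to check which per-process limits each rank actually sees, since a restrictive virtual memory limit can make malloc/realloc return a null pointer even when plenty of physical memory is free. The following is only a diagnostic sketch, assuming a bash-compatible shell on Linux; the function name report_limits and the script name check_limits.sh are hypothetical, and it does not confirm that a limit is the cause here.

```shell
#!/bin/bash
# Diagnostic sketch (hypothetical helper, not part of Julia or MPI):
# print the resource limits and memory figures visible to this process.
report_limits() {
    echo "host: $(uname -n)"
    # "unlimited" means no cap; a small number here can make malloc fail.
    echo "virtual memory limit (kB): $(ulimit -v)"
    echo "stack size limit (kB): $(ulimit -s)"
    # /proc/meminfo is Linux-specific, so skip it quietly elsewhere.
    if [ -r /proc/meminfo ]; then
        grep -E '^(MemTotal|MemFree|SwapTotal):' /proc/meminfo
    fi
}

report_limits
```

Saved as check_limits.sh, it could be launched the same way as the failing job, for example with mpirun -n 24 bash check_limits.sh, so that the limits are reported once per rank in the same environment.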

Thank you
Dmitry


#2

In general, you can use many MPI ranks. For example, I can run the following on a MacBook Air with 8 GB of RAM:

michael@yosemite:~/.julia/v0.5/MPI/examples$ mpirun -np 21 julia 07-pi-montecarlo.jl 
reps: 1000000, pihat: [3.14288]
reps: 2000000, pihat: [3.14107]
reps: 3000000, pihat: [3.14198]
reps: 4000000, pihat: [3.14209]
reps: 5000000, pihat: [3.14267]
reps: 6000000, pihat: [3.14245]
reps: 7000000, pihat: [3.14215]
reps: 8000000, pihat: [3.14174]
reps: 9000000, pihat: [3.14195]
reps: 10000000, pihat: [3.14202]
michael@yosemite:~/.julia/v0.5/MPI/examples$

I’m using the same version of julia as you. This is on Debian testing.


#3

Thanks for the response. How big is your swap? There is no swap on the compute node I am using.


#4

8GB, but I just ran it again with swap disabled.