Why use Julia to build an Earth System Model?

Hi all,

Most Earth System Models (ESMs) are written in Fortran, but some may be rewritten in another language because Fortran has issues running on GPU-based supercomputers.

Some consider using C instead, but there are many arguments for using Julia, as the CliMA project (clima.caltech.edu) is doing. https://www.nature.com/articles/d41586-019-02310-3

We started a discussion on Slack #climate, but I am posting it here so the thread is not lost and more arguments can be added.

I personally need roughly 10 lines of argument for a whitepaper on why I believe we should use Julia (and what an ESM might look like in 10 years). Here are my arguments so far:

First, obviously, Julia is as fast as C and as dynamic (easy to read, concise) as Python or R.
Second, Julia has large and actively developed AI/ML libraries.
Third, Julia works great with GPUs.
Fourth, Julia is more attractive to young scientists.
Fifth, Julia can be used directly and simply for visualization.
Sixth, empiricists and modelers currently use different languages (R/Python vs. Fortran), which is a barrier to communication and collaboration. Julia could remove this barrier.
Seventh, future ESMs will be modular, and a Julia ESM would make it easy to add a module created by an empiricist.


Thanks for starting this thread! Definitely agree that it would be good for this discussion to not get swallowed up by Slack.

Just some personal thoughts for why Julia > C/C++/Fortran/Python from using Julia for the past ~2 years for HPC climate work (mostly Oceananigans.jl development which is separate from ClimateMachine.jl but still under the CliMA project):

  1. Pure Julia loop-based code can be fast and easy to read, and it runs readily on CPUs (multithreaded) and GPUs with KernelAbstractions.jl and CUDA.jl. A possible counter-example could be Python + Numba, but I think to get maximum efficiency you still need to write some low-level code in C/Cython?
  2. Pure Julia code means you can interface with a rapidly growing scientific machine learning software stack: Flux.jl for machine learning, DifferentialEquations.jl for time stepping and sensitivity analysis, Turing.jl for Bayesian inference and uncertainty quantification, etc. They work well together, so with Flux.jl + DifferentialEquations.jl, neural differential equations can be trained and embedded in your Julia Earth system model.
  3. Julia treats functions as first-class values, so you can pass them around and use them to specify boundary conditions, forcing functions, etc., which makes scripts look more like math and easier to read. Similar setups in Fortran + namelists can be quite painful: e.g. generate arrays in MATLAB/Python, dump binary data, then tell the Fortran model to read the binary file via a namelist.
  4. A script-based approach to simulation setup allows for very flexible setups that would be very complicated to set up with Fortran models + namelist files. This is more a criticism of rigid namelist files though.
  5. It seems to be much easier to engage and excite new users and contributors with Julia as opposed to Fortran and C/C++. New young students and postdocs seem more likely to learn and use high-level languages like Julia/Python rather than C/C++/Fortran.
  6. Some C/C++/Fortran models are a complete pain to set up and compile, which can be a high barrier for new users. Julia + CUDA + MPI setup hasn’t failed on me yet (it’s really plug and play) and it’s getting better (e.g. CUDA.jl now downloads and installs the CUDA library for you).
  7. Development in Julia (or really any high-level language) feels faster since the code is concise. Once I was familiar with Julia and we had a barebones model up and running, adding new features was quick.
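
To make points 1 and 3 a bit more concrete, here is a minimal sketch in plain Julia (no packages) of a toy 1D diffusion model whose forcing is just a function passed in by the user. Everything in it (`step!`, `run_model`, `gaussian_heating`, the grid size, the time step) is a made-up illustration, not code or API from Oceananigans.jl or ClimateMachine.jl, and it only runs on the CPU.

```julia
# Toy example: explicit 1D diffusion, dT/dt = κ d²T/dx² + forcing(x, t).
# All names and parameter values are hypothetical, chosen for illustration only.

# Advance T by one forward-Euler step; the two end points are held fixed.
function step!(Tnew, T, x, t, Δt, Δx, κ, forcing)
    N = length(T)
    Tnew[1] = T[1]
    Tnew[N] = T[N]
    @inbounds for i in 2:N-1
        diffusion = κ * (T[i-1] - 2 * T[i] + T[i+1]) / Δx^2
        Tnew[i] = T[i] + Δt * (diffusion + forcing(x[i], t))
    end
    return Tnew
end

# Run the toy model from T = 0 with a user-supplied forcing function.
function run_model(forcing; N=51, κ=1.0, nsteps=200)
    x = range(0.0, 1.0; length=N)
    Δx = step(x)
    Δt = 0.2 * Δx^2 / κ      # below the explicit stability limit Δx² / (2κ)
    T = zeros(N)
    Tnew = similar(T)
    for n in 1:nsteps
        step!(Tnew, T, x, (n - 1) * Δt, Δt, Δx, κ, forcing)
        T, Tnew = Tnew, T    # swap buffers instead of allocating
    end
    return x, T
end

# The forcing is an ordinary function, written the way the math reads.
gaussian_heating(x, t) = exp(-((x - 0.5) / 0.1)^2)

x, T_forced = run_model(gaussian_heating)
x, T_unforced = run_model((x, t) -> 0.0)   # an anonymous function works too
```

The setup script and the inner loop are the same language here, so swapping the forcing means editing one line rather than regenerating a binary file and a namelist. A real model would of course need much more than this (and a GPU version would go through something like KernelAbstractions.jl, which this sketch does not attempt).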

Julia scripts and functions are so much easier to write than in C or Fortran. This allows for customizable processing of model output to be easily written by the user.

Furthermore, plotting is not really practical in Fortran, and customizable plotting is essential for visualizing and processing ESM output. You can do this easily in Julia. If you prefer to use Python, you can call Python from Julia with PyCall, and Python/Julia syntaxes are similar enough that this is not much of a problem.

You can use Pluto.jl notebooks, which are much more lightweight than Jupyter notebooks, especially if you require a lot of visualization/plotting: Jupyter notebooks save the plots into the notebook file, while with Pluto you simply replot your output when the notebook runs. The interactivity and automatic updates are also great.

An example can be found here (HurricaneVorticity/03-ellipse.jl at main · natgeo-wong/HurricaneVorticity · GitHub), which I created to test out Pluto notebooks (you do need to download the project as a whole though).