Entry point for contributing to climate modeling projects?

I’d like to spend some of my free time contributing to projects that help address climate change. I have (basically) no domain knowledge, but I do have Julia and modeling/simulation experience (mostly astrodynamics and estimation).

I found this list of “Open source projects sustaining stable climate, energy supply and vital natural resources”, which has a number of Julia projects on it. I’ve spent some time looking into each one, but I have yet to find a good entry point for contributing. I’m hoping that someone here who is more familiar with the ecosystem can point me in the right direction, and perhaps point out areas that could use attention from a motivated programmer. Any advice?


Maybe check out CliMA:

They also had a post a while back about a software position:


Although it is about climate data analysis rather than climate modelling, ClimateBase.jl is a repository that is great for newcomer contributions. It has recently been overhauled and there is a lot of low-hanging fruit. The source code is small, self-contained, and (in my judgement) easy to understand.


I didn’t mean for my title to be exclusionary; climate analysis projects are just as interesting to me :). Thanks for the tip, I will definitely check this out.

Cool that you want to contribute for this reason! This article from 2015 has been linked before, but it also represents some of my motivation to contribute to JuliaGeo:

Even besides climate modeling, I feel like any kind of tool that will make it easier to understand the Earth is going to be very much in need in the coming years.


That’s a great article, and I found it very inspiring when I first read it a couple of years ago. In fact, I started my (current) search by re-reading it.

We at CliMA would certainly welcome contributions, though I do concede that it is probably not that easy to dive right in and start contributing to the main ClimateMachine.jl repository (though if you do have ideas, please let me know).

That said, there are lots of areas where contributions would be very welcome, and could benefit the wider Julia ecosystem:

  • File readers for common data formats (issue #114). There are a wide variety of file formats used for climate data (e.g. NetCDF, Zarr, HDF5): although most have a Julia package, their quality and maintenance vary considerably. Some of these rely on cumbersome third-party binary dependencies, and could gain from being rewritten in Julia to take advantage of features such as memory mapping and parallel I/O.
  • Visualization: we make heavy use of ParaView and VisIt for visualizing the model output. Being able to directly interface Julia with these libraries (e.g. for in situ visualization) would be incredibly cool. Alternatively, building something similar in Makie.jl would be very neat (though perhaps a lot more work).
  • Specific numerical topics: we’re currently working on moving our distributed-memory time steppers and Krylov & Newton-Krylov solvers out into self-contained repositories.
  • GPU and distributed-memory tooling: we make heavy use of CUDA.jl, KernelAbstractions.jl and MPI.jl, so improvements to those directly help our project. Similarly, we are excited about AMDGPU.jl so that we can run on alternative GPU architectures.
  • There are many rough edges in running Julia on HPC clusters (deployment, binary dependencies, etc.). If you are interested in this topic, please join our discussion.

This is fantastic information, thank you. Indeed, I spent some time clicking around the CliMA organization, but I couldn’t quite figure out a good way to get started.

If anyone is interested but would like some pointers on where to start, please feel free to DM me.

Welcome @crbinz!

I think that there’s a lot of good information here in this thread already.

Overall, I think that the whole climate science domain is a natural candidate for Julia, with respect to two main axes:

  • Modelling is costly. In that respect, ClimateMachine.jl is an awesome “showcase card” for the language.
  • Climate analysis has dealt with Big Data since well before the term was coined.

The big advantage of Python right now is easy data access and scalability with xarray and dask. Each time I advocate Julia to my colleagues, they always ask: can I easily process 100 TB of data with a simple script and scale that to my cluster? I’d say that we are close, but not there yet, at least for people who are not developer-inclined (e.g. people writing 2000 lines of script code in Matlab/Python for their current paper).

There are solutions, but the documentation is sparse. For example, for an xarray-like approach, I successfully tested ESDL.jl and YAXArrays.jl (not a big fan of the name though!). Using ESDL.jl, I was able to scale computations on a Slurm cluster.
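
For anyone wondering what "scale a simple script" means in practice: the core idea behind dask-style processing is to split the data into chunks, reduce each chunk independently, and then combine the small partial results. Below is a minimal sketch of that pattern using only the Python standard library. It is not dask's or xarray's API, and every function name in it is made up for illustration.

```python
from itertools import islice
from math import fsum
from typing import Iterable, Iterator, List, Tuple


def iter_chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items, one chunk at a time."""
    it = iter(values)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def partial_stats(chunk: List[float]) -> Tuple[int, float]:
    """Reduce one chunk to a small (count, sum) pair.

    Each chunk is independent, so this step could run on any worker.
    """
    return len(chunk), fsum(chunk)


def combine(partials: Iterable[Tuple[int, float]]) -> float:
    """Merge the per-chunk (count, sum) pairs into a global mean."""
    n = 0
    total = 0.0
    for count, subtotal in partials:
        n += count
        total += subtotal
    if n == 0:
        raise ValueError("cannot take the mean of empty input")
    return total / n


def chunked_mean(values: Iterable[float], size: int) -> float:
    """Mean of `values`, holding only one chunk in memory at a time."""
    return combine(partial_stats(c) for c in iter_chunks(values, size))


if __name__ == "__main__":
    # Hypothetical stand-in for a dataset too large to load at once.
    print(chunked_mean(range(1, 101), size=10))
```

Here the chunks are processed one after another in a single process. A real framework would hand `partial_stats` to many workers and take care of scheduling and I/O, which is the part that is hard to make "simple script" friendly.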

Anyway, those were some random thoughts.