[ANN] ProcessBasedModelling.jl and ConceptualClimateModels.jl

Hello everyone,

I would like to announce here two packages I’ve just released:

  • ProcessBasedModelling.jl
  • ConceptualClimateModels.jl

The first is an extension to ModelingToolkit.jl. Its goal is to provide an alternative modelling framework with a goal similar to MTK’s native “component based modelling”. However, its conceptual reasoning is slightly different and, in my experience, closer to the kind of work I have done so far as a physicist. Instead of components there are processes, and each process is an equation that governs a specific variable that is part of the total set of equations representing the physical system.

An important consequence of this modelling approach is that we can know exactly which variables introduce which other variables. This means that error messages from MTK like

ERROR: ExtraVariablesSystemException: The system is unbalanced.
There are 3 highest order derivative variables and 2 equations.
More variables than equations, here are the potential extra variable(s):
 z(t)
 x(t)
 y(t)

become instead

ERROR: ArgumentError: Variable x(t) was introduced in process of variable z(t).
However, a process for x(t) was not provided,
there is no default process for x(t), and x(t) doesn't have a default value.
Please provide a process for variable x(t).

The Example of ProcessBasedModelling.jl compares the two modelling approaches and highlights the ways ProcessBasedModelling.jl can be beneficial.
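To illustrate why tracking "which variable introduces which" allows such targeted error messages, here is a minimal sketch in plain Julia. It does not use the actual ProcessBasedModelling.jl or MTK API; `ToyProcess` and `missing_processes` are hypothetical names made up for this illustration.

```julia
# Toy stand-in for a process: the variable it defines (lhs) and the
# variables used by its right-hand side. Hypothetical, not the package API.
struct ToyProcess
    lhs::Symbol
    uses::Vector{Symbol}
end

# Return (missing variable, introducing variable) pairs for every variable
# that some process uses but that has neither a process nor a default value.
function missing_processes(processes::Vector{ToyProcess}, defaults::Dict{Symbol,Float64})
    defined = Set(p.lhs for p in processes)
    missing_pairs = Tuple{Symbol,Symbol}[]
    for p in processes, v in p.uses
        if v ∉ defined && !haskey(defaults, v)
            push!(missing_pairs, (v, p.lhs))
        end
    end
    return missing_pairs
end

processes = [
    ToyProcess(:z, [:x, :y]),  # the equation for z introduces x and y
    ToyProcess(:y, [:z]),      # y has a process; x does not
]
defaults = Dict{Symbol,Float64}()

for (var, introducer) in missing_processes(processes, defaults)
    println("Variable $var was introduced in process of variable $introducer, ",
            "but no process or default value for $var was provided.")
end
```

Because every process declares the variable it defines, the check can name both the missing variable and the process that introduced it, instead of only listing candidate extra variables.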

ProcessBasedModelling.jl was split out from ConceptualClimateModels.jl. This is a Julia codebase that I have been using for my personal research as well as for education. It is a library that builds upon ProcessBasedModelling.jl to provide a field-specific framework that allows easily testing different physical hypotheses regarding how climate variables couple to each other, or how climate processes are represented by equations. See its examples to get a rough idea.

ConceptualClimateModels.jl is super stable but has little content at the moment. I will be increasing its content over time but would very much welcome other users working on climate to contribute processes from the literature. The biggest feature coming to ConceptualClimateModels.jl soon is latitudinal models, leveraging MTK’s support for vector-valued symbolic variables.

The READMEs of both packages give a good description of their full content.

Hello,

This looks really interesting! My group is working on a similar (maybe complementary?) effort thinking about how to couple MTK model components that are not hierarchical. The documentation isn’t great yet but there’s an example here: All together · EarthSciMLBase.jl

Like you, we are also working on libraries of domain-specific equation systems built on top of this framework, for example here and here.

Let me know if you’re interested in discussing ways we could coordinate efforts.

Chris

Hi, thanks for the interest. I am always happy to have fewer packages instead of more, so if we can merge the efforts that would be fantastic.

ConceptualClimateModels.jl does not have any PDE support yet, but it will support latitudinal models once the MTK tutorials on vector-valued symbolic variables come out. Note that you could nevertheless manually make any PDE and use any of the existing processes for non-spatial variables.

You can simply add more processes to the list of existing processes here: Predefined processes · ConceptualClimateModels.jl, following the instructions from here: Tutorial · ConceptualClimateModels.jl

I am happy with the design, but if you think there is anything missing that prevents you from achieving something (beyond PDEs, which are already a planned feature), simply open an issue to let me know.

Hm…I admit I do not understand what this example is trying to achieve. It probably doesn’t help that the animation at the end is (probably?) wrong, as it doesn’t show anything besides constant-colored heatmaps.

The code is significantly more complex than just using MTK directly. Even Base.:+ is expected to be extended directly by the user in the examples, which I don’t think makes sense to ask of a front-end user. Yet, despite the apparent increase in complexity versus just using MTK by itself, the package doesn’t make it obvious what fundamentally new things it provides, since MTK already has rigorous support for defining IVPs and boundary conditions, which appear to be the main discussion point in the docs. Is the ConstantWind the new thing?

This also appears unnecessary:

struct ExampleSys <: EarthSciMLODESystem
    sys::ODESystem
    function ExampleSys(t; name)
        @variables y(t)
        @parameters p=2.0
        D = Differential(t)
        new(ODESystem([D(y) ~ p], t; name))
    end
end
@named sys = ExampleSys(t)

Why have the user make a type ExampleSys <: EarthSciMLODESystem when clearly what you are defining above is just a function that returns an ODESystem? You could drop the type altogether or provide an actual API (i.e., a set of functions) for the users to use that already implements + for them.

If you want to coordinate efforts in ConceptualClimateModels.jl, I’d be happy to help with contributions. But the way the API of the EarthSciML package is right now, I don’t see any obvious way to couple the two packages, due to the steep complexity increase in using the latter codebase.

In ProcessBasedModelling.jl I’ll add an AdditionProcess(process1, other_processes...) that makes a new process that has as RHS the addition of all other processes, assuming the LHS of all processes is the same. I think that would help your situation where at the moment you have to manually extend :+?

Thanks for your response. I’m also trying to figure out what ProcessBasedModelling.jl does, to understand whether its design goals are similar to those of EarthSciMLBase.jl or not. If the design goals are the same, then it’s a valid question whether one meets the design goals better than the other one. If the design goals are different, then there’s not much point in comparing.

My understanding is that they both are designed for systems that are not hierarchical as is the case in ModelingToolkitStandardLibrary.jl, but instead are process-based, where there are typically multiple processes acting on the same state variable.

Although I’m not sure that I understand this correctly, because the documentation for ProcessBasedModelling.jl talks about the possibility of multiple processes representing the same physical concept, whereas I typically think about it as multiple processes acting on a given quantity; for example, cloud moisture is affected by solar radiation, humidity, temperature, advection, etc.

Anyway, if I do understand that properly, then my understanding is that ProcessBasedModelling.jl aims to provide an improved error message experience for this type of system. And the AdditionProcess would start to get at coupling multiple processes that act on the same variable.

EarthSciMLBase, on the other hand, is meant as a system to allow components that were originally developed as standalone models by people in different fields to be coupled together into a larger-scale model, in a way that is friendly to end users that don’t know how to code beyond something like z = X() + Y(). It also aims to be able to handle situations where there are a bunch of components that all define Temperature or something like that as a parameter, and then one component that defines a Temperature variable that changes in space and time. So it trades off additional developer complexity for more generality and an easier experience for the end user.

Does that sound like a reasonable summary?

(The reason for the apparently unnecessary type is to allow multiple dispatch on the + operator for coupling components. For example, if you have a component that defines removal of CO2 and one that defines CO2 generation, and those two processes are different types, then you can use multiple dispatch to define a function that makes both of those processes operate on the same state variable, even if it’s named CO₂ in one process and CO_2 in the other one.)

ProcessBasedModelling.jl is a backend; you should instead focus on ConceptualClimateModels.jl. It is indeed a framework for designing climate models following a process-based approach, versus the typical hierarchical one of MTK. So I think you have understood this part correctly.

I see, I’ve just added some sentences in the Tutorial of ConceptualClimateModels.jl to clarify this:

ConceptualClimateModels.jl follows a process-based modelling approach
to make differential equation systems from processes.
A process is simply a particular equation defining the dynamics of a climate
variable, while also explicitly defining which variable the equation defines.
A vector of processes is composed by the user, and given to the main function processes_to_coupledodes which bundles them into a system of equations
that creates a dynamical system.

Note the distinction: a process is not the climate variable
(such as “clouds” or “insolation”); rather it is the exact equation that
defines the behavior of the climate variable, and could itself utilize
many other already existing climate variables.
In terminology of climate science a process is a generalization
of the term “parameterization”.

Hence, there is a fundamental distinction between a process (which is a decorated wrapper of an equation) and a climate variable. Many different processes (equations) may describe the behavior of a particular variable, and you want to analyze what the model does when using one process versus another. This is very common in climate science and especially so in conceptual models.

This AdditionProcess (which I had already implemented but forgot about) is simply a convenience function. A Process is an Equation with decorators that tell you which variable the equation defines. So it is nice to have a convenience to add several RHS components together.
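To make the idea concrete, here is a rough sketch in plain Julia of what "adding the RHS of several processes that share the same LHS" means. It is not the implementation in ProcessBasedModelling.jl: `RhsProcess` and `add_processes` are hypothetical names, and the right-hand sides are ordinary functions instead of symbolic expressions.

```julia
# Hypothetical toy process: `lhs` names the variable being defined and `rhs`
# computes the right-hand side from a named tuple of current values.
struct RhsProcess
    lhs::Symbol
    rhs::Function
end

# Combine processes that all define the same variable by summing their
# right-hand sides. Throws an ArgumentError if the left-hand sides differ.
function add_processes(p1::RhsProcess, others::RhsProcess...)
    all(p -> p.lhs == p1.lhs, others) ||
        throw(ArgumentError("all processes must define $(p1.lhs)"))
    return RhsProcess(p1.lhs, state -> sum(p.rhs(state) for p in (p1, others...)))
end

emission = RhsProcess(:CO2, s -> 2.0)            # constant source term
removal  = RhsProcess(:CO2, s -> -0.01 * s.CO2)  # sink proportional to CO2
total    = add_processes(emission, removal)

println(total.lhs, " tendency: ", total.rhs((CO2 = 400.0,)))
```

The result is again a single process for one variable, so the rule "one process per variable" still holds and the rest of the model does not need to know that the RHS was assembled from several terms.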

Right, but what is a “model”? And what does “couple” mean? These words mean different things in different contexts, so one needs to make them precise in terms of MTK expressions and symbolic variables.

This gets a bit off-topic, but I want to voice a concern here at a fundamental level. First, how reasonable is it to expect from someone working on computational modelling that their programming experience is limited to “only being able to add variables”? Just installing Julia and packages is already more complex than this. Second, the “+” operation has a very specific meaning, while when I think of “coupling models”, simply adding two equations is not really cutting it; coupling is a complex process in itself that would likely require writing down several equations. That would be impossible to represent fully via only “+”.

I did not understand this, can you please explain this in different words?

What I imagine is the scenario where in one process Temperature is an input “variable”, such as a parameterization for ice albedo, where α ~ tanh(T). Temperature itself however is described by a process (equation) that represents a PDE with diffusion so that temperature changes in space and time. Is this what you mean? This is already possible in CCM but I would not use “parameter” to describe temperature in the first equation in this case.

My concern is that I have not seen this, or understood this yet. Can you point me to where in the example the experience for the end user becomes simpler? Perhaps provide a code snippet comparison between using and not using this library? My first impression after reading the example is that my code would be simpler if I just used MTK directly.

ConceptualClimateModels.jl takes a different approach to resolve this problem, described in the Tutorial. The library provides some predefined processes. Let’s focus on the CO2 forcing:

co2process = CO2Forcing()

This represents a forcing F ~ ECS * log2(CO2/400) in W/m^2. Here CO2 is a symbolic variable. By default, it uses a predefined symbolic variable with name :CO2 to represent CO2 concentrations. However, the user may instead provide any other variable they have defined to represent CO2 as a keyword:

co2process2 = CO2Forcing(CO2 = myCO2_variable)
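As a purely numeric illustration of the expression F ~ ECS * log2(CO2/400) quoted above, here is a small plain-Julia sketch. It is not the symbolic process from the package: `co2_forcing` is a hypothetical helper, and the default `ecs = 3.0` is an assumed placeholder value, not the one used by ConceptualClimateModels.jl.

```julia
# Numeric version of F = ECS * log2(CO2 / 400).
# `ecs = 3.0` is an illustrative placeholder, not the package's default.
co2_forcing(co2; ecs = 3.0) = ecs * log2(co2 / 400)

# No forcing at the reference concentration of 400.
println(co2_forcing(400.0))

# One doubling relative to the reference gives a forcing equal to `ecs`.
println(co2_forcing(800.0))

# Any quantity can be passed in, mirroring the `CO2 = myCO2_variable` keyword.
my_co2 = 560.0
println(co2_forcing(my_co2; ecs = 2.5))
```

The last call mirrors the design choice in the thread: the formula stays the same and only the quantity passed in for CO2 changes.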

Notice that at a fundamental level, in MTK, you can never really use different Symbols to represent the same symbolic variable. If you have :CO2 and :myCO2, then MTK does not support a way to have both of these symbols truly refer to the same symbolic variable. You’d need to hack your way around it or add the extra equation CO2 ~ myCO2, which I find weird from a modelling point of view; I would never write down such an equation in a paper, for example.

That is why I would argue it is better to enforce that all symbols that refer to the same quantity must be identical, and provide the user the necessary convenience to easily handle different quantities.

I hope this helps to clarify more the design decisions behind ConceptualClimateModels.jl.