Model configuration/parameterization file

grahamas · February 10, 2018, 10:10pm

Is there a best practice for storing and reading model parameters? (In plain text.)

In Python, I would just use JSON because there’s a fairly straightforward mapping of the JSON syntax to the objects I wanted in Python, and the parsing in and out was essentially one call to the JSON library. In Julia, I’m finding there’s no real way to represent Julia data with JSON, so that I’m ending up with somewhat hack-y code to get the particular values parsed to the right types.

The solution I’m leaning towards now is putting my config/parameters in a separate module. This has the advantages that:

It’s plain text
It’s separate from the model code
It can use any Julia type

Am I missing something? On the one hand, this feels wrong/hack-y. On the other hand, this seems like it’s the perfect solution.

kristoffer.carlsson · February 10, 2018, 10:16pm

You typically put model parameters in a text file instead of the source file because you do not want to relink your executable just to change parameters. In Julia that is not a problem. Instead, you can just create some parameter-struct that takes Julia types, instantiate that in a Julia source file, and pass it along to your solver/simulation/whatever. As you say, this has the advantage that you can just use Julia types all the way, no need for any deserialization.

If you really want to use a text file, JSON2.jl has some support for automatic serialization/deserialization of Julia types to JSON, and you could also look at something like TOML.jl.

grahamas · February 10, 2018, 10:36pm

Ah I just hit the part I was missing/don’t understand: How do I write out the parameters?

My motivations for using a separate file (rather than just initializing the parameter-struct in the same source as the model) are:

To point someone who doesn’t know Julia to where the parameters are defined in that separate file, and tell them not to muck about with the source (ideally eventually having a pre-compiled model, but that’s a separate conversation)
So that when I’m in the REPL, I can just load and compile the model once, and then re-include the parameter file/module as needed.
Most importantly: Whenever I run the model, I create a folder for that run, and save the results/plots in that folder along with a parameter file that can be used to recreate that same run.

How would I create that parameter file with this set-up of the parameter file being a Julia source file? (yay macros?)

kristoffer.carlsson · February 10, 2018, 11:01pm

If your model is in a package, then it will not be reloaded upon successive includes. You could for example have something like:

import MyModel

params = MyModel.params(
    weight = 10.0,
    time = 5.0,
    timestep = 0.1,
)

MyModel.run(params)

where MyModel is your package. Editing that should be fairly straightforward and you can just reinclude it to rerun the model with new parameters.
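For readers wondering what the package side of this could look like, here is a minimal sketch. The contents of MyModel are an assumption: the `Params` struct, its defaults, and the body of `run` are hypothetical placeholders and not anything from a real package. It uses `Base.@kwdef` to get a keyword constructor.

```julia
module MyModel  # hypothetical package contents, for illustration only

# Keyword-constructible parameter struct with default values.
Base.@kwdef struct Params
    weight::Float64 = 10.0
    time::Float64 = 5.0
    timestep::Float64 = 0.1
end

# Thin wrapper so that callers can write MyModel.params(weight = ...).
params(; kwargs...) = Params(; kwargs...)

# Placeholder "solver": only reports how many steps it would take.
function run(p::Params)
    nsteps = round(Int, p.time / p.timestep)
    return nsteps
end

end # module

params = MyModel.params(timestep = 0.05)
nsteps = MyModel.run(params)
```

In a real package the module would live in its own file, so re-including the short script with the parameters would not touch the model code.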

grahamas · February 12, 2018, 4:23pm

That’s a better workflow than what I was describing, thank you!

The thing I still don’t understand is how to write out the parameterization in a way that is both human and Julia readable, after the model has been run.

Meaning, I can definitely see how to make a file like you describe manually, use that to run my code and modify parameters, and also provide it to non-Julia users for the same.

But every time my model is run, I’d like it to write out a new file (human and Julia readable) containing only the particular instantiation of the param-struct used on that run. Is there a good way to do that?

swt30 · February 13, 2018, 2:02pm

I really like Parameters.jl. For each type of problem I’ll define a parameter type that holds the relevant parameters and then I’ll define a function that runs the model when given that parameter type. For example:

using Parameters
@with_kw struct RockyPlanet
    mass = 5.97e24
    core_material = "iron"
    core_fraction = 1/3
    mantle_material = "silicate"
end
function model(r::RockyPlanet)
    @unpack_RockyPlanet r # extracts `mass` etc into local scope
    # run the model code here
end

Anyway, that package defines a show method for each parameter struct that prints the parameter values nicely. For example, doing show(RockyPlanet()) above would print:

RockyPlanet
  mass: Float64 5.97e24
  core_material: String "iron"
  core_fraction: Float64 0.3333333333333333
  mantle_material: String "silicate"

Writing this to a file would achieve the human-readable part of what you need, so we’re halfway there. Perhaps an excellent addition to the Parameters package might be a constructor method that reads the output from such a file and constructs the relevant parameter object? Then your output file would be human-readable-and-modifiable and julia-readable. I’m imagining that such a function call could look like Parameters.fromfile("pars.txt", RockyPlanet) to tell julia to read the data from the file pars.txt into a RockyPlanet struct. You should open an issue or a pull request at Parameters.jl!
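As a rough sketch of that round trip, one option is to write the struct out as a keyword-constructor call, so the file is itself Julia source. Everything here is an assumption for illustration: `write_params` and `read_params` are hypothetical helpers, not part of Parameters.jl, and the struct is defined with `Base.@kwdef` rather than `@with_kw` to keep the example free of dependencies. It only handles fields whose `repr` parses back to the same value (numbers, strings, and similar literals).

```julia
# Parameter struct with a keyword constructor (stand-in for @with_kw).
Base.@kwdef struct RockyPlanet
    mass::Float64 = 5.97e24
    core_material::String = "iron"
    core_fraction::Float64 = 1/3
    mantle_material::String = "silicate"
end

# Hypothetical helper: write `p` as a keyword-constructor call, one field per line.
function write_params(path::AbstractString, p::T) where {T}
    open(path, "w") do io
        println(io, nameof(T), "(")
        for name in fieldnames(T)
            # repr gives text that parses back to the same value for simple literals.
            println(io, "    ", name, " = ", repr(getfield(p, name)), ",")
        end
        println(io, ")")
    end
    return path
end

# Hypothetical helper: evaluate the file as Julia code and return its last value.
# Only use this on trusted files, because include runs arbitrary code.
read_params(path::AbstractString) = include(path)

path = joinpath(mktempdir(), "pars.jl")
planet = RockyPlanet(core_material = "uranium", core_fraction = 0.4)
write_params(path, planet)
restored = read_params(path)
```

The resulting file can be edited by hand like any other parameter file, at the cost of being executable code rather than plain data.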

grahamas · February 13, 2018, 2:35pm

Ah, perfect! I’ll do both!

mauro3 · February 13, 2018, 3:57pm

I like using a cascade of updates to model-parameters:

With the model, which lives in its own package, I define the types and their default values (usually using Parameters.jl).

Then my simulations often come in groups. Say, using the example above, I run a bunch of simulations for a planet with a uranium core and a certain mass, but with varying core fractions and different mantle materials. I’ll set the defaults for that group of simulations in a def-para.jl file in the folder for that group, simulations/uranium-planet, like so:

using MyModel
planet_def = RockyPlanet(
  mass = 9e24,
  core_material = "uranium"
)

Then for each simulation I would modify planet_def, say:

include("def-para.jl")
core_fractions = 0:0.1:1
for cf in core_fractions
  pl = RockyPlanet(planet_def,
    core_fraction = cf)
  model(pl)
end

I think storing and loading then needs to be dealt with separately (and is not trivial). I usually use JLD2.jl. However, what I would like to have is some way of writing a summary of each simulation to an info text file, which I could then open when I’m looking for a particular run.

grahamas · February 13, 2018, 4:30pm

Main point: Does anyone know of an inverse of dump? Where would I open an issue requesting an inverse of dump?

Parameters.show just calls dump, so this is the same problem and could be addressed more generally.

I’m getting the feeling that for now I’ll just have to write two parameter files for each run: One plaintext Parameters.show so I can manually inspect the results, and one JLD so I can easily re-run the simulation.
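A minimal sketch of that two-file approach, under some loud assumptions: it uses Base's `dump` for the plain-text file and the `Serialization` standard library as a stand-in for JLD, purely so the example has no package dependencies. The helper names and file names are hypothetical. Note that the `Serialization` format is not guaranteed to be stable across Julia versions, so it is a weaker archival choice than JLD2.

```julia
using Dates
using Serialization

Base.@kwdef struct RockyPlanet
    mass::Float64 = 5.97e24
    core_material::String = "iron"
    core_fraction::Float64 = 1/3
    mantle_material::String = "silicate"
end

# Hypothetical helper: record the parameters of one run inside `dir`.
function save_run_params(dir::AbstractString, p)
    mkpath(dir)
    # Human-readable summary, meant for inspection only.
    open(joinpath(dir, "params.txt"), "w") do io
        println(io, "# written ", Dates.now())
        dump(io, p)
    end
    # Machine-readable copy, meant for re-running the simulation.
    serialize(joinpath(dir, "params.jls"), p)
    return dir
end

# Hypothetical helper: load the machine-readable copy back.
load_run_params(dir::AbstractString) = deserialize(joinpath(dir, "params.jls"))

run_dir = save_run_params(joinpath(mktempdir(), "run-001"),
                          RockyPlanet(core_fraction = 0.4))
reloaded = load_run_params(run_dir)
summary_text = read(joinpath(run_dir, "params.txt"), String)
```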

Side comment: Ooh that’s another improvement to my workflow! I didn’t even realize I wanted that to work, but it’s perfect. ~~Why does RockyPlanet(planet_def, core_fraction=cf) work? Is that part of Parameters.jl, or is it a language feature of structs?~~ [sigh, I should really read the docs more carefully when I use a package… I still don’t know why it works, but it’s documented by Parameters.jl]

swt30 · February 13, 2018, 4:48pm

Yeah, that’s the “copy constructor” introduced by the Parameters macros. So good, right? It’s really nice for sanity-check tests too: I write out a series of things I want to test (uranium core planets are denser than iron core planets, planets with gaseous layers are bigger than bare planets, hotter planets are bigger than cooler planets) and then check each of those by comparing the run on a base set of parameters to a second run with one change.
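To give a feel for how a copy-with-changes constructor can work, here is a Base-only sketch of the idea. This is not how Parameters.jl implements it; `with_changes` is a hypothetical helper written for illustration, and the struct again uses `Base.@kwdef` instead of `@with_kw`.

```julia
Base.@kwdef struct RockyPlanet
    mass::Float64 = 5.97e24
    core_material::String = "iron"
    core_fraction::Float64 = 1/3
    mantle_material::String = "silicate"
end

# Hypothetical helper: copy `p`, replacing only the fields named in `changes`.
# Keywords that are not field names are silently ignored in this sketch.
function with_changes(p::T; changes...) where {T}
    vals = (get(changes, name, getfield(p, name)) for name in fieldnames(T))
    return T(vals...)  # positional constructor, fields in declaration order
end

base = RockyPlanet()
uranium = with_changes(base, core_material = "uranium")
```

A sanity-check test would then run the model on `base` and on `uranium` and compare the two results, with exactly one parameter differing between them.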