[ANN] ParameterSweeps.jl – conveniently represent sweeps over different parameters/values

Hi! I just published my first Julia package. It’s currently waiting for review to be registered (New package: ParameterSweeps v1.0.0 by JuliaRegistrator · Pull Request #150576 · JuliaRegistries/General · GitHub).

The source can be found here:

It’s a very small package providing some convenience tools for constructing “parameter sweeps” (as in: running something for various different combinations of parameters, “sweeping” over whole ranges of them). The functionality could be seen as a convenience layer around the built-in iteration tools (e.g. Iterators.product and Iterators.zip).

I am not sure how many other people out there will find it useful, but my colleagues and I have used it internally for quite a while now and I am reasonably satisfied with it – so I thought I might as well share it. A quick search in the general registry hasn’t turned up any obviously similar packages.

The bigger reason why I’m hoping to register this package and why I’m announcing it here is to learn more about the whole registration process in the general registry and about developing software in the open-source community. I’d be very happy about any kind of feedback regarding the package itself and the whole process around it.


This is the current README:

ParameterSweeps

A small package defining Sweep types to conveniently construct lists (or “sweeps”) of parameter-value combinations, e.g., needed for simulations.

Usage

Simple sweeps

The main type of object to be used from this package is Sweep, which pairs a parameter name with a collection of values, essentially representing a table:

julia> using ParameterSweeps

julia> s1 = Sweep("x1", 1.0:0.1:1.5)
| task id |  x1 |
| -------:| ---:|
|       1 | 1.0 |
|       2 | 1.1 |
|       3 | 1.2 |
|       4 | 1.3 |
|       5 | 1.4 |
|       6 | 1.5 |

The printed table gives an overview of the sweep, which can be useful once the sweeps get significantly larger.
The column name “task id” has no special meaning, but its numbers act as indices for retrieving a single entry:

julia> s1[3]
(x1 = 1.2,)

Most of the time, one will probably want to iterate over the whole sweep though:

julia> for parameters in s1
           println(parameters)
       end
(x1 = 1.0,)
(x1 = 1.1,)
(x1 = 1.2,)
(x1 = 1.3,)
(x1 = 1.4,)
(x1 = 1.5,)

Combined sweeps

For a single parameter, these features add little, but they become very useful for “combined” sweeps.
Using +, *, or ⊕ (type \oplus<TAB> in the REPL), sweeps can be combined into bigger tables.

  • + works analogously to zip for two collections
  • * works analogously to Iterators.product
  • ⊕ works analogously to vcat of tables
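
To make these analogies concrete, here is a minimal sketch that uses only Base on plain vectors of NamedTuples. It does not use ParameterSweeps and is not how the package is implemented; the variable names (xs, flags, prod_rows, ...) are made up for illustration.

```julia
# Base-only sketch: plain vectors of NamedTuples, not ParameterSweeps objects.
xs    = [(; x1 = v) for v in 1.0:0.5:2.0]        # 3 rows
flags = [(; flag1 = f) for f in (true, false)]   # 2 rows

# `*`-like: every combination; the first factor varies fastest,
# which is the iteration order of Iterators.product.
prod_rows = vec([merge(a, b) for (a, b) in Iterators.product(xs, flags)])

# `+`-like: pair rows up one-to-one, as zip does.
names = [(; filename = "output_$(i)") for i in eachindex(prod_rows)]
zip_rows = [merge(a, b) for (a, b) in zip(prod_rows, names)]

# `⊕`-like: append the rows of one table to another, as vcat does.
cat_rows = vcat(zip_rows, zip_rows)
```

With these inputs, prod_rows has 3 × 2 = 6 entries, zip_rows keeps 6 entries but gains a filename field, and cat_rows has 12.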

Here are some more examples:

using ParameterSweeps

s1 = Sweep("x1", 1.0:0.1:1.5)
s2 = Sweep("flag1", [true, false])

# Combine every value of `x1` with every value of `flag1`
s3 = s1 * s2
# | task id |  x1 | flag1 |
# | -------:| ---:| -----:|
# |       1 | 1.0 |  true |
# |       2 | 1.1 |  true |
# |       3 | 1.2 |  true |
# |       … |   … |     … |
# |      10 | 1.3 | false |
# |      11 | 1.4 | false |
# |      12 | 1.5 | false |

s4 = Sweep("filename", "output_$(i)" for i in eachindex(s3))

# Add another column
s5 = s3 + s4
# | task id |  x1 | flag1 |  filename |
# | -------:| ---:| -----:| ---------:|
# |       1 | 1.0 |  true |  output_1 |
# |       2 | 1.1 |  true |  output_2 |
# |       3 | 1.2 |  true |  output_3 |
# |       … |   … |     … |         … |
# |      10 | 1.3 | false | output_10 |
# |      11 | 1.4 | false | output_11 |
# |      12 | 1.5 | false | output_12 |

# Add more rows (just duplicate everything for this example)
s6 = s5 ⊕ s5
# | task id |  x1 | flag1 |  filename |
# | -------:| ---:| -----:| ---------:|
# |       1 | 1.0 |  true |  output_1 |
# |       2 | 1.1 |  true |  output_2 |
# |       3 | 1.2 |  true |  output_3 |
# |       … |   … |     … |         … |
# |      22 | 1.3 | false | output_10 |
# |      23 | 1.4 | false | output_11 |
# |      24 | 1.5 | false | output_12 |

# Get specific index of sweep
s6[21]
# (x1 = 1.2, flag1 = false, filename = "output_9")

# Or iterate
collect(s6)
# 24-element Vector{@NamedTuple{x1::Float64, flag1::Bool, filename::String}}:
#  (x1 = 1.0, flag1 = 1, filename = "output_1")
#  ...
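
Since each entry is a NamedTuple, one way to use it is to splat it into the keyword arguments of a function. The sketch below uses plain NamedTuples of the same shape as the entries above, so it runs without the package. run_simulation is a hypothetical stand-in, not part of ParameterSweeps.

```julia
# Hypothetical stand-in for a real simulation; name and signature are made up.
run_simulation(; x1, flag1, filename) = "$(filename): x1=$(x1), flag1=$(flag1)"

# Plain NamedTuples shaped like the entries of s5/s6 in the examples above.
rows = [
    (x1 = 1.0, flag1 = true,  filename = "output_1"),
    (x1 = 1.1, flag1 = false, filename = "output_2"),
]

# Splat each NamedTuple into keyword arguments.
results = [run_simulation(; row...) for row in rows]
```

With a real sweep, the same loop body would presumably be written as `for parameters in s6 ... end`, as in the iteration example further up.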

Serialization

The printed markdown table of a sweep is useful for interactive use, but not practical for saving sweeps to be read again programmatically later, e.g., for reproducibility.
A simple serialization/deserialization interface allows sweeps to be written to and read from disk in a format that’s both machine- and human-readable (TOML by default).

using ParameterSweeps

s = Sweep("x1", 1.0:0.1:1.5)  # any sweep works here

str = serialize_sweep(s)
newS = deserialize_sweep(str)

all(s .== newS) # true

By default, the serialization uses TOML (because it’s a standard library), but one can just as well use YAML or other serializers by converting to/from dicts first using the corresponding “lower-level” functions:

using ParameterSweeps: sweep_to_dict, sweep_from_dict

d = sweep_to_dict(s)

# (de)serialize the dict if necessary ...

newS = sweep_from_dict(d)

all(s .== newS) # true
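
For the “(de)serialize the dict” step, here is a rough sketch of a round trip through the TOML standard library. The column-wise dict layout is only an assumption for illustration; the structure that sweep_to_dict actually produces may look different.

```julia
using TOML  # standard library

# Illustrative column-wise layout (an assumption, not the package's actual format).
d = Dict(
    "x1" => [1.0, 1.1, 1.2],
    "flag1" => [true, false, true],
    "filename" => ["output_1", "output_2", "output_3"],
)

str = sprint(TOML.print, d)   # dict -> TOML text
d2 = TOML.parse(str)          # TOML text -> Dict{String, Any}

d2 == d                       # compare the round-tripped dict with the original
```

Any other serializer that can write and read a Dict (YAML, JSON, ...) could be swapped in at the same two lines.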