How to allow users to safely change package "constants"?

I have a package MyPkg with some numbers that act as settings, which for the most part are “constants”. But I would like users to be able to specify some variations of these, so I cannot define them as const. I’m thinking I’ll use functions. So if I want users to be able to specify the speed of sound or the data-storage format for some use, I might do something like:

soundspeed() = 344
dataformat() = Float32

I pass these values to other functions mostly in their arguments, but may sometimes use them directly in the function body. Examples:

function f(x, speed = soundspeed())
	...
end
function g(x)
	container = dataformat()[]
	...
end

Then, users can do MyPkg.soundspeed() = 343 and continue to use the package.

Is this a good way to implement the flexibility I desire?

I considered using Refs, but I think I was getting into type-stability / type-inference issues. For example, if I want to allow dataformat Float32 and Float64, I’d need to create a dataformat = Ref{Type{<:AbstractFloat}}(Float32).

I considered creating a struct like Settings where I would store all my constants. This is useful because the constants are interdependent. For example, if the setting we_are_in_outer_space is true, then soundspeed must be 0. I could validate these interdependencies before returning a Settings object. The issue was that I still needed the fields of Settings to have abstract types. I think I can still do my validation checks with the function-based approach above: I just need to check against we_are_in_outer_space() inside soundspeed() before returning. With that approach, the user would also need to run a validation check after updating the settings.

My concern is that because of things like inlining and constant propagation (things I don’t understand), the settings might get hard-coded. I’ve done a few tests and it doesn’t seem to be happening, but I believe Julia has some leeway on when it decides to do that stuff, so I remain unsure.

I think if the user redefines these functions, other methods might get invalidated and need to be recompiled. That is okay, though if there is a way to ship the package with a few versions of these settings precompiled and cached, that would be nice.

Maybe ScopedValues is a useful idiom here?


This is OK for interactive use, but not much more than that. It’s type piracy.

The first thing to consider is whether the “package constants” can simply be taken from the user as the arguments of a function or constructor. This could even happen via the type domain, e.g., as the type parameter of Val.
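As a rough illustration of the type-domain suggestion, here is a minimal sketch. The function name make_container is hypothetical and not part of MyPkg; it only shows how a storage format could be passed as the type parameter of Val with a default.

```julia
# Hypothetical helper: the storage format travels in the type domain via Val,
# so the element type is known to the compiler inside the method body.
make_container(n::Integer, ::Val{T} = Val(Float32)) where {T<:AbstractFloat} = zeros(T, n)

a = make_container(3)                 # falls back to the default format, Float32
b = make_container(3, Val(Float64))   # the caller selects Float64 explicitly
```

The caller, not a global, decides the format here, so nothing needs to be redefined or recompiled when a different format is wanted.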

If this isn’t a viable solution, perhaps consider relying on Preferences.jl or scoped values.
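For the scoped-values option, a minimal sketch could look like the following. It assumes Julia 1.11 or newer, where Base.ScopedValues is available; the names SOUNDSPEED and travel_time are made up for illustration.

```julia
using Base.ScopedValues  # assumption: Julia 1.11 or newer

# Hypothetical package-level default; not taken from MyPkg.
const SOUNDSPEED = ScopedValue(344.0)

# Library functions read the current value with `[]`.
travel_time(distance) = distance / SOUNDSPEED[]

t_default = travel_time(344.0)        # uses the default of 344.0

# A caller overrides the value only for the dynamic extent of the `do` block.
t_scoped = with(SOUNDSPEED => 172.0) do
    travel_time(344.0)
end

t_after = travel_time(344.0)          # the default is in effect again
```

The override is visible to everything called inside the block, but it does not leak out, so two callers with different needs do not step on each other.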

The “hardcoding” will never cause correctness issues, if I get you right. You have to worry about invalidation and recompilation, though.


Yes, I am primarily concerned about correctness. If invalidation/recompilation is the worst that can happen, I’m fine with that for now. ScopedValue looks above my technical ability at the moment and would be a fun rabbit hole to go down. I’ll save that for later.

Thanks for your quick responses.


Just saw that @nsajko mentioned that MyPkg.soundspeed() = 343 is type piracy. I believe it is piracy against MyPkg and not Base. If that’s the case, this will be a documented and allowed type piracy.

Or you can just use a unified Options structure, like

@kwdef struct Options{T}
    soundspeed::T = 344.0
    lightspeed::T = 3e5
end

function f(x; options=Options())
    ...
end

and then the user can do:

f(x; options=Options(soundspeed=500.0))

to change some value.

Suppose several packages depend on MyPkg. How will you ensure only one of them overloads soundspeed?


Right, but in my situation, the fields of Options would not be concrete types. For example, Options would have a dataformat field which needs to be able to hold any of Float16, Float32, or Float64.

The other thing is that these are more “global” parameters in the sense that once I change a setting, it needs to apply as the default to all functions that use it. Imagine you have lots and lots of functions like f that use soundspeed. I don’t want the user to have to specify their own Options each time they call a function. I want them to set the speed of sound once, and all functions that use this parameter should update.

I did try using a struct like Options. As I mentioned above, if I could make it work, this would be a nice option, as it allows me to validate the interdependencies between parameters and stop the user from using a bad combination of default parameters (like not allowing anything other than 0 for soundspeed if the we_are_in_outer_space setting is true). This is the setup I tried:

struct Options
	soundspeed::Real
	dataformat::Type{<:AbstractFloat}
	...
	function Options(...)
		... # do your validation checks here and return an Options object
	end
end
Options() = ... # define the default set of settings
const OPTIONS = Ref(Options())
soundspeed() = OPTIONS[].soundspeed

Then, I would still define other functions as

f(x, speed = soundspeed())

And users would just need to update what OPTIONS holds, as in

MyPkg.OPTIONS[] = MyPkg.Options(; soundspeed = 343)

This works fine, except that Test.@inferred soundspeed() was failing, probably because it was using a Ref internally. Though this solves the issue with type piracy I think?
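A Ref on its own does not necessarily defeat inference; in the setup above, the abstractly typed fields (Real, Type{<:AbstractFloat}) are a likely cause of the failing @inferred. A minimal sketch, assuming the speed can always be stored as a Float64 (the names SOUNDSPEED_REF and set_soundspeed! are hypothetical):

```julia
using Test

# Assumption for illustration: the speed is always stored as a Float64,
# so the Ref has a concrete element type.
const SOUNDSPEED_REF = Ref{Float64}(344.0)
soundspeed() = SOUNDSPEED_REF[]

# Hypothetical setter that validates before storing the new value.
function set_soundspeed!(v::Real)
    v >= 0 || throw(ArgumentError("soundspeed must be non-negative"))
    SOUNDSPEED_REF[] = v      # converted to Float64 on assignment
    return soundspeed()
end

@inferred soundspeed()        # passes: the return type is inferred as Float64
set_soundspeed!(343)
```

No method is redefined when the setting changes, so this avoids both piracy and invalidation. A type-valued setting such as dataformat is harder, because the value itself is a type; this sketch does not cover that case.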

I would document it. Rest is up to users.

Maybe I’m missing your point?

Yes, I understand. This is a downside of the approach. The upside is that the user is less likely to be unaware of which settings are in effect at a function call. In the end, I decided on explicit option passing in my packages.

For that just use parametric types:

struct Options{T,DataFormat}
    soundspeed::T
    dataformat::DataFormat
end

(but maybe DataFormat is just what T is here?)

Yes, parametric types would work if I wanted to supply Options to functions individually. I don’t want that. And I don’t see if/how parametric types would help me let users declare global parameters/settings. If I were to put it in a Ref, the Ref would still need to hold Options and not Options{x, y, z}.

Imagine Alice writes a package Alef and overloads MyPkg.soundspeed one way. Then Bob writes a package Bet that overloads MyPkg.soundspeed another way. This is critical for how their packages work, the values cannot change.

Now, if I import Alef then Bet, I would get Bet’s value and Alef would break. If I import Bet then Alef, I would get Alef’s value and Bet would break. Type piracy is at best controllable through 1 level of dependency, more than that and things become unknowingly incompatible. Alice and Bob don’t know each other so they never coordinated to avoid incompatibility, and I the user suffer the broken code.

Okay, I just have to avoid importing those packages together, right? Easy to do on a project level, but what if my direct dependencies have those packages as dependencies? Do we keep a combinatorially exploding list of discovered incompatibilities and check Manifest.toml for any pair after I go through the trouble of installing? No, this is why type piracy is discouraged.
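The Alice-and-Bob scenario can be reproduced in a single file. In this sketch, submodules stand in for the separate packages, which is a simplification; the function names alef_speed and bet_speed are made up.

```julia
module MyPkg
soundspeed() = 344
end

module Alef                      # stands in for Alice's package
import ..MyPkg
MyPkg.soundspeed() = 343         # overwrites MyPkg's only method
alef_speed() = MyPkg.soundspeed()
end

module Bet                       # stands in for Bob's package
import ..MyPkg
MyPkg.soundspeed() = 300         # overwrites it again; the last definition wins
bet_speed() = MyPkg.soundspeed()
end

Alef.alef_speed()   # reflects Bet's value, not the 343 that Alef relied on
```

Swapping the order of the two modules flips which one silently gets the wrong value.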


Does Preferences.jl solve all these issues? Having the parameters stored as global variables in the package will always cause this kind of issue, whether or not piracy is involved. I do not understand, from the documentation of Preferences.jl, how it handles that, if it does.

I am not dismissing the consequences of type piracy. And I wouldn’t want you to spend your time addressing it in this thread unless you think it necessary. But if I’m going to ship this thing as an app and maybe even add a GUI, I am not really worried about people using this package alongside others.

That said, I believe my original approach resolves type-piracy concerns:

Options() = ... # define the default set of settings
const OPTIONS = Ref(Options())
soundspeed() = OPTIONS[].soundspeed

Where users would do the following to update settings:

MyPkg.OPTIONS[] = MyPkg.Options(; soundspeed = 343)

Please correct me if I’m wrong.

If this were an app, I wouldn’t expect any interactivity to involve a Julia REPL because importing a package is much easier if I already have a REPL open. I’m guessing you’re trying to say that users would press some buttons that run said code, not type the code themselves? In that case you can design the code how you like, though I would think OPTIONS.soundspeed = 343 is easier to type.

You should document in an introduction that the package is not supposed to be a dependency for other packages because of intentional type piracy and recompilation. That should be clear enough to stop anyone from misusing the package down the line.

I think you need to clarify whether your package is a library or an application, by which I mean:

  • A library is a package which you expect other packages to depend upon.
  • An application is a package which you don’t expect other packages to depend upon, for example you only expect the package to be used directly interactively (via a GUI or a notebook or the REPL or a REST API or whatever).

If it’s an application you can do what you like. If letting users redefine dataformat() works for you then go ahead. However you should probably check whether this is “premature optimisation”: do you really need these to be consts? Even paying the price of dynamic dispatch by making the return type setting some Ref{DataType} might not actually be a performance concern if the package is well written to propagate this type information, so that you only pay the dynamic dispatch price once.

If it’s a library, then making these sorts of global settings is generally regarded as bad software design practice. This is because it makes it harder for downstream packages to use your package and know what is going on. For instance if some downstream package assumes soundspeed == 344 but the user set soundspeed == 300 for some reason then there’s nothing the package can do about that other than override the user’s setting. It’s a great way to silently get incorrect results.

Instead it’s generally best to pass these settings explicitly, such as via an Options parameter suggested above. This lets downstream packages set settings, so they know what they’re getting. There’s no “spooky action at a distance” from some user settings.

Specifically in the case of passing a type parameter to control return types, for most functions you can usually just rely on inferring the return type from the types of the arguments. If the user passes Float32s then return Float32s etc. Use promotion to handle mixed inputs. This usually means the user only needs to select what precision they want right at the start when setting up their data, and the rest is inferred.
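A small sketch of that idea, with a made-up function name (propagation_time); the output precision simply follows the inputs, and promote handles mixed arguments:

```julia
# Hypothetical function: the result precision follows the arguments,
# so no global "dataformat" setting is needed.
function propagation_time(distance::Real, speed::Real)
    d, s = promote(float(distance), float(speed))  # common floating-point type
    return d / s
end

t32 = propagation_time(688f0, 344f0)   # Float32 in, Float32 out
t64 = propagation_time(688f0, 344.0)   # mixed inputs promote to Float64
```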


Maybe there’s a middle ground, where “library-style” authors are advised to never change the default global definitions and pass everything as a parameter, but a “user” can modify the global settings:

julia> global_options = Dict(:soundspeed => 334.0, :lightspeed => 3.0e5)
Dict{Symbol, Float64} with 2 entries:
  :lightspeed => 300000.0
  :soundspeed => 334.0

julia> @kwdef struct Options{T}
           soundspeed::T = global_options[:soundspeed] 
           lightspeed::T = global_options[:lightspeed]
       end

julia> f(; options=Options()) = options.soundspeed, options.lightspeed
f (generic function with 1 method)

julia> f()
(334.0, 300000.0)

julia> global_options[:soundspeed] = 200.0
200.0

julia> f()
(200.0, 300000.0)

julia> f(; options=Options(soundspeed=100.0))
(100.0, 300000.0)

(I’m not sure though if the Dict and the struct aren’t just redundant in this case, and whether the Dict could be passed directly to the function instead.)

I think dictionaries are just another layer on top of the Settings object that holds the information. And users can add keys to the dictionary that may never get used.

I don’t know that it is as much an issue of how to store the information as it is of type inference and constant propagation.

struct Settings
    datatype::Type{<:AbstractFloat}
    function Settings(datatype::Type{<:AbstractFloat})
    	@assert datatype in [Float64, Float32, Float16]

    	new(datatype)
    end
end
Settings() = Settings(Float64)
const settings = Ref(Settings())
datatype_unstable() = settings[].datatype
datatype_stable() = Settings().datatype
@code_typed datatype_stable()    # we get type inference and constant propagation
# CodeInfo(
# 1 ─     return Float64
# ) => Type{Float64}
@code_typed datatype_unstable()  # we lose type inference and constant propagation
# CodeInfo(
# 1 ─ %1 = Main.settings::Base.RefValue{Settings}
# │   %2 = Base.getfield(%1, :x)::Settings
# │   %3 = Base.getfield(%2, :datatype)::Type{<:AbstractFloat}
# └──      return %3
# ) => Type{<:AbstractFloat}

In the datatype_unstable implementation, if I allow the user to just change what is stored in the settings Ref, we can be sure that the settings are valid because the Settings constructor does the validation (for example, they cannot pass a BigFloat). But we lose type inference and constant propagation.

The datatype_stable() implementation offers type stability and constant propagation. But if I allow users to edit the function directly, they could do datatype_stable() = BigFloat. That would require some other validate_constants function to be run manually. And it’s also type piracy, as was pointed out above.

Is it possible that, even if there are inference issues somewhere, the performance concerns are minuscule anyway? I can’t imagine this impacting speed so much if you were careful to use e.g. function barriers.
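To make the function-barrier idea concrete, here is a minimal sketch under some assumptions: the format lives in a Ref{DataType}, and the names DATAFORMAT, set_dataformat!, fill_squares, and squares are all hypothetical.

```julia
# Assumption: the format is stored in a Ref with a concrete element type, DataType.
const DATAFORMAT = Ref{DataType}(Float32)

# Hypothetical validated setter; rejects formats the package does not support.
function set_dataformat!(T::DataType)
    T in (Float16, Float32, Float64) || throw(ArgumentError("unsupported format: $T"))
    DATAFORMAT[] = T
    return T
end

# Inner kernel: T is a static parameter here, so this body is compiled
# separately, and type-stably, for each format.
function fill_squares(::Type{T}, n::Integer) where {T<:AbstractFloat}
    out = Vector{T}(undef, n)
    for i in 1:n
        out[i] = T(i)^2
    end
    return out
end

# Outer function: reads the global setting once (one dynamic dispatch),
# then hands off to the kernel, which acts as the function barrier.
squares(n::Integer) = fill_squares(DATAFORMAT[], n)
```

The setter keeps validation in one place, and the only type-unstable step is the single call from squares into fill_squares, so the loop itself runs on concrete types.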

It is possible indeed. I haven’t looked into performance too much in my own project beyond trying to follow the basic guidelines. But this was looking like an issue where you can have either stability or flexibility but not both. I was just trying to think through that.