Overriding Random.rand for vectorization

Ian_L · August 30, 2023, 4:04pm

Hello. I have followed the advice from Random Numbers · The Julia Language to create a custom sampler for my struct. This is a good first step, but I would like to modify the sampler a bit. Currently the sampler takes tree-like objects whose leaves denote random variables and produces dictionaries mapping these random variables to values. For example:

rand(t::SumSPE) -> Dict{Symbol, Float64}(:x=>1.0, :y=>2.0)

However, I often use rand to draw n dictionaries. Random.jl then produces a Vector{Dict{Symbol, Float64}}, which is expected but not exactly what I want. Since all the keys are fixed, I would love to get something of the form Dict{Symbol, Vector{Float64}} instead, i.e. a vectorized version of the dictionary. Is there an idiomatic way of doing this?

Sukera · August 30, 2023, 4:23pm

Not sure I follow exactly, but rand doesn’t produce objects other than vectors in regular use of the API. Could you share an example of what you’re doing now, so it’s easier to follow?

mikmoore · August 30, 2023, 4:38pm

If I understand your intent correctly, then such an output would not be inter-operable with the use of Base.rand in other contexts, at which point there’s very little reason to use Base.rand for this. You might consider just making a new function with your desired functionality and using that instead.

Ian_L · August 30, 2023, 5:05pm

Not sure I follow exactly, but rand doesn’t produce objects other than vectors in regular use of the API. Could you share an example of what you’re doing now, so it’s easier to follow?

Here is an example. A ContinuousLeaf performs some minor computation, but at the very end it samples from a distribution from Distributions.jl (e.g. Normal).

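using Random, Distributions

# Declare the sample type so that rand(leaf, n) returns a concretely typed Vector.
Base.eltype(::Type{<:ContinuousLeaf}) = Dict{Symbol,Float64}

# Scalar draw: one sample from the leaf's distribution, keyed by the leaf's symbol.
function Random.rand(rng::AbstractRNG, d::Random.SamplerTrivial{T}) where {T<:ContinuousLeaf}
    leaf = d[]
    Dict(symbol(leaf) => rand(rng, leaf.dist))
end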

leaf = ContinuousLeaf(:x, Normal(0,1), ...)
rand(leaf)
# Dict{Symbol, Float64}(:x=>0.0)
rand(leaf, 2)
# Vector{Dict{Symbol, Float64}}[Dict{Symbol, Float64}(:x=>0.0), Dict{Symbol, Float64}(:x=>1.0)]

The second call to rand produces a vector of dictionaries. Ideally, the call rand(leaf, n) would instead return a struct-of-arrays representation. Ex:

Dict{Symbol, Vector{Float64}}(:x=>[0.0, 1.0])
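Editor's note: the struct-of-arrays layout described here can also be obtained after the fact from the vector that rand(leaf, n) already returns. The following is only a minimal sketch; the name to_columns is hypothetical and not part of Random or Distributions, and it assumes every sample carries the same keys.

```julia
# Hypothetical helper: turn a vector of per-sample Dicts into one Dict of columns.
# Assumes every sample has the same set of keys.
function to_columns(samples::AbstractVector{Dict{Symbol,Float64}})
    columns = Dict{Symbol,Vector{Float64}}()
    for sample in samples
        for (key, value) in sample
            # Create the column on first sight of the key, then append to it.
            push!(get!(() -> Float64[], columns, key), value)
        end
    end
    return columns
end

samples = [Dict(:x => 0.0, :y => 2.0), Dict(:x => 1.0, :y => 3.0)]
cols = to_columns(samples)
# cols[:x] == [0.0, 1.0] and cols[:y] == [2.0, 3.0]
```

This leaves Base.rand untouched, at the cost of allocating the intermediate vector of dictionaries.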

There is a subsection on scalar vs array generation: Random Numbers · The Julia Language. Do you think it could be useful?

If I understand your intent correctly, then such an output would not be inter-operable with the use of Base.rand in other contexts, at which point there’s very little reason to use Base.rand for this. You might consider just making a new function with your desired functionality and using that instead.

Yeah, that might be best. Although, how badly would interoperability break if I did override rand for these specific structs and only made such calls inside my module? It doesn’t seem like a huge problem, right?

Sukera · August 30, 2023, 5:31pm

Breaking expectations of widely used library functions is rarely, if ever, a good idea.

mikmoore · August 30, 2023, 6:02pm

This overload is largely harmless but also largely useless.

You’re defining this for your own sampler type, so this is not an instance of type piracy. I.e., nothing that anybody has written without knowledge or use of your code will break. That’s why it’s harmless.

However, this has “violated” the output interface that every “normal” use of Base.rand has assumed, so you should not expect to be able to pass this sampler to other code that has a “normal” use and have it work properly (if at all). For example, you can’t pass this particular sampler to, e.g., some randomized linear algebra routine that attempts to call Base.rand(rng, sampler, M, N) (forgive me if the syntax there is slightly wrong – I haven’t experimented with custom samplers) to produce a random Matrix that it then uses for some calculation.
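Editor's note: to make the "normal" output interface concrete, here is a small sketch of what the standard array methods return for a sampler that follows the documented pattern. ToyLeaf is a hypothetical stand-in for ContinuousLeaf that uses randn instead of a Distributions.jl distribution, so that the example only needs the Random standard library.

```julia
using Random

# Hypothetical stand-in for ContinuousLeaf: a named normal random variable.
struct ToyLeaf
    name::Symbol
    mean::Float64
    std::Float64
end

# Element type that the array methods of rand use to allocate their output.
Base.eltype(::Type{ToyLeaf}) = Dict{Symbol,Float64}

# Scalar draw, following the SamplerTrivial pattern from the Random docs.
function Random.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{ToyLeaf})
    leaf = sp[]
    return Dict(leaf.name => leaf.mean + leaf.std * randn(rng))
end

rng = MersenneTwister(1)
leaf = ToyLeaf(:x, 0.0, 1.0)

one_draw = rand(rng, leaf)     # a single Dict{Symbol,Float64}
draws = rand(rng, leaf, 2, 3)  # a 2×3 Matrix whose elements are such Dicts
```

Generic code that calls rand(rng, sampler, M, N) relies on getting an M×N array of eltype values back, which is the expectation a Dict-of-arrays return value would break.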

The main reason to add methods to existing functions is to orthogonalize algorithms from data types. I.e., the same calculation can be used to compute the exp of a Float32 or a square Matrix{T} for a wide range of suitable T (although in practice there are different optimizations and tradeoffs made, which is why we define specializations for both): all that is required is that + and * have suitable and “equivalent” definitions for either type.
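Editor's note: as an illustration of that point, the sketch below evaluates a truncated Taylor series for exp using only one, +, * and division by an integer, so the same function body works for a Float32 and for a square matrix. The name taylor_exp is hypothetical, and this is not how Base or LinearAlgebra actually implement exp.

```julia
# Truncated Taylor series: exp(x) ≈ sum of x^k / k! for k = 0..terms.
# Works for any x where one(x), +, * and division by an integer are defined.
function taylor_exp(x; terms::Integer = 20)
    term = one(x)    # 1.0f0 for a Float32, the identity matrix for a square Matrix
    total = one(x)
    for k in 1:terms
        term = term * x / k   # x^k / k!, built up incrementally
        total = total + term
    end
    return total
end

scalar_result = taylor_exp(1.0f0)     # approximately exp(1.0f0)
A = [0.0 1.0; 0.0 0.0]                # nilpotent matrix: A * A is zero
matrix_result = taylor_exp(A)         # identity plus A
```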

But since the array output of your proposed sampler is nonstandard, it’s not an “equivalent” use. The assumptions other code has made about the output of Base.rand (i.e., that it can be used to produce an Array of values but not a Dict of Arrays) will be violated, and whatever that code does next with the result is unlikely to work.

That is why I say there isn’t a benefit to specializing Base.rand for this: existing uses of Base.rand should not be expected to work with it. It is also why I call such an overload largely “useless” and suggest a different function altogether. With a different function, you’ll be less likely to confuse yourself (or other people) into thinking that this would work with standard Base.rand uses.

Since you say you’re only planning to call this within your module, there isn’t an operational reason to use Base.rand over some new function. If it’s still useful to define the sampler this way because the other machinery in the RNG interface saves you from rewriting a bunch of extra boilerplate, go ahead and save yourself the trouble. But if you need to re-implement almost everything anyway, then Base.rand isn’t doing you any favors and risks semantic confusion.
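Editor's note: one way to follow this advice is to keep the scalar rand method as it is and put the vectorized behaviour into a separately named function that reuses the sampler machinery. This is a hedged sketch only: sample_columns and ToyLeaf are hypothetical names, ToyLeaf stands in for ContinuousLeaf (using randn so that only the Random standard library is needed), and the code assumes every draw yields the same keys.

```julia
using Random

# Hypothetical stand-in for ContinuousLeaf.
struct ToyLeaf
    name::Symbol
    mean::Float64
    std::Float64
end

Base.eltype(::Type{ToyLeaf}) = Dict{Symbol,Float64}

# Standard scalar draw; Base.rand keeps its usual meaning.
function Random.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{ToyLeaf})
    leaf = sp[]
    return Dict(leaf.name => leaf.mean + leaf.std * randn(rng))
end

# Hypothetical new function: n draws stored column-wise.
# Assumes the key set is identical for every draw.
function sample_columns(rng::AbstractRNG, leaf, n::Integer)
    sp = Random.Sampler(rng, leaf, Val(Inf))  # build the sampler once, reuse it
    columns = Dict{Symbol,Vector{Float64}}()
    for i in 1:n
        for (key, value) in rand(rng, sp)
            # Allocate each column at full length the first time its key appears.
            column = get!(() -> Vector{Float64}(undef, n), columns, key)
            column[i] = value
        end
    end
    return columns
end

# Convenience method using the default RNG.
sample_columns(leaf, n::Integer) = sample_columns(Random.default_rng(), leaf, n)

cols = sample_columns(MersenneTwister(42), ToyLeaf(:x, 0.0, 1.0), 5)
```

Because the function has its own name, rand(leaf, n) still returns a Vector of dictionaries, and callers cannot mistake the column-wise result for standard Base.rand output.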
