I need to write a single-argument function `f(r::Number)` that returns a different random number for different values of `r`, but always the same number for the same `r`. Same `r` is meant in the sense of `isequal` or `==` (it doesn’t matter much), so if `r` is a `Float64`, say, `f(r)` will be “discontinuous at every point”. I can imagine hacky ways that convert `r` to an integer based on its bit representation, and then use that as a seed, but that’s far from efficient (because building/updating a generator for each call is expensive). I cannot seem to find a more performant/elegant way. Any ideas?
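For reference, here is a sketch of the hacky baseline I mean (`f_naive` is my own placeholder name) — reinterpret the bits of `r` and build a brand-new generator on every call:

```
using Random

# The inefficient baseline: construct a fresh generator per call,
# seeded from the bit representation of r. Deterministic, but pays
# the full cost of generator construction each time.
f_naive(r::Float64) = randn(MersenneTwister(reinterpret(UInt64, r)))
```

This gives the behavior I want (same `r`, same number), just too slowly.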

You could go with a caching approach perhaps:

```
julia> let cache = Dict()
           global f(r::Number) = get!(randn, cache, r)
       end
f (generic function with 1 method)
julia> f(1)
-0.4507461071970271
julia> f(1)
-0.4507461071970271
julia> f(2.5)
0.7257163824509684
julia> f(2.5)
0.7257163824509684
```

What’s the application? One alternative would be a smoothed random function like a smooth random walk, which could be implemented in ApproxFun.jl similar to how it works in chebfun.

@mike, thanks, that’s a great idea. It actually works better than I expected, but I worry that the cache grows too much after using the function for a while. What I actually had in mind is more computational: mapping information about `r`, like its hash, to a computed random number. But I might eventually go with your approach.

@stevenj, this is for implementing a reproducible disorder realization on a lattice. The user designs a lattice and asks for a spatially uncorrelated potential to apply to it. If I understand correctly, you propose to somehow define some random spatial harmonics or coefficients and build the function by expanding it to a finite order in some function basis using those coefficients? In this sense I’d say @mike’s approach is better for my purposes, because you don’t need to worry about spatial correlations (due to too few harmonics), and the coefficients are analogous to his cache, but of fixed size. Or maybe I misunderstood?
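For concreteness, here is roughly the use case (a sketch with made-up names — `onsite_potential` and the disorder strength `W` are illustrative, not from any package; assumes Julia ≥ 1.7 for `Xoshiro`):

```
using Random

# Reproducible, spatially uncorrelated on-site disorder: seed a
# throwaway RNG from the hash of the site position, so each site
# always gets the same potential value, independent of call order.
function onsite_potential(position; W = 1.0)
    rng = Xoshiro(hash(position))
    return W * (rand(rng) - 0.5)   # uniform in [-W/2, W/2)
end
```

Same position always gives the same value, and different sites are uncorrelated to the extent the hash is good.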

Yeah, agreed, it’s unlikely to be sustainable for a long-running `f`.

Could combine the `hash`ing idea with an `rng` that you just reseed based on the hash of the number:

```
julia> using Random
julia> let rng = MersenneTwister()
           global f(r::Number) = (Random.seed!(rng, hash(r)); randn(rng))
       end
f (generic function with 1 method)
julia> f(1)
-0.7338017765779451
julia> f(1)
-0.7338017765779451
julia> f(1.1)
-1.1040149708608498
julia> f(1.1)
-1.1040149708608498
```

That’s just what I tried first. However, it is pretty slow:

```
julia> using BenchmarkTools

julia> @btime f(2.3)
10.648 μs (4 allocations: 152 bytes)
-0.5152781040212041
```

You may have more luck with some of the RNGs from packages. `RandomNumbers` has the following:

```
julia> using Random, RandomNumbers, BenchmarkTools

julia> let rng = RandomNumbers.Xorshifts.Xoroshiro128()
           global g(r::Number) = (Random.seed!(rng, hash(r)); randn(rng))
       end
g (generic function with 1 method)
julia> g(1)
1.1158401688820527
julia> g(1)
1.1158401688820527
julia> @benchmark f(1)
BenchmarkTools.Trial:
memory estimate: 152 bytes
allocs estimate: 4
--------------
minimum time: 9.860 μs (0.00% GC)
median time: 9.950 μs (0.00% GC)
mean time: 10.096 μs (0.00% GC)
maximum time: 26.180 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark g(1)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 7.708 ns (0.00% GC)
median time: 7.768 ns (0.00% GC)
mean time: 7.889 ns (0.00% GC)
maximum time: 24.985 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 999
```

Though I’m not too familiar with the particulars of that generator…

I just came up with this. Is `hash` statistically equivalent to `rand`?

```
julia> using BenchmarkTools

julia> f(r::Number) = hash(r)/typemax(UInt)
f (generic function with 1 method)

julia> @btime f(2.3)
0.033 ns (0 allocations: 0 bytes)
0.664561538429725
```

Oh! Your Xoroshiro128 is very nice! And you can use whatever distribution you need.

That timing means that your computation got constant-folded and no work was actually done at runtime. You can do:

```
julia> @btime f($(Ref(2.3))[])
3.355 ns (0 allocations: 0 bytes)
0.664561538429725
```

Also, `hash`es can have collisions.
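Note that some “collisions” are by design: Julia guarantees that `isequal(x, y)` implies `hash(x) == hash(y)`, which happens to match the “same `r` in the sense of `==`” requirement above. Accidental collisions (distinct values, equal hashes) are also possible in principle, since there are only 2^64 hash values. The by-design case is easy to check:

```
# isequal(1, 1.0) holds, so their hashes agree by Julia's hash
# contract — meaning any hash-based f returns the same "random"
# number for 1 and 1.0.
@assert isequal(1, 1.0)
@assert hash(1) == hash(1.0)
```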

Hi @kristoffer.carlsson, yes, I’m aware of the timing thing. About the collisions: is `hash()` any worse than `rand()` in regards to getting the same value twice? And do you know whether hashes are uniformly distributed over the whole `UInt` range? In any case, I really like @mike’s last solution, I think I’ll mark that as the answer.

It seems that, ideally, yes.

But: “Note that this criterion only requires the value to be *uniformly distributed*, not *random* in any sense. A good randomizing function is (barring computational efficiency concerns) generally a good choice as a hash function, but the converse need not be true.”
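A quick (non-rigorous) empirical check of that uniformity — map the hashes of consecutive integers into [0, 1) and look at the sample mean:

```
using Statistics

# Map hashes of many inputs into [0, 1) and compare the sample
# mean to 0.5 (a uniform distribution on [0, 1) has mean 0.5).
xs = [hash(i) / typemax(UInt) for i in 1:100_000]
mean(xs)   # close to 0.5 if the hashes are roughly uniform
```

This only probes the first moment, of course; it says nothing about independence or higher-order structure.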

Sounds like a hash function is what you need. There are lots around, so maybe the built-in is not perfect, but others should be good. I think a cryptographic hash function addresses your concern above:

“a small change to a message should change the hash value so extensively that a new hash value appears uncorrelated with the old hash value”

`hash` provides no statistical guarantees.

Yes, hashing a seed seems to be the way to go, especially as you can prescribe the marginal distribution (Uniform, Normal, other) this way and get the statistical properties. You can even split the hash into some bits for the seed and some bits to discard the first `n` samples of `rand` if you distrust the statistical properties of the hash.
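A sketch of that bit-splitting idea (`f_burnin` is my own naming; assumes Julia ≥ 1.7 for the built-in `Xoshiro` generator — substitute any cheap-to-seed RNG):

```
using Random

# Split the hash: the low 60 bits seed a throwaway RNG, and the
# top 4 bits choose how many initial samples (0–15) to discard
# as burn-in before drawing the actual value.
function f_burnin(r::Number)
    h = hash(r)
    rng = Xoshiro(h & 0x0fff_ffff_ffff_ffff)  # low bits as seed
    for _ in 1:(h >> 60)                      # top 4 bits: burn-in count
        rand(rng)
    end
    return randn(rng)
end
```

Same `r` always reproduces the same value, since both the seed and the burn-in count are functions of `hash(r)`.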