How to generate reproducible random numbers across versions via package manifest?

When we want to reproduce results down to slightest difference -which sometimes is critical, the only guaranteed solution I am aware of is to containerize the whole environment; versioning Julia and its RNG’s might not be enough.

Exact binary reproducibility is usually impossible for any nontrivial calculation (results can change on a different machine, even if you are using the same version of Julia and all packages).

Fortunately, it is relatively rarely required since (as @Sukera suggested) if your results are very sensitive to a particular set of random draws, then reproducibility becomes the least of your worries.

Generally it is already great if

  1. your code remains runnable (and that’s a big challenge, a lot of environments become hard to reproduce in 5–10 years)
  2. running it gets you results reasonably close to what appeared in the article.

Unfortunately, in some fields vast majority of papers do not come even nearly close to this.

3 Likes

Sorry, my mistake. I misread the direction of the inequality in your post. Yes, it makes sense to me that one should not be able to instantiate a manifest created by a newer version in an older version. I thought you were saying the opposite.



Thanks, I think that makes sense now. So the purpose of [compat] is to not allow me to using a package unless I have the right versions? (Which is distinct from my intention here of ensuring the same packages are being loaded via ]instantiate.)



Here it is:

(TestingVersions) pkg> status
     Project TestingVersions v1.0.0
      Status `<redacted>/TestingVersions2/Project.toml`
  [a93c6f00] DataFrames v0.20.2


Ok, maybe exact is too extreme of a word to use here.
Let me give you a statistical example that demonstrates the type of reproducibility that I desire.
Here’s a small Monte Carlo simulation to compute the standard deviation of the sample mean of 100 draws from a standard normal random variable.
The standard asymptotic approximation says that it should be approximately .10 = 1/\sqrt{100}.

using Random
using Statistics

M = 2000
N = 100
results = fill(NaN, M)
for s in [1,2]
    Random.seed!(s)
    for m = 1:M
        results[m] = mean(randn(N))
    end
    println(std(results))
end

yields on Julia 1.6:

0.09878139709623658
0.09934571969978735

If I am reporting a table with precision up to the third or fourth digit, then the results “differ” depending on what the seed is. Yet that’s no reason to say that the theory is “very sensitive.” It’s an approximation. It’s used throughout statistics and science more generally. Nor does this have to do with exact binary reproducibility.

Is this important? It’s important to me because some of the journals I am submitting to require this type of reproducibility now. That is, they want the exact same tables/figures to be produced when the code is run. They don’t want to have to use their judgment on whether .0987 is “close” to .0993 or not. My understanding was that Julia’s environment system was designed to help guarantee this type of reproducibility.

Certainly, but most real-life code contains branches and other constructs which effectively make your outcome sensitive to last bits of floats in key places, which are bound to matter with enough samples and operations and lead to “large”, discontinuous changes in the result.

Even seemingly continuous special functions have branches in their implementation. That said, if you are lucky and can guarantee that this does not happen, then the kind of reproducibility you are aiming for is possible. But this is very rare in practice.

I see. So it sounds like the only way to ensure the type of reproducibility I am aiming for (again, not something I came up with, but requested by journals) is to indicate the specific version of Julia that the results were generated with.

1 Like

Technically for full reproducibile, you might want to specify the operating system. This generally isn’t needed, but there are some edge cases where it affects things (mainly with C libraries that think Int is 32 bit on windows)

1 Like

Right, good point. So I guess I should really specify the binary?
For example:
https://julialang-s3.julialang.org/bin/linux/x64/1.1/julia-1.1.1-linux-x86_64.tar.gz

Actually, that raises another question.
On the older versions download page (Julia Downloads (Old releases)) I only see 1.1.1, whereas I wrote my code in 1.1.0.
Is it safe to assume that any changes between 1.1.0 and 1.1.1 are sufficiently minor not to jeopardize the type of reproducibility I want?
(I guess I can always try it to be sure.)

I don’t think anything should change, but you definitely should double check. Also, I just remembered that cpu instruction set matters since functions like sum (or anything in BLAS) will do the operations in a different order to take advantage of avx instructions. Also, threading and using GPUs are probably impossible if you want exact reproducibility.