Numbers sampled from exponential distribution seem to be incorrect

I am running a simulation that is supposed to sample data from an exponential distribution, but the results are always larger than the correct ones. I converted my Julia code to MATLAB, without changing any logic or algorithm, and it produced correct results.

I went through my code until I found something odd, similar to the line below. I wonder if this is a bug in Distributions.jl:

histogram(rand(Exponential(1e5),1000000),bins=256)

When running this line of code repeatedly in the REPL, I found that it produces two noticeably different histograms. The incorrect one is easier to see when both are combined into one plot (see below).

Both p (correct) and q (incorrect) are sampled this way and plotted using histogram():

julia> q = rand(Exponential(1e5),1000000) 

julia> histogram(q, bins=256, label="q") # after trying several times to find incorrect sample
...
julia> p = rand(Exponential(1e5),1000000)

julia> histogram!(p, bins=256, label="p")

Both p and q are 1000000-element vectors:

julia> q
1000000-element Vector{Float64}:
  36615.51140190222
      ⋮
  76827.76195613605

julia> p
1000000-element Vector{Float64}:
  95336.41621794412
      ⋮
 160348.4912588241

Here is the corresponding MATLAB code and result:

>> histogram(exprnd(1e5,1000000,1),256)

It is worth noting that the Julia samples seem to be too "stable" compared to MATLAB's. (I find this difficult to explain, but you might see what I mean after running the code in both Julia and MATLAB. Anyway, this is not the core question I want to ask.)

Julia version: v1.7.2
Package version: Distributions.jl v0.25.58


I think the different shapes are just due to the samples having different maximum values, and therefore different bin sizes. Try fixing the bin edges explicitly:

q = rand(Exponential(1e5),1000000) 
histogram(q, bins=range(0.0, stop = 1.5E6, length = 256), label="q")
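
To make the bin-size point concrete, here is a rough sketch that uses only Julia's standard library (randexp from Random) instead of Distributions.jl. The helpers naive_width and counts_on are hypothetical names introduced for illustration, and the "naive" width is only an approximation of how a plotting backend might pick bins from bins=256, not what Plots actually does internally:

```julia
using Random

θ = 1e5                      # scale (mean), as in Exponential(1e5)
n = 1_000_000
rng = MersenneTwister(1)     # arbitrary seed, only for reproducibility

# θ * Exp(1) is exponential with scale θ, so no extra package is needed.
q = θ .* randexp(rng, n)
p = θ .* randexp(rng, n)

# With an automatic bin count, the bin width follows the sample range,
# and the range is dominated by the (highly variable) sample maximum.
naive_width(x) = (maximum(x) - minimum(x)) / 256   # hypothetical helper
println("max(q) = ", maximum(q), "  naive bin width = ", naive_width(q))
println("max(p) = ", maximum(p), "  naive bin width = ", naive_width(p))

# Counting on shared, fixed edges makes the two samples directly comparable.
edges = range(0.0, stop = 1.5e6, length = 256)

function counts_on(edges, x)                       # hypothetical helper
    c = zeros(Int, length(edges) - 1)
    w = step(edges)
    for v in x
        i = floor(Int, (v - first(edges)) / w) + 1
        # Values beyond the last edge are simply ignored.
        1 <= i <= length(c) && (c[i] += 1)
    end
    return c
end

cq = counts_on(edges, q)
cp = counts_on(edges, p)
println("first-bin counts: ", cq[1], " vs ", cp[1])
```

On fixed edges the per-bin counts of the two samples should agree up to sampling noise, while the naive widths generally differ. A wider bin collects more points, so the same distribution can look taller or shorter when the bins are chosen per sample.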

Can't replicate.
The histograms of p and q look the same on my machine:


julia> using Distributions, Plots, Statistics

julia> q = rand(Exponential(1e5),1000000)
1000000-element Vector{Float64}:
  53746.7089587224
  25558.058373069543
  21390.660786270604
      ⋮
 194671.77955269578
   3261.280286248243

julia> histogram(q, bins=256, label="q") # after trying several times to find incorrect sample

julia> p = rand(Exponential(1e5),1000000)
1000000-element Vector{Float64}:
  86643.34310962744
  18451.649764193146
 229530.047033587
      ⋮
  39726.18702750079
  47451.27464180034

julia> histogram!(p, bins=256, label="p")

julia> mean(q)
99881.81845107871

julia> mean(p)
100103.12791672003

(@v1.7) pkg> status
Status ~/.julia/environments/v1.7/Project.toml
[024491cd] BetaML v0.6.0 ~/.julia/dev/BetaML
[a93c6f00] DataFrames v1.3.2
[31c24e10] Distributions v0.25.58
[e30172f5] Documenter v0.27.15 ~/.julia/dev/Documenter
[91a5bcdd] Plots v1.29.0
[295af30f] Revise v3.3.3
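
The means above can also be checked against theory without any plotting. The following is a minimal sketch, assuming only the standard library (Random and Statistics) and sampling via θ .* randexp(...) rather than Distributions.jl. It compares a few summary statistics with their theoretical values for an exponential with scale θ:

```julia
using Random, Statistics

θ = 1e5                      # scale (mean), as in Exponential(1e5)
rng = MersenneTwister(2)     # arbitrary seed, only for reproducibility
x = θ .* randexp(rng, 1_000_000)

# Theoretical values for an exponential with scale θ:
# mean = θ, standard deviation = θ, median = θ * log(2).
println("mean:   ", mean(x),   "  (expected ≈ ", θ, ")")
println("std:    ", std(x),    "  (expected ≈ ", θ, ")")
println("median: ", median(x), "  (expected ≈ ", θ * log(2), ")")

# The standard error of the mean is θ / sqrt(n), about 100 here.
se = θ / sqrt(length(x))
println("deviation of mean in standard errors: ", (mean(x) - θ) / se)
```

With a standard error of about 100, the two means reported above (99881.8 and 100103.1) are each roughly one standard error away from 1e5, which is what correct sampling would be expected to produce.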

Thank you, I understand now.

Thanks, I see the cause now.