`rand(::MyType, N)` allocates a lot, how to define return type properly?

Apologies if this has been discussed before but I am a bit lost in the docs and examples out there regarding a type-safe implementation of a proper rand interface. I guess it’s OK to have yet another another topic on this :wink:

We are

using Random

I simplified my use-case a lot. Let’s say we have a parametric type which can act as a random number generator by implementing (for the sake of simplicity, let’s have these two methods)

struct Foo{T<:AbstractFloat}
   x::T
end

(foo::Foo)() = rand() * foo.x
(foo::Foo)(rng::AbstractRNG) = rand(rng) * foo.x

It works nicely:

julia> foo = Foo(23.5)
Foo{Float64}(23.5)

julia> foo()
14.45342364543

But still does not work within the rand universe:

julia> rand(foo)
ERROR: MethodError: no method matching Random.Sampler(::Type{TaskLocalRNG}, ::Random.SamplerTrivial{Foo{Float64}, Any}, ::Val{1})
This error has been manually thrown, explicitly, so the method may exist but be intentionally marked as unimplemented.

To get this working, the simplest possible way is probably this method:

Random.rand(rng::AbstractRNG, foo::Random.SamplerTrivial{Foo{T}}) where {T} = foo[](rng)

which now allows

julia> rand(foo)
4.275336808366797

julia> rand(foo, 5)
5-element Vector{Any}:
 20.016527280057733
 23.000581372857162
 12.988361756507565
  2.0685534186007852
  2.333543063789338

Long story short, it allocates a lot because rand cannot figure out the eltype of my sampler, therefore we get a Vector{Any} although foo() always returns a Float64. In my use case, it’s of a parametric type btw.

Benchmarking, just for the numbers:

julia> @benchmark rand(foo, 1_000_000)
BenchmarkTools.Trial: 781 samples with 1 evaluation per sample.
 Range (min … max):  4.475 ms … 64.231 ms  β”Š GC (min … max):  0.00% … 92.56%
 Time  (median):     4.977 ms              β”Š GC (median):     0.00%
 Time  (mean Β± Οƒ):   6.378 ms Β±  3.206 ms  β”Š GC (mean Β± Οƒ):  22.94% Β± 21.63%

  β–β–ˆβ–‡β–‚β–β–‚β–„β–‚    β–‚β–ƒ                                              
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–…β–†β–ˆβ–ˆβ–ˆβ–ˆβ–…β–‡β–ˆβ–ˆβ–‡β–‡β–†β–‡β–…β–‡β–…β–‡β–†β–†β–‡β–‡β–ˆβ–†β–†β–‡β–ˆβ–‡β–…β–†β–…β–…β–†β–‡β–…β–†β–β–‡β–β–„β–†β–†β–β–†β–†β–„β–…β–„ β–‡
  4.48 ms      Histogram: log(frequency) by time     14.3 ms <

 Memory estimate: 22.92 MiB, allocs estimate: 1000003.

I went through Random Numbers Β· The Julia Language and tons of trials and errors (even managed to crash Julia a few times) but could not get it right.

At some point I thought I understood everything and this should work, but it does not :laughing: It still gives me Vector{Any}

julia> Random.Sampler(::Type{<:AbstractRNG}, foo::Foo, ::Val{1}) =
           Random.SamplerTrivial(foo, Float64)

julia> Random.rand(rng::AbstractRNG, s::Random.SamplerTrivial{Foo,Float64}) = s[]().x

julia> rand(foo, 5)
5-element Vector{Any}:
  9.614490452513847
 13.571465906535863
  8.647284173324959
 14.56590026126972
 15.010035608615501

I even tried with Val{Inf} but no success

julia> Random.Sampler(::Type{<:AbstractRNG}, foo::Foo, ::Val{Inf}) =
           Random.SamplerTrivial(foo, Float64)

julia> rand(foo, 5)
ERROR: MethodError: no method matching Random.SamplerTrivial(::Foo{Float64}, ::Type{Float64})
The type `Random.SamplerTrivial` exists, but no method is defined for this combination of argument types when trying to construct it.

What is the currently recommended way to define this correctly? The Random-docs mention here and there that some interfaces are not stable so I don’t want to mess around too much. Btw. my sampler return type is a parametric one, but once I am on the right track, it should be easy I think :wink:

Here is a workaround (3 allocs for 1000000 samples) but of course it’s not as nice as being embedded into the rand universe with all the nice features:

julia> function (foo::Foo{T})(n::Integer) where T
          out = Vector{T}(undef, n)
          for idx in eachindex(out)
              out[idx] = rand(foo)
          end
          out
      end

julia> foo(5)
5-element Vector{Float64}:
 8.170854883533778
17.33952422420556
 4.586268393336077
16.44952665515737
20.909410249922733

julia> @benchmark foo(1_000_000)
BenchmarkTools.Trial: 2981 samples with 1 evaluation per sample.
Range (min … max):  1.404 ms …   3.581 ms  β”Š GC (min … max):  0.00% … 53.10%
Time  (median):     1.494 ms               β”Š GC (median):     0.00%
Time  (mean Β± Οƒ):   1.676 ms Β± 326.646 ΞΌs  β”Š GC (mean Β± Οƒ):  11.49% Β± 13.59%

 β–…β–‡β–ˆβ–ˆβ–‡β–…β–ƒβ–‚β–          ▄▅▅▄▂▁▁▁ ▁    ▂▂▁▁ ▁▁▃▂▁▁      ▁         β–‚
 β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–‡β–‡β–†β–†β–†β–„β–†β–„β–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–†β–‡β–†β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–†β–…β–β–…β–ˆβ–ˆβ–ˆβ–†β–†β–†β–„β–„β–β–† β–ˆ
 1.4 ms       Histogram: log(frequency) by time      2.69 ms <

Memory estimate: 7.66 MiB, allocs estimate: 3.

HI
About why Vector{Any} is returned described in Random Numbers Β· The Julia Language. To return Vector{Float64} needs additionally to define:

 Base.eltype(::Type{Foo}) = Float64

or maybe Foo’s T parameter type if not only Float64 required.
I’m not a big expert on the random samplers interface, so maybe someone can suggest a better option for cumstom number generator overall.

Also, it is better interpolate values inside benchmarks macros something like:

@benchmark rand($foo, $1_000_000)

to avoid false times and allocations not caused by the function.

4 Likes

Yes @xinady’s answer is correct, see Random Numbers Β· The Julia Language. I’m just adding a tiny precision: as calling your Foo object returns a float which is the result of multiplying T with Float64, perhaps you can use promote_type, like in

Base.eltype(::Type{Foo{T}}) where {T} = promote_type(T, Float64)
2 Likes

Damn, I overlooked the eltype line. Thanks! I will check that out. My case is a bit more complicated (multiple parameters).

Yes, eltype does the trick, very naturally :wink:

1 Like