Hi, my problem is simple: I am currently playing around with Multi-Armed Bandits, and long story short, in every iteration step I need to sample 10 i.i.d. random variables (one for each arm).

I noticed that repeatedly filling a small vector takes significantly longer than filling one long array of the same total size:

```
using BenchmarkTools, Random

a = zeros(10_000)
@btime randn!($a)                          # 14.500 μs

b = zeros(10)
foo(b) = for _ in 1:1000 randn!(b) end     # pass b as an argument, not a global
@btime foo($b)                             # 53.125 μs
```
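For scale: one way to get the bulk-sampling speed without any custom type is to pre-draw a whole block and hand out zero-copy column views. A minimal sketch (`buf` and `step1` are names I made up); the obvious downside is that the number of iteration steps must be known in advance:

```
using Random

# Draw all 10 × 1000 samples in one bulk call.
buf = Matrix{Float64}(undef, 10, 1000)   # one column per iteration step
randn!(buf)                              # single bulk call fills everything

step1 = @view buf[:, 1]                  # 10 draws for step 1, no copy
```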

My (uneducated) guess is that this results from SIMD/AVX optimizations. I would therefore like some kind of workaround/sampler/generator that lets me call `rand!` just as before but internally samples bigger vectors. My first naive attempt looks like this:

```
using Distributions
import Random: rand!  # extend Random.rand! instead of shadowing it

struct RandomGenerator{S,T<:AbstractVector}
    state::Base.RefValue{Int}  # next unread index into the buffer
    count::Int                 # buffer length
    sampler::S
    preallocation::T
    RandomGenerator(sampler, count=256) = new{typeof(sampler),Vector{Float64}}(
        Ref(1), count, sampler, rand(sampler, count)
    )
end

function rand!(x::RandomGenerator, dest::AbstractVector)
    i = 1
    len = length(dest)
    while i <= len
        # copy as many buffered samples as fit into dest
        n = min(x.count - x.state[] + 1, len - i + 1)
        copyto!(dest, i, x.preallocation, x.state[], n)
        i += n
        x.state[] += n
        if x.state[] > x.count  # buffer exhausted: refill in bulk
            x.state[] = 1
            rand!(x.sampler, x.preallocation)
        end
    end
    return dest
end

gen = RandomGenerator(Normal())
results = zeros(10)
rand!(gen, results)
```

I am just wondering whether this can be done in a conceptually smarter or more idiomatic way (e.g. with an iterator interface).
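To make the iterator idea concrete, here is a self-contained sketch of what I have in mind (all names are made up, and I have not benchmarked it): an infinite iterator that refills a buffer in bulk and yields one sample at a time.

```
using Random

# Infinite iterator over standard normals, refilling its buffer in bulk.
struct BufferedNormals
    buf::Vector{Float64}
    pos::Base.RefValue{Int}   # index of the last yielded element
    BufferedNormals(count=256) = new(randn(count), Ref(0))
end

function Base.iterate(it::BufferedNormals, _=nothing)
    it.pos[] += 1
    if it.pos[] > length(it.buf)  # buffer exhausted: refill in one call
        randn!(it.buf)
        it.pos[] = 1
    end
    return (it.buf[it.pos[]], nothing)
end
Base.IteratorSize(::Type{BufferedNormals}) = Base.IsInfinite()
Base.eltype(::Type{BufferedNormals}) = Float64

it = BufferedNormals(16)
first10 = collect(Iterators.take(it, 10))
```

The per-element indexing might eat into the bulk-sampling gains, though, which is why I am asking for the idiomatic approach.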