That’s just how it works, right? You are trading space for speed. The iterator necessarily calls rand() once per element to minimize allocation. You can’t avoid allocating and still get the batched rand(1000).
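To make the trade-off concrete, here is a minimal sketch of the two variants being compared. The function names `sum_lazy` and `sum_batched` are made up for illustration, and the exact allocation counts will depend on your Julia version.

```julia
using Random

# Lazy version: the generator calls rand() once per element and builds no array.
sum_lazy(n) = sum(rand() for _ in 1:n)

# Batched version: rand(n) allocates a Vector{Float64} of length n and fills it in one call.
sum_batched(n) = sum(rand(n))

# Warm up first so compilation is not counted in the measurement.
sum_lazy(1000); sum_batched(1000)

@show @allocated sum_lazy(1000)
@show @allocated sum_batched(1000)
```

The batched version has to pay for at least the 1000-element array, while the lazy version only ever holds one value at a time.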
The extent of this depends on which version you are on. In Julia 1.6, the discrepancy would mostly be fixed by passing an explicit rng to rand in the generator version. But your timings make me guess you are on 1.7. There I don’t think you can easily beat the array version, because it uses SIMD, unlike the scalar version. An idea I would want to explore in a package is wrapping Xoshiro in a way that even the scalar version of rand uses SIMD, by having an internal cache (like what MersenneTwister does).
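For reference, "passing an explicit rng" could look roughly like the sketch below. The function names are hypothetical, and whether this helps at all depends on the Julia version, as described above.

```julia
using Random

# Scalar calls that look up the default RNG implicitly on every call.
sum_implicit(n) = sum(rand() for _ in 1:n)

# Same loop, but the RNG is fetched once by the caller and passed to each rand call.
sum_explicit(rng, n) = sum(rand(rng) for _ in 1:n)

rng = Random.default_rng()
sum_explicit(rng, 1000)
```

Both versions draw from the same stream, so they produce the same values after the same seed; only the per-call lookup differs.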
EDIT: but to your question about having a rand iterator, you can check out Rand() from the RandomExtensions package. It won’t help performance in this instance, but as it handles the Sampler thing in the same way as in arrays, it can occasionally help speed things up where a “pre-computation” can be shared between multiple calls of rand.
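To illustrate the idea of sharing a “pre-computation”, here is a rough sketch of a hand-rolled iterator built on `Random.Sampler` from the standard library. The type `RandIter` is hypothetical and is not how RandomExtensions implements Rand(); it only shows the Sampler being built once and reused for every draw.

```julia
using Random

# Hypothetical type for illustration; not the RandomExtensions implementation.
struct RandIter{R<:AbstractRNG,S<:Random.Sampler}
    rng::R
    sp::S
end

# Build the Sampler once; Val(Inf) asks for one suited to many draws.
RandIter(rng::AbstractRNG, x) = RandIter(rng, Random.Sampler(rng, x, Val(Inf)))

# Infinite iterator: every step draws one value using the shared Sampler.
Base.iterate(it::RandIter, state=nothing) = (rand(it.rng, it.sp), nothing)
Base.IteratorSize(::Type{<:RandIter}) = Base.IsInfinite()
Base.eltype(::Type{RandIter{R,S}}) where {R,S} = eltype(S)

# Usage: take five draws from 1:10.
it = RandIter(Random.default_rng(), 1:10)
collect(Iterators.take(it, 5))
```

For something like a range, the Sampler holds the setup work for drawing from it, so the iterator avoids redoing that work on each call.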
I see that RandomExtensions is experimental, but it looks cool. I don’t know anything about pseudo-random number generation algorithms, so I wasn’t sure if there was some conceptual reason why one couldn’t make a random number iterator. Whether that iterator can take advantage of SIMD is a separate question, I suppose.