Is there an idiom for "do this N times"?

I am performing some simulations where a statistical model is used first to simulate a new response then to fit the response then to extract quantities of interest. The current code is

function parametricbootstrap(N::Integer, m::LinearMixedModel{T},
        rng::AbstractRNG=Random.GLOBAL_RNG, props=(:objective, :σ, :β, :θ),
        β::Vector{T} = m.β, σ::T = m.σ, θ::Vector{T} = m.θ) where {T}
    y₀ = copy(response(m))          # to restore original state of m
    n, p, q, nre = size(m)
    length(β) == p && length(θ) == (k = length(getθ(m))) || throw(DimensionMismatch(""))
    baseval = getproperty.(Ref(m), props)
    ptype = typeof(baseval)
    val = [NamedTuple{props, ptype}(
        getproperty.(Ref(refit!(simulate!(rng, m, β=β, σ=σ, θ=θ))), props)) for _ in Base.OneTo(N)]
    refit!(m, y₀)                   # restore original state
    val
end

is for _ in Base.OneTo(N) a good way of phrasing “do this N times” or are there better alternatives?

I don’t know if this is necessarily better but I like to use ntuple(_ -> f(x), N) for this

1 Like

Note that the ntulple solution will be type-unstable (and therefore potentially slower) if the value of N is not known to the compiler. So if N is a literal or a constant, it’s probably ok, but if N is a variable then the OneTo(N) approach is better.

3 Likes

I think it’s fine, but for long expressions like this, I find that having 1:N at the very end reduces readability – it’s off the screen in your code above for example.

There are several other ways of doing this, one idiom that I often use is:

val = 1:N .|>_-> NamedTuple{props, ptype}(...)

However, for readers not familiar with the broadcasted pipe, this probably looks confusing.

2 Likes

Don’t bother with OneTo in this case; using 1:N is surely going to give you exactly the same performance.

There is also the foreach function which you can use if you want to discard your results.

Since you’re already broadcasting in this case, you may be able to structure your broadcast such that it does the repetition for you — possibly using Iterators.repeated or a FillArray to define the resulting shape. I haven’t totally deciphered your code to see how/if that’s possible in this case, though.

2 Likes

The usage of the comprehension syntax looks fine to me. But if you want to try something else:

val = map(1:N) do _
    NamedTuple{props, ptype}(...)
end

In many cases, this could be a great solution. But I think this is not good for the case here because the computation involves side-effect (presumably due to rng). I suppose it’s better to use broadcasting only for “pure” functions?

1 Like

I didn’t know this. Thank you!

I love writing things like A .= rand.(). While we don’t specify the order of execution in a broadcast, we do say that if the rand.() call is dotted then it appears in the “kernel” and will be executed every loop.

3 Likes

I typically use comprehension as well but what’s the difference between 1:N and Base.OneTo(N)?

Base.OneTo(N) statically encodes the fact that it’s starting at 1 within the type itself. This was needed to eek the very last bit of performance out of array iteration and axes computations and it now distinguishes one-based arrays from offset arrays.

We now have very powerful constant propagation so unless you’re actually dealing with array axes, you don’t need to use Base.OneTo in general code.

2 Likes

Thanks to all. I ended up using a thunk as suggested by @tkf and it works fine.

"""
    parametricbootstrap(rng::AbstractRNG, nsamp::Integer, m::LinearMixedModel,
        properties=(:objective, :σ, :β, :θ); β = m.β, σ = m.σ, θ = m.θ)

Perform `nsamp` parametric bootstrap replication fits of `m`, returning a 
`Vector{NamedTuple}` (a.k.a. `Tables.RowTable`) of `properties` of the refit model.

# Named Arguments

`β`, `σ`, and `θ` are the values of the parameters of `m` used to simulate the responses.
"""
function parametricbootstrap(rng::AbstractRNG, nsamp::Integer, m::LinearMixedModel,
        properties=(:objective, :σ, :β, :θ); β = m.β, σ = m.σ, θ = m.θ)
    y₀ = copy(response(m))          # to restore original state of m
    try
        map(1:nsamp) do _
            refit!(simulate!(rng, m, β=β, σ=σ, θ=θ)) # simulate a response and refit model
            NamedTuple{properties, typeof(getproperty.(Ref(m), properties))}(
                getproperty.(Ref(m), properties))    # extract properties as a named tuple
        end
    finally
        refit!(m, y₀)               # restore original state of m
    end
end

There are many ways to write this but I personally use for _ = 1:n; ...; end since it’s considerably easier on the compiler than idioms that use closures.

3 Likes

Okay, it is cleaner to write a plain old for loop.

function pbootstrap(rng::AbstractRNG, nsamp::Integer, m::LinearMixedModel,
        properties=(:objective, :σ, :β, :θ); β = m.β, σ = m.σ, θ = m.θ)
    y₀ = copy(response(m))  # to restore original state of m
    vtype = NamedTuple{properties, typeof(getproperty.(Ref(m), properties))}
    value = Vector{vtype}(undef, nsamp)
    @inbounds for i in 1:nsamp
        refit!(simulate!(rng, m, β=β, σ=σ, θ=θ))
        value[i] = vtype(getproperty.(Ref(m), properties))
    end
    refit!(m, y₀)           # restore original state of m
    value
end

I want you to know that my barista asked me what I was up to today and I proudly told her I would be writing a thunk. Now I have to go back and tell her that I ended up with a sorry old “for loop”. My reputation is going to take a hit.

4 Likes

Not sure you need the explicit vtype here, as the setindex! should take care of the conversion.

:wink:

Without that explicit vtype I got

ERROR: MethodError: Cannot `convert` an object of type Tuple{Float64,Float64,Array{Float64,1},Array{Float64,1}} to an object of type NamedTuple{(:objective, :σ, :β, :θ),Tuple{Float64,Float64,Array{Float64,1},Array{Float64,1}}}

in v1.1

Yeah, adding names to a tuple isn’t a conversion… but you can add names like that with the constructor. There are cases where conversion is the same as construction, but this isn’t one.

A post was split to a new topic: Is there an idiom for constructing a heterogeneous tuple?

There is an R function replicate that evaluates an expression n times collecting the results, usually in an array. It is specifically designed for simulations.

That is what I had in mind when I started on this but I think that the looping may be more natural. In R, of course, one tries to avoid explicit loops but in Julia there is no need to “fear the loop”.

Sorry, I missed the wrapping.

Perhaps you could replace

vtype = NamedTuple{properties, typeof(getproperty.(Ref(m), properties))}
value = Vector{vtype}(undef, nsamp)
for i in 1:nsamp
    refit!(simulate!(rng, m, β=β, σ=σ, θ=θ))
    value[i] = vtype(getproperty.(Ref(m), properties))
end

with something like

value = [(refit!(simulate!(rng, m, β=β, σ=σ, θ=θ));
          NamedTuple{properties}(getproperty.(Ref(m), properties)))
         for _ in 1:nsamp]

I would also go with a higher order function here, and not for loop. There are a lot of situations where I think a for loop enhances readability over a functional approach, but this is not one of them. To be specific, I don’t like that you have to specify the type and size of the output vector twice in this code. IMO it leads to a slightly higher maintenance cost and risk of bugs.

2 Likes