map with a zero-argument do-block

This section of the Julia manual says that

a plain do would declare that what follows is an anonymous function of the form () -> ...

So I would expect this to work:

julia> map(1:3) do
           rand(10:99)
       end
ERROR: MethodError: no method matching (::var"#21#22")(::Int64)
Closest candidates are:
  #21() at REPL[11]:2
Stacktrace:
 [1] iterate at ./generator.jl:47 [inlined]
 [2] _collect(::UnitRange{Int64}, ::Base.Generator{UnitRange{Int64},var"#21#22"}, ::Base.EltypeUnknown, ::Base.HasShape{1}) at ./array.jl:678
 [3] collect_similar(::UnitRange{Int64}, ::Base.Generator{UnitRange{Int64},var"#21#22"}) at ./array.jl:607
 [4] map(::Function, ::UnitRange{Int64}) at ./abstractarray.jl:2072
 [5] top-level scope at REPL[11]:1

But as you can see, it doesn’t work. Is this a bug, or is this the expected behavior?

Of course, if I use a do _ it works:

julia> map(1:3) do _
           rand(10:99)
       end
3-element Array{Int64,1}:
 85
 19
 94

The map function expects a function that takes a single parameter, corresponding to an element of the collection passed alongside it. If you pass it an anonymous function with no parameters, map throws a MethodError as soon as it tries to apply that function to an element of the collection.

Here’s a potential pseudo-implementation of map:

map(f::Function, xs) = [f(x) for x in xs]

Hence, f should be a lambda like (x) -> ..., or, if you prefer the do ... end syntax, the do-block needs an argument, just as you showed with _.

In conclusion, that is expected behavior, not a bug.
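
To make this concrete, here is a minimal sketch. The name mymap is hypothetical and only stands in for the pseudo-implementation above; it is not how Base actually implements map.

```julia
# Simplified stand-in for map (hypothetical name, not Base's implementation).
mymap(f, xs) = [f(x) for x in xs]

# A one-argument function that ignores its argument works.
ok = mymap(_ -> rand(10:99), 1:3)

# A zero-argument function fails as soon as it is called with an element.
err = try
    mymap(() -> rand(10:99), 1:3)
catch e
    e
end

println(length(ok))           # 3
println(err isa MethodError)  # true
```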

Ah, yes, I should have checked the equivalent code without a do block:

julia> map(() -> rand(10:99), 1:3)
ERROR: MethodError: no method matching (::var"#5#6")(::Int64)
Closest candidates are:
  #5() at REPL[1]:1
Stacktrace:
 [1] iterate at ./generator.jl:47 [inlined]
 [2] _collect(::UnitRange{Int64}, ::Base.Generator{UnitRange{Int64},var"#5#6"}, ::Base.EltypeUnknown, ::Base.HasShape{1}) at ./array.jl:678
 [3] collect_similar(::UnitRange{Int64}, ::Base.Generator{UnitRange{Int64},var"#5#6"}) at ./array.jl:607
 [4] map(::Function, ::UnitRange{Int64}) at ./abstractarray.jl:2072
 [5] top-level scope at REPL[1]:1

which demonstrates that the do-block is a bit of a red herring here.
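
For reference, a small sketch of the equivalence: a do-block simply builds an anonymous function and passes it as the first argument. The squaring function is just an illustrative choice.

```julia
# Do-block form: the block becomes the first argument to map.
a = map(1:3) do x
    x^2
end

# Equivalent call with an explicit anonymous function.
b = map(x -> x^2, 1:3)

println(a == b)  # true
```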

You could imagine defining

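import Base.Iterators: repeated
# Adds a method to Base.Iterators.repeated that calls f n times and collects the results.
# Note: this extends a Base function on Base types (type piracy), so treat it as a REPL experiment.
repeated(f::Function, n) = [f() for _ = 1:n]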

and then you get to do

julia> repeated(3) do 
           rand(10:99)
       end
3-element Array{Int64,1}:
 56
 20
 38

This might be worthy of a PR, although you’d have to decide whether it’s right to return an array. Still, it seems like a logical place to add something like that to the language.

I would also consider just

map(_ -> f(), 1:n)

if one insists on map and f is given. That said, [f() for _ in 1:n] may be the cleanest — I don’t think this is worth extra functions in the API.
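
A quick sketch of the two spellings side by side, assuming a hypothetical zero-argument f that draws a random two-digit number as in the original question:

```julia
f() = rand(10:99)  # hypothetical zero-argument function
n = 3

a = map(_ -> f(), 1:n)   # wrap f so it accepts and ignores the element
b = [f() for _ in 1:n]   # comprehension, no wrapper needed

println(typeof(a) == typeof(b))  # true
```

Both produce a 3-element vector of Int; the values differ from run to run because f is random.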

I would be in favor of having something like

repeated(3) do
    rand(10:99)
end

available in Base. We could follow the map/filter precedent of having Base.repeated return an array and Base.Iterators.repeated return an iterator. However, it might be better to change the name. Base.repeated could be confused with Base.repeat. Also, one could argue for extending fill instead of creating Base.repeated, since fill is the array-returning analogue of Iterators.repeated. However, that would be a breaking change, since calling fill on a function currently returns an array filled with that function:

julia> fill(rand, 3)
3-element Array{typeof(rand),1}:
 rand
 rand
 rand

So, perhaps a new name is in order, like replicate.
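
As a rough sketch of what that could look like in user code (replicate is a hypothetical name here, not something defined in Base), defined under a fresh name so that no existing Base function is extended:

```julia
# Hypothetical helper: call the zero-argument function f n times
# and collect the results in an array.
replicate(f, n::Integer) = [f() for _ in 1:n]

# The do-block is passed as the first argument, f.
xs = replicate(3) do
    rand(10:99)
end

println(length(xs))  # 3
```

Because this is a new function, fill(rand, 3) keeps its current meaning.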

I definitely use [f() for _ in 1:n] sometimes, but in the most recent case that led to my question above, f was a multiline function, so I needed a do block. My code looked something like this:

map(1:repeats) do _
    X_shuffle = shuffle_col(X, j)
    ŷ = predict(mach, X_shuffle)
    measure(ŷ, y)
end

Personally I am not a huge fan of do, so I would go for

function _f() # give _f a descriptive name if feasible!
    X_shuffle = shuffle_col(X, j)
    ŷ = predict(mach, X_shuffle)
    measure(ŷ, y)
end
[_f() for _ in 1:repeats]

in a local scope, but FWIW, I think your solution above is fine.

Generally, I don’t see anything wrong with naming a method that does something nontrivial, even if used locally, as a closure.
