I often encounter a scenario where I use map but then notice in the body of the closure that I want to throw away some values. I think there’s no simple way to do this. Here’s a mock-up:
itr = 1:100
mapfilter(itr) do i
    intermediate_result = first_function(i)
    if some_condition(intermediate_result)
        return second_function(intermediate_result)
    else
        return # result is filtered out
    end
end
The problem with list comprehensions is that the if condition can’t make use of intermediary values, so I would have to call first_function twice:
[second_function(first_function(i)) for i in itr if some_condition(first_function(i))]
A chained filter and Iterators.map don’t work because the state within map is not accessible to filter.
What would be the best way to get a function like this, preferably with Base methods?
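One Base-only option that avoids calling first_function twice is to run the comprehension over a lazy generator of intermediate results, so the condition sees the mapped value rather than the input. This is only a sketch: the three function definitions below are placeholders invented for illustration, not the real functions from the question.

```julia
# Placeholder definitions, assumed purely for illustration.
first_function(i) = i^2
some_condition(y) = isodd(y)
second_function(y) = y + 1

itr = 1:100

# The inner generator computes first_function(i) once per element;
# the `if` clause and the outer expression both reuse that value as `y`.
result = [second_function(y) for y in (first_function(i) for i in itr) if some_condition(y)]
```

The element type of result is worked out by the comprehension machinery, so it does not have to be written by hand.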
A for loop containing a call to push! is flexible, clear, and fast.
You can sizehint! the vector you are pushing into up to the maximum size it could reach before you start.
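A minimal sketch of that loop, assuming the result is known to hold Int values. The function name mapfilter_loop and the three helper definitions are made up for illustration.

```julia
# Placeholder definitions, assumed purely for illustration.
first_function(i) = i^2
some_condition(y) = isodd(y)
second_function(y) = y + 1

function mapfilter_loop(itr)
    out = Int[]                  # the element type has to be named up front
    sizehint!(out, length(itr))  # upper bound: the case where nothing is filtered out
    for i in itr
        intermediate_result = first_function(i)
        if some_condition(intermediate_result)
            push!(out, second_function(intermediate_result))
        end
    end
    return out
end

result = mapfilter_loop(1:100)
```

sizehint! only reserves capacity; the vector still ends up with exactly as many elements as were pushed.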
That is true as long as it’s easy enough to specify the element type of the vector without running your function. It might be just Int, but it might be Horrible{Type{With{Many{Parameters}}}}. Using map or list comprehensions thankfully spares me from doing that.
Taking just the first value to get the type wouldn’t work in type-unstable scenarios, even if those should generally be avoided, of course.
You can do this with the iteration protocol, using something like this:
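struct MapFilter{F1, F2, T}
    f::F1     # mapping function
    cond::F2  # predicate applied to the mapped value
    x::T      # underlying iterable
end
mapfilter(f, cond, x) = MapFilter(f, cond, x)
# The state is whatever iterate returned for the wrapped iterable:
# a (value, inner state) tuple, or nothing once it is exhausted.
function Base.iterate(x::MapFilter, state = iterate(x.x))
    while true
        state === nothing && return nothing
        val, id = state
        state = iterate(x.x, id)
        y = x.f(val)
        x.cond(y) && return (y, state)
    end
end
# Push the remaining elements into out, resuming from state.
function _collect(mf, state, out)
    for x in Iterators.rest(mf, state)
        push!(out, x)
    end
    return out
end
function Base.collect(mf::MapFilter)
    peel = iterate(mf)
    # Nothing passed the filter: the element type is unknown here,
    # so this returns nothing rather than an empty vector.
    peel === nothing && return nothing
    val, state = peel
    out = [val]  # element type is taken from the first kept value
    _collect(mf, state, out)
end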
julia> collect(mapfilter(x -> x^2, x -> x < 10, [1, 2, 3, 4, 3, 5, 1]))
5-element Vector{Int64}:
1
4
9
9
1
julia> collect(mapfilter(x -> x^2, x -> x < 10, [4, 5]))
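In that second call no squared value is below 10, so the hand-written collect takes its early exit and returns nothing. A possible variation, sketched here under the assumption that it is acceptable to declare the iterator's size and element type as unknown, is to leave collection to Base's generic collect, which then returns an empty vector in that case and can widen the element type as values arrive. The name LazyMapFilter is hypothetical and chosen only to avoid clashing with the struct above.

```julia
# Hypothetical variant of the MapFilter iterator above.
struct LazyMapFilter{F1, F2, T}
    f::F1
    cond::F2
    x::T
end

# Tell Base that neither the length nor the element type is known in advance,
# so collect grows the result and picks the element type from the values it sees.
Base.IteratorSize(::Type{<:LazyMapFilter}) = Base.SizeUnknown()
Base.IteratorEltype(::Type{<:LazyMapFilter}) = Base.EltypeUnknown()

# The state is forwarded unchanged to the wrapped iterable.
function Base.iterate(mf::LazyMapFilter, state...)
    next = iterate(mf.x, state...)
    while next !== nothing
        val, st = next
        y = mf.f(val)
        mf.cond(y) && return (y, st)
        next = iterate(mf.x, st)
    end
    return nothing
end

kept = collect(LazyMapFilter(x -> x^2, x -> x < 10, [1, 2, 3, 4, 3, 5, 1]))
none = collect(LazyMapFilter(x -> x^2, x -> x < 10, [4, 5]))
```

Here kept holds the same five values as the first REPL example, and none is an empty vector instead of nothing.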
The issue is (I think) that the type of the collection can’t be known a priori, because it may depend on the filter function. For example, a Vector{Union{T, Missing}} may, after your mapfilter, collapse into a Vector{T} if your filter removes the missing values (e.g. !ismissing). Unless mapfilter is specialised on that, it can’t infer the return type correctly (or has to guess and widen, as I think filter or map currently do).
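To make the element-type point concrete, here is a small sketch comparing two Base ways of dropping missing values from the same vector. The variable names are arbitrary.

```julia
v = Union{Int, Missing}[1, missing, 3]

# filter on an Array allocates a result with the input's element type,
# so the Union survives even though no missing value is left.
a = filter(!ismissing, v)

# A filtered comprehension chooses the element type from the values it
# actually collects, so here it narrows to Int.
b = [x for x in v if !ismissing(x)]

@show eltype(a) eltype(b)
```

Both results contain the same values; only the element type of the container differs.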