Define map, filter, ... also as functionals

Hey all,
I like the pipelining syntax, but I often encounter the problem that one has to write code like this:

rand(10) |> x -> filter(<=(0.5), x)

I see that map, filter, accumulate, findfirst etc. all take a minimum of two arguments. If I define

Base.filter(f::Function) = a -> filter(f, a)

the above code can be written as:

rand(10) |> filter(<=(0.5))

My question is whether there is a reason not to implement these extensions, and whether there are better (existing) alternatives.
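
For anyone who wants to try the idea without adding methods to Base functions from outside Base (which is usually considered type piracy), here is a minimal sketch using separate helper names. The names cfilter and cmap are hypothetical and exist only for this illustration.

```julia
# Hypothetical helpers (not part of Base): curried wrappers under new names,
# so no existing Base method is added or overwritten.
cfilter(f) = xs -> filter(f, xs)
cmap(f) = xs -> map(f, xs)

# Each call returns a one-argument closure, so it composes with |>.
small_doubled = [1, 6, 3, 9] |> cfilter(<=(5)) |> cmap(x -> 2x)
```

The trade-off is an extra set of names to remember, but the behaviour of filter and map themselves stays untouched.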

Yes, there are quite a few, e.g.

Transducers.jl provides that functionality:

using Transducers

1:10 |> Filter(iseven) |> Map(sqrt) |> collect

Transducers were introduced by Rich Hickey in Clojure and decouple the desired transformations, i.e., Map, Filter and so on, from the actual traversal (that’s why an explicit collect is needed above). As a result, the same transducer can be used across different data structures[1] and does not create intermediate data structures, e.g.,

1:10 |> Filter(iseven) |> Map(sqrt) |> foldxl(+)

does not allocate.
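
As a rough Base-only analogy (this is not Transducers.jl and not how it is implemented), the lazy wrappers in Base.Iterators also separate the description of the transformation from the traversal. The sketch assumes Julia 1.6 or later for Iterators.map; the variable names are just for illustration.

```julia
# Describe the pipeline lazily; nothing is traversed yet.
lazy = Iterators.map(sqrt, Iterators.filter(iseven, 1:10))

# The traversal is chosen separately: a fold makes a single pass
# without building an intermediate array ...
total = foldl(+, lazy)

# ... while collect materializes the results only when asked.
roots = collect(lazy)
```

Unlike transducers, these wrappers are tied to the iterator protocol, so this only illustrates the "describe first, traverse later" idea.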


  1. Don’t know if Julia implements them for anything besides Iterators though. Clojure has some more options in this respect and supports transducers on channels for instance. ↩︎

But why not add the simple behaviour case to the standard library?

Because map(f) with a single argument already has a meaning, so changing it would be breaking:

julia> foo() = println("hello")
foo (generic function with 1 method)

julia> map(foo)
hello

If you’re wondering how you can reach that, you can splat an empty container:

julia> map(foo, []...)
hello

Ah, this is actually kinda obvious and, more importantly, consistent, thanks.

That’s just map though. If that function were instead named zipwith, then map, filter, accumulate could all be curried, which would be nicer. No?

There are many functions both in Base and across the data ecosystem that take a function argument and a data argument. Why add a special currying method to all of them, when this can be solved once and in a clean way using one of the “piping” packages?
I would recommend (my) DataPipes.jl (discourse thread) that has the main goal to make general data processing as convenient as possible. To my knowledge, DataPipes pipes have the least amount of code overhead compared to alternative packages. A basic example: @p rand(10) |> filter(<=(0.5)) |> map(_ + 1).

I think all of those can already be done via Base.Fix1(foo, bar), which fixes the first argument of foo to bar.
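
A small sketch of what that looks like in a pipeline, using only Base. The variable names are made up for the example.

```julia
# Base.Fix1(f, x) is a callable that behaves like y -> f(x, y).
keep_small = Base.Fix1(filter, <=(5))      # xs -> filter(<=(5), xs)
add_one    = Base.Fix1(map, x -> x + 1)    # xs -> map(x -> x + 1, xs)

out = [1, 6, 3, 9] |> keep_small |> add_one
```

This is more verbose than a curried filter(<=(5)), but it needs no new methods on any function.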

To be clear, we’d have to go to every function and add such a method manually, since automatic currying does not play nicely with multiple dispatch. Adding these manually can also run into the problem that e.g. map poses, which would result in inconsistencies across functions, so either writing the closure explicitly or using a different name seems preferable to me.
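
One way the dispatch issue can show up, as a sketch with hypothetical names (my_filter and AtMost are made up for this example): a hand-written curried method has to decide by argument type whether a one-argument call means "partially apply", and dispatching on Function misses callable objects that are not subtypes of Function.

```julia
# Hypothetical names for illustration only.
my_filter(f, xs) = filter(f, xs)
my_filter(f::Function) = xs -> my_filter(f, xs)   # hand-written curried method

# A callable object that is not a subtype of Function.
struct AtMost
    limit::Int
end
(p::AtMost)(x) = x <= p.limit

kept_two_arg = my_filter(AtMost(5), [1, 6, 3])    # two-argument form works
kept_curried = [1, 6, 3] |> my_filter(<=(5))      # curried form works for a Function
curried_ok   = applicable(my_filter, AtMost(5))   # no one-argument method matches AtMost
```

Loosening the signature to my_filter(f) would accept AtMost, but then any one-argument call is claimed by the curried method, which is exactly the conflict that map(f) illustrates.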

I’m also a bit puzzled by that overload, i.e., it seems that map(f) is the same as f(). Independent of the name, e.g., zipWith, I fail to see how this is a consistent extension of map … in any case I would expect map(whatever) to return something iterable or a vector. Maybe I’m missing something here?

The 0-arg map may get fixed (removed) in the future.

The others are discussed in this PR