Objective
To solve Julia’s piping/chaining/currying/partial-application problem with an elegant, functional approach that is idiomatic and worthy of Base Julia.
If you want, skip the background material and go straight to the proposal.
Background
Piping, chaining, threading—whatever people call it, it’s a very common thought pattern to take an object and pull it through a sequence of one or more transformations. In many contexts it’s most natural to think of the object first, and then to think through the sequence of methods that act on it (or on transformed versions of it). As a consequence, many languages implement some mechanism for chaining methods to reduce the programmer’s mental load. Likewise, the English language reserves a keyword, “it,” to allow an object to be held in memory while performing a sequence of operations on it.
Julia implements the pipe operator |> for exactly this purpose, but unfortunately it can only feed its left-hand side into a function of a single argument, which makes it lame. As a result, about a dozen macro packages have been written (see appendix), attempting to solve the problem in slightly different ways. These macros use non-standard and non-standardizable syntax (i.e., syntax which cannot be considered idiomatic Julia and cannot exist outside of a macro call), meaning there’s little chance of seeing proper adoption into the Base language. Hence, the solution remains fragmented.
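To make the limitation concrete, here is a small sketch in current Julia. Because |> only calls a one-argument function, every multi-argument step has to be wrapped in an anonymous function or in a Base.Fix1/Base.Fix2 partial application. The variable names are illustrative only.

```julia
v = [1, 2, 3]

# Each step takes more than one argument, so it must be wrapped in a lambda...
with_lambdas = v |> (x -> filter(isodd, x)) |> (x -> map(y -> y^2, x)) |> sum |> sqrt

# ...or in an explicit partial application.
with_fix = v |> Base.Fix1(filter, isodd) |> Base.Fix1(map, y -> y^2) |> sum |> sqrt

# Both agree with the plain nested call.
nested = sqrt(sum(map(y -> y^2, filter(isodd, v))))
```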
Perplexingly, many of those macros take a concept from an OOP language, Scala.
Until we solve this, we will continue to see requests and complaints (see appendix), as such a common thought pattern which is solved satisfactorily in many other languages is instead “supported” in Julia by a collection of competing macros with non-standardizable syntax. As much as Julians might wish the problem to go away, it simply doesn’t and will likely remain a lightning rod indefinitely.
What some other languages do
OOP Languages
OOP languages (such as C++, Java, and Python) employ dot-syntax to call member methods, offering a high degree of convenience when chaining object transformations using syntax like myObject.meth1(arg1a, arg1b).meth2(arg2a, arg2b). The reduction in mental load is so great (because it matches a natural thought pattern, certainly, but also in good part because of the dot-tab-autocomplete this method call syntax enables) that it’s tempting to define functions as class members even when they needn’t be, just to reap the benefits of convenience that member methods enjoy. The package ObjectOriented.jl implements this dot-call style for Julia.
Languages such as D and Nim implement UFCS (Universal Function Call Syntax), which appears exactly like dot-syntax but applies global functions and simply passes the dotted object as a first argument into the function:
myObject.meth1(arg1).meth2(arg2) == meth2(meth1(myObject, arg1), arg2)
Clojure
Clojure has two macros, -> and ->>, for “threading” objects through a chain of function calls. The thread-first macro, ->, takes an object and threads it through the function calls as a first argument. For example:
(-> obj (func1 arg1) (func2 arg2) (func3 arg3))
;; is equivalent to
(func3 (func2 (func1 obj arg1) arg2) arg3)
Meanwhile, the thread-last macro, ->>, threads the object as a last argument through the function calls. For example:
(->> (range 10) (filter odd?) (map #(* % %)) (reduce +))
;; is equivalent to
(reduce + (map #(* % %) (filter odd? (range 10))))
The package Lazy.jl implements these threading macros as @> and @>>.
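For readers who don’t know Clojure, the two threading behaviors can be sketched in plain Julia without any macro. The helpers thread_first and thread_last below are hypothetical names made up for illustration (they are not Lazy.jl’s API); each step is a tuple of a function followed by its extra arguments.

```julia
# Hypothetical helpers (not from Lazy.jl): each step is a tuple (f, extra_args...).
function thread_first(x, steps...)
    for step in steps
        f, args = first(step), Base.tail(step)
        x = f(x, args...)   # threaded value goes in as the first argument
    end
    return x
end

function thread_last(x, steps...)
    for step in steps
        f, args = first(step), Base.tail(step)
        x = f(args..., x)   # threaded value goes in as the last argument
    end
    return x
end

# Mirrors the ->> example: sum of the squares of the odd numbers in 0:9.
total = thread_last(0:9, (filter, isodd), (map, x -> x^2), (reduce, +))

# A thread-first chain on a string.
shout = thread_first("Hello, world!", (replace, "o" => "e"), (uppercase,))
```

This only shows the data flow; the real macros rewrite code at parse time rather than looping at run time.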
Scala
Scala, an OOP language which has dot-syntax for method calls, additionally employs “placeholder syntax” which uses the _ character as a placeholder for the argument of an auto-generated anonymous function. For whatever reason, this seems to be the approach many Julians have set their sights on; a bunch of Julia packages have been written to mimic this behavior (albeit in slightly different ways): DataPipes.jl, Pipe.jl, Chain.jl, Underscores.jl, and Hose.jl, and several requests have been filed about it: #24990, #46916. A strange justification for this approach is in circulation: a claim that in a multi-dispatch language no argument position should be treated as more privileged than any other.
Proposal
As outlined in this brainstorming session, what is proposed is a pair of partial application types, Base.FixFirst and Base.FixLast, and a corresponding pair of infix operators, /> (“frontfix”, or “fix”) and \> (“backfix”) respectively, as syntactic sugar, defined something like this:
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)
/>(x, f) = Base.FixFirst(f, x) # frontfix
\>(x, f) = Base.FixLast(f, x) # backfix
The types Base.FixFirst and Base.FixLast are simply the logical extensions of the existing fix types Base.Fix1 and Base.Fix2, when extended to functions of other than just two arguments.
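Here is a runnable sketch of the two types that can be pasted into a current Julia session. Since /> and \> are proposed syntax and not something current Julia parses, the plain functions frontfix and backfix stand in for them here (an assumption for illustration, using the same names as the demo at the bottom). The structs are defined locally, not in Base.

```julia
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

# Stand-ins for the proposed operators.
frontfix(x, f) = FixFirst(f, x)   # x /> f
backfix(x, f) = FixLast(f, x)     # x \> f

nt = (a = 1, b = 2)

# With exactly one remaining argument, the sketch matches Base.Fix1 / Base.Fix2.
same_first = Base.Fix1(getfield, nt)(:a) == frontfix(nt, getfield)(:a)
same_last = Base.Fix2(in, (1, 2, 3))(2) == backfix((1, 2, 3), in)(2)

# With zero or several remaining arguments, the fixed value still lands first or last.
front_three = frontfix(15, clamp)(0, 10)   # clamp(15, 0, 10)
back_three = backfix(10, clamp)(15, 0)     # clamp(15, 0, 10)
no_args = backfix(16.0, sqrt)()            # sqrt(16.0)

# Keyword arguments pass straight through.
sorted = frontfix([3, 1, 2], sort)(rev = true)   # sort([3, 1, 2]; rev = true)
```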
The fix and backfix operators /> and \> have right-associativity, bind more tightly than function calls, and have the same operator precedence as each other.
These properties allow these operators to satisfy the desire to chain operations on an object threaded as a first or last argument through a sequence of methods, as so:
# Frontfix (Base.FixFirst)
my_object /> meth1(args1...) /> meth2(args2...) /> meth3(args3...) ==
meth3(meth2(meth1(my_object, args1...), args2...), args3...)
# Backfix (Base.FixLast)
my_object \> meth1(args1...) \> meth2(args2...) \> meth3(args3...) ==
meth3(args3..., meth2(args2..., meth1(args1..., my_object)))
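To see why the chains reduce to those nested calls, it may help to spell out one link at a time: under the stated rules, each obj /> meth(args...) is read as (obj /> meth)(args...), i.e. a fix object is built first and then called. The sketch below does this by hand with locally defined FixFirst and FixLast structs; the intermediate variable names are illustrative.

```julia
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

# Frontfix chain, one link per line: the previous result is fixed as the first argument.
s1 = FixFirst(replace, "Hello, world!")("o" => "e")
s2 = FixFirst(split, s1)(",")
s3 = FixFirst(join, s2)(":")
front_result = FixFirst(uppercase, s3)()

# Backfix chain: the previous result is fixed as the last argument.
v1 = FixLast(filter, [1, 2, 3])(isodd)
v2 = FixLast(map, v1)(x -> x^2)
back_result = FixLast(sum, v2)()
```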
Notice that fix /> satisfies the role that dot-notation serves in OOP and UFCS languages. As we will see, it is more powerful. Inspiration has also been drawn from Clojure’s threading macros in suggesting backfix \>. However, because fix /> and backfix \> are infix operators, they are more convenient and useful than Clojure’s macros too. Also, as Base.FixFirst and Base.FixLast are logical extensions of Base.Fix1 and Base.Fix2, but with syntax sugar in the form of /> and \>, they should satisfy the desire to generalize and bring elevated status to the fix operations.
Examples
Let’s see how these operators can be put to use with some examples. Later, you can play with a demo macro to see how it works (see bottom).
Frontfix Chaining: XML Tree Navigation
document/>root()/>firstelement()/>setnodecontent!("Hello, world!")
# same as
setnodecontent!(firstelement(root(document)), "Hello, world!")
Frontfix Chaining: Starting a Spark Session
SparkSession.builder/>appName("Main")/>master("local")/>getOrCreate()
Frontfix Chaining: Sequence of Transformations
"Hello, world!"/>replace("o"=>"e")/>split(",")/>join(":")/>uppercase() ==
uppercase(join(split(replace("Hello, world!", "o"=>"e"), ","), ":")) ==
"HELLE: WERLD!"
Backfix chaining
[1, 2, 3]\>filter(isodd)\>map(x->x^2)\>sum()\>sqrt() ==
sqrt(sum(map(x->x^2, filter(isodd, [1, 2, 3])))) ==
3.1622776601683795
Combining Frontfix, Backfix, and Broadcasting
"1 2, 3; hehe4" \> eachmatch(r"(\d+)") \> first.() \> parse.(Int) /> join(", ") ==
join(parse.(Int, first.(eachmatch(r"(\d+)", "1 2, 3; hehe4"))), ", ") ==
"1, 2, 3, 4"
Replacing Base.Fix1 and Base.Fix2
NamedTuple{names, T}(map(nt /> getfield, names))
# same as
NamedTuple{names, T}(map(Base.Fix1(getfield, nt), names))
Base.values(x::MyStruct{K1,<:Any,K2,<:Any}) where {K1,K2} =
(values(x.a)..., map(x.b /> getfield, filter(K1 \> ∉, K2))...)
# same as
Base.values(x::MyStruct{K1,<:Any,K2,<:Any}) where {K1,K2} =
(values(x.a)..., map(Base.Fix1(getfield, x.b), filter(Base.Fix2(∉, K1), K2))...)
Fixing and Composition
f = (", " \> join) ∘ ((2 \> ^) /> map) ∘ ((Int /> parse) /> map) ∘ (r",\s*" \> split)
f("1, 2, 3, 4")
== "1, 4, 9, 16"
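The same composition can be tried today by writing the fix objects out explicitly. This is only a sketch with locally defined FixFirst and FixLast structs; x \> f is spelled FixLast(f, x) and x /> f is spelled FixFirst(f, x).

```julia
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

split_commas = FixLast(split, r",\s*")               # s -> split(s, r",\s*")
parse_ints = FixFirst(map, FixFirst(parse, Int))     # v -> map(s -> parse(Int, s), v)
square_all = FixFirst(map, FixLast(^, 2))            # v -> map(x -> x^2, v)
join_commas = FixLast(join, ", ")                    # v -> join(v, ", ")

# Composition runs right to left, like the example.
f = join_commas ∘ square_all ∘ parse_ints ∘ split_commas
result = f("1, 2, 3, 4")
```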
Fun Properties
Frontfix /> and backfix \> produce partially applied functions where one argument is pre-filled. Because they’re right-associative and have higher precedence than function call, they can progressively fill out a partial function until it’s fully applied.
To illustrate how this works, frontfix /> pops arguments out from the left side (or front) of the parentheses:
f(a, b, c, d) ==
a/>f(b, c, d) ==
b/>a/>f(c, d) ==
c/>b/>a/>f(d) ==
d/>c/>b/>a/>f()
The dual operation is backfix \>, which pops arguments out from the right side (or back):
f(a, b, c, d) ==
d\>f(a, b, c) ==
c\>d\>f(a, b) ==
b\>c\>d\>f(a) ==
a\>b\>c\>d\>f()
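These identities can be checked with nested fix objects, reading right-associativity as b/>a/>f meaning b /> (a /> f). In this sketch frontfix and backfix are plain stand-in functions for the proposed operators, and f simply returns its arguments as a tuple so the argument order is visible.

```julia
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)
frontfix(x, f) = FixFirst(f, x)   # x /> f
backfix(x, f) = FixLast(f, x)     # x \> f

f(a, b, c, d) = (a, b, c, d)
full = f(1, 2, 3, 4)

# Frontfix pops arguments from the front.
front1 = frontfix(1, f)(2, 3, 4)                                       # a/>f(b, c, d)
front2 = frontfix(2, frontfix(1, f))(3, 4)                             # b/>a/>f(c, d)
front4 = frontfix(4, frontfix(3, frontfix(2, frontfix(1, f))))()       # d/>c/>b/>a/>f()

# Backfix pops arguments from the back.
back1 = backfix(4, f)(1, 2, 3)                                         # d\>f(a, b, c)
back2 = backfix(3, backfix(4, f))(1, 2)                                # c\>d\>f(a, b)
back4 = backfix(1, backfix(2, backfix(3, backfix(4, f))))()            # a\>b\>c\>d\>f()
```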
It’s not likely that this capability will be fully realized, but it’s nifty to understand. As a result of it, the operators have some overlap in their use:
[1, 2, 3, 4] \> filter(iseven) \> map(sqrt) /> join(", ") ==
[1, 2, 3, 4] /> iseven /> filter() /> sqrt /> map() \> ", " \> join() ==
join(map(sqrt, filter(iseven, [1, 2, 3, 4])), ", ")
And /> can pipe an argument into the second position of a many-argument function:
arg2 /> arg1 /> method(args[3:end]...)
Note that for single-argument functions, fixing into the front and fixing into the back are equivalent.
As a weird bonus, you can also evaluate expressions in Reverse Polish Notation:
3 \> 4 \> -() \> 5 \> +() == (3 - 4) + 5 == 4
and also in whatever notation this is:
3 /> 4 /> -() /> 5 /> +() == 5 + (4 - 3) == 6
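The operand order in these two expressions is easy to get wrong, so here is the same arithmetic spelled out with stand-in functions (frontfix and backfix are assumptions standing in for the proposed operators, and the structs are defined locally).

```julia
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)
frontfix(x, f) = FixFirst(f, x)   # x /> f
backfix(x, f) = FixLast(f, x)     # x \> f

# 3 \> 4 \> -() \> 5 \> +()
rpn_diff = backfix(3, backfix(4, -))()        # -(3, 4)
rpn = backfix(rpn_diff, backfix(5, +))()      # +(rpn_diff, 5)

# 3 /> 4 /> -() /> 5 /> +()
other_diff = frontfix(3, frontfix(4, -))()    # -(4, 3)
other = frontfix(other_diff, frontfix(5, +))() # +(5, other_diff)
```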
Common Pushback & Responses
You just want syntax sugar to make OOP programmers happy.
True. OOP languages landed on a really good idea: the ability to feed successive transformations of an object through a sequential chain of function calls in a convenient and powerful way.
But also, not true. This isn’t method encapsulation; this is just an elegant use of functional ideas that also does a good job solving a pattern that OOP languages accidentally found themselves good for, in a way that fills a void in Julia’s supported idioms.
In other words, it’s a generic solution to multiple classes of problem.
I don’t see how object→method1()→method2() chaining is better than method2(method1(object)).
One is not better than the other; they are complementary.
We frequently think of the method before the object when we are focused on a technique: perhaps a composition of functions to perform a task, or perhaps when combining several dissimilar objects. That’s where nesting objects in method argument lists is natural. However, when we are focused on transforming a single complex object in preparation for something else, we think of the object before the method.
Which format is better depends on context, and in contexts relevant to fixing an argument, one of the most common and compelling reasons for doing so is to thread an object through a sequence of transformations.
Part of the desire for dot-syntax comes from tab autocomplete. But soon Julia will have tab autocomplete for methods that match an argument type signature anyway, making this a moot point.
I have my doubts. Remembering to type a question mark first, then get halfway through typing the arguments before pressing tab, and then scanning a large list of methods all because you forgot a function name, is much more mental overhead than our OOP brethren suffer in this matter.
Maybe @tim.holy can debunk my stance, but it seems hard to beat typing an object name, pressing ., immediately seeing a list of all available methods, and watching the list filter down as characters of the method name are typed in.
I’ve tried ?(x, <tab> in the REPL, and if we can expect anything resembling its current behavior, it’s unlikely I’ll use it.
Julia is a multiple-dispatch language, so we don’t treat any argument as more special than any other.
Not true; who made this up? Julia even has a special do statement to insert an anonymous function into the *first* argument, proving we don’t actually believe this claim anyway.
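As a reminder of what do does, the block becomes an anonymous function that is passed as the first positional argument of the call. The function with_greeting below is a hypothetical example defined only for this sketch.

```julia
# The do block is passed to map as its first argument.
squares_do = map(1:3) do x
    x^2
end
squares_plain = map(x -> x^2, 1:3)

# Any function that takes a callable first works the same way (hypothetical example).
with_greeting(f, name) = f("Hello, $name")
shout = with_greeting("world") do s
    uppercase(s)
end
```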
Furthermore, the first or last ordered (non-keyword) argument of any function is usually chosen to be “important”; to define a function otherwise is usually poor design, regardless of the language paradigm. This is reflected in Clojure’s threading macros -> and ->>, wherein objects are threaded either through the first or last argument.
Sidenote: currently args... syntax can only slurp the last arguments of a function, not the middle or first. This could cause frontfix /> to be used more frequently than backfix \>.
Why not just use xyz.jl macro package? That should be good enough for anybody!
I like Julia in part because generators, matrix multiplication, regular expressions, complex numbers and so on are builtin and standard. What’s so wrong about wanting a better chaining operator to be builtin?
I’ve counted a dozen packages that implement chaining in some form or other. For such a common and generic idiom, maybe leaning on the package ecosystem for this is the wrong approach.
But the xyz.jl package uses _ to mark argument insertion. That’s more flexible; why don’t you get behind that instead?
If _ placeholder syntax stands a chance of being implemented in the language, I might. That approach has been proposed for half a decade now with no progress, e.g. #24990; so forgive me.
I also think frontfix and backfix will likely play nicer with autocomplete, and provide more interesting types for multiple dispatch.
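One way to read the point about dispatch: a fix object carries the fixed function in its type, so methods can specialize on it, whereas an anonymous function is opaque. The sketch below is only an illustration of that idea; describe is a hypothetical function, and FixFirst is defined locally.

```julia
struct FixFirst{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)

# Hypothetical methods that dispatch on which function was fixed into.
describe(::FixFirst{typeof(map)}) = "map with a fixed function"
describe(::FixFirst) = "some call with a fixed first argument"
describe(::Function) = "opaque function"

d1 = describe(FixFirst(map, sqrt))
d2 = describe(FixFirst(getfield, (a = 1,)))
d3 = describe(v -> map(sqrt, v))   # same behavior as the first, but nothing to dispatch on
```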
Why introduce new operators /> and \>? Why not just use a pre-existing operator such as ⇝?
Two reasons:
- To avoid changes that might break existing packages that already use the symbol.
- To use ASCII characters, as these operators will likely be used often so should be easily accessible.
I’m open to other ideas for what characters should be used, but after exploring numerous options this seemed best.
It’s ugly. Should we require spaces around /> and \>?
I’m hoping that good syntax coloring will solve this.
Appendix
Packages Addressing Piping/Chaining
- DataPipes.jl
- Chain.jl
- Pipe.jl
- Hose.jl
- Transducers.jl
- Lazy.jl
- Underscores.jl
- PartialFunctions.jl
- Chainable.jl
- ChainedFixes.jl
- ObjectOriented.jl
- Objects.jl
- CBOOCall.jl (not quite, but sorta)
List of Complaint & Query Threads
- My mental load using Julia is much higher than, e.g., in Python. How to reduce it? - General Usage - Julia Programming Language (julialang.org)
- Is it possible to pipe to a function of several variables? - General Usage - Julia Programming Language (julialang.org)
- How to discover functions which apply to a given object? - New to Julia - Julia Programming Language (julialang.org)
- Allowing the object.method(args…) syntax as an alias for method(object, args …) - Internals & Design - Julia Programming Language (julialang.org)
- Define map, filter, … also as functionals - Internals & Design - Julia Programming Language (julialang.org)
- Does julia support chaining of collections methods / function chaining at this point? - General Usage - Julia Programming Language (julialang.org)
- Function chaining with |> and filter function - General Usage - Julia Programming Language (julialang.org)
- Function chaining · Issue #5571 · JuliaLang/julia (github.com)
- A thought on function chaining with constant Arguments - General Usage - Julia Programming Language (julialang.org)
- How often do you use the |> operator? - General Usage - Julia Programming Language (julialang.org)
- Piping in Julia - New to Julia - Julia Programming Language (julialang.org)
- When to use pipes? - Offtopic - Julia Programming Language (julialang.org)
- Method chaining with arguments to imitate unix piping? - General Usage - Julia Programming Language (julialang.org)
- Julia: Piping operator |> with more than one argument - Stack Overflow
- Method of struct - General Usage - Julia Programming Language (julialang.org)
- Object-oriented syntax in julia - General Usage - Julia Programming Language (julialang.org)
Proposals to Fix (not Base.Fix) it
- Functional application, reconciling currying with multiple dispatch - Internals & Design - Julia Programming Language (julialang.org)
- RFC: curry underscore arguments to create anonymous functions by stevengj · Pull Request #24990 · JuliaLang/julia (github.com)
- Improvement for function chaining operator - Internals & Design - Julia Programming Language (julialang.org)
- Chaining of functions - General Usage - Julia Programming Language (julialang.org)
- Make @chain base Julia - General Usage - Julia Programming Language (julialang.org)
- Anonymous functions with anonymous arguments · Issue #46916 · JuliaLang/julia (github.com)
- Function chaining · Issue #5571 · JuliaLang/julia (github.com)
These lists are a WIP, as I continue to find more examples.
Demo
This implements a simple proof of concept so you can play with these operators and measure how much you like (or dislike) them. It’s a hack, so it should NOT be used for anything serious.
Many thanks to @bertschi for providing the code I copypasted and for helping brainstorm.
Demo Code:
using MacroTools: postwalk, @capture
struct FixFirst{F,X} f::F; x::X end
struct FixLast{F,X} f::F; x::X end
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)
frontfix(x, f) = FixFirst(f, x)
backfix(x, f) = FixLast(f, x)
# ugly hack: it's easiest to use `+` and `++` because they take varargs
macro fixdemo_str(expr)
    let + = frontfix, ++ = backfix
        expr = replace(expr, "++"=>"⤔", "+"=>"⤕", "/>"=>"+", "\\>"=>"++")
        expr = Meta.parse(expr)
        expr = postwalk(expr) do ex
            if @capture(ex, +(args__))
                unchain(+, args, [])
            elseif @capture(ex, ++(args__))
                unchain(++, args, [])
            else
                ex
            end
        end
        expr = replace(string(expr), "⤕"=>"+", "⤔"=>"++")
        Meta.parse(expr)
    end
end
function unchain(op, terms, stack)
    if isempty(terms)
        :(foldr($op, [$(stack...)]))
    elseif @capture(terms[1], f_(args__)) && f != :foldr # don't touch nested transforms again
        arg = :(foldr($op, [$(stack...), $f])($(args...)))
        unchain(op, terms[2:end], push!([], arg))
    else
        unchain(op, terms[2:end], push!(stack, terms[1]))
    end
end
Some fun examples to try:
fixdemo""" "Hello, world!" /> replace("o" => "e", "l" => "r") /> split(",") \> map(uppercase) /> join(":") """
fixdemo""" [1, 2, 3, 4] \> filter(iseven) \> map(sqrt) /> join(", ") """
fixdemo"1 /> +(2)" # you will never do this (or will you?)
fixdemo"rand(10) \> filter(≤(0.5))" # define-map-filter-also-as-functionals/87401
fixdemo"(1:10)\>filter(iseven)\>map(sqrt)/>collect()" # define-map-filter-also-as-functionals/87401/3
fixdemo"rand(10) \> filter(≤(0.5)) \> map(1 \> +)" # define-map-filter-also-as-functionals/87401/8
fixdemo"(1:100) \> mapreduce(2 \> ^, +) /> √()" # root of sum of squares
fixdemo"(1:5) \> filter(isodd)"
fixdemo"(1:5) /> isodd /> filter()"
oddfilt = fixdemo"isodd /> filter"
fixdemo"(1:5) /> oddfilt()"
hello_replace = fixdemo""" "Hello, world!"/>replace """
hello_replace("l"=>"r")
hello_replace("o"=>"a", "l"=>"y")
fixdemo""" "1 2, 3; hehe4" \> eachmatch(r"(\d+)") \> map(first) \> map(Int/>parse) /> join(", ") """
fixdemo"3 \> 4 \> -() \> 5 \> +()" # rpn for the funz
f = fixdemo""" x -> x /> split(r",\s*") \> map(Int /> parse) \> map(2 \> ^) /> join(", ") """
f("1, 2, 3, 4")
g = fixdemo""" (", " \> join) ∘ ((2 \> ^) /> map) ∘ ((Int /> parse) /> map) ∘ (r",\s*" \> split) """
g("1, 2, 3, 4")
Play around with it and share your thoughts please! Note that when partial functions start getting busy, it’s often better just to define a conventional anonymous function; some of the examples are mildly excessive just for illustration.