Setting up a Function Chain with Single Allocation

Hello all,

I would like to compose a function chain that will be applied to some data. The functions involved vary conditionally on said data, which isn’t known until runtime. Is there a way to do this without storing each function separately and composing them all at once towards the end? Since this is inside a rather tight loop, it would be nice to keep allocations to a minimum. As an example, this works fine:

array = [1, 2, 3, 4, 5]
f = x -> filter(e -> iseven(e), x)
g = x -> filter(e -> isodd(e), x)
h = x -> x |> f |> g

julia> h(array)
Int64[]

However, if I try to just build upon an existing function by reusing the binding:
f = x -> f(x) |> filter(e -> isodd(e), x)

This apparently becomes infinite recursion, as I get a stack overflow. Presumably the f(x) becomes self-referential instead of the previous function binding. Is there any way to get around this or am I just going to have to create a vector of functions to store everything?

Maybe composition works better than chaining? Havent tried though.

https://docs.julialang.org/en/v1/manual/functions/#Function-composition-and-piping

First of all, it’s worth asking what the real goal here is. Giving a name to a quantity doesn’t cause any more allocation than not naming it. So if your code is working as-is, you might not need to worry at all.

That said, you might find the (\circ) function composition operator useful:

julia> f = x -> filter(iseven, x)
#13 (generic function with 1 method)

julia> f = f ∘ (x -> filter(isodd, x))
var"#13#14"() ∘ var"#15#16"()

julia> f([1,2,3,4])
Int64[]
1 Like

Thanks for that. The composition approach seems to work, but I’m a little puzzled why it does when chaining doesn’t, as I’m used to composition and piping being two sides of the same coin (that is to say, f |> g being equivalent to g ∘ f). Is there something going on under the hood that’s different in Julia that causes the f to be evaluated differently depending on the case?

The problem here isn’t with piping, but with ->. This creates a new function and is (essentially) equivalent to:

function f(x)
    filter(iseven, x)
end

function f(x)
    f(f(x))
end

f(a) # segfault

In the case above, it makes perfect sense why it would recurse infinitely: the first method for f is overwritten by the second. You might have expected the “anonymous” aspect of the function (using -> instead of function) to take care of this by evaluating the right side before assigning it back to f, but that doesn’t seem to be the case. The behavior of -> in this context is the same as using function.

The reason composition works is precisely because it evaluates the right side before assigning it to the left. Calling returns a ComposedFunction (that keeps a reference to its inputs). That object, which stores the current value of f, is then named f.


By the way, it may be worth mentioning that if your real goal is to do repeated filtering in this way, it is much cheaper to pass the composition of multiple conditions directly to filter:

f = x -> filter(iseven ∘ isodd, x)

This doesn’t allocate an intermediate array like calling g(f(x)) does.

2 Likes

Yes, that makes sense. I didn’t grok that apparently |> doesn’t actually compose functions, it just sends the return value to the next function in the pipe. That distinction wasn’t entirely clear to me, especially as they’re under grouped under the same heading in the manual.

I actually do have multiple filters and had been contemplating combining them into one, but unless I’m mistaken, the condition functions would need to accept and return both a boolean and the data being tested… I also believe the function chain wouldn’t short-circuit evaluate like a series of statements, so some testing is needed.

Thanks to all for all the help!