Tail-call recursion

Does Julia have tail call elimination? And how good it is implemented? To clarify: it really is a design choice somehow, strongly functional oriented or pure functional languages normally are really good at this, as recursion is a central paradigm, but others languages that are generally regarded as blazing fast, sometimes just fail at that (e.g. fortran, with some compilers). So, as it is with loops in Julia (and others) recursion can be fast, depending on the language and its implementation…

1 Like

I suspect that multiple dispatch makes general tail call elimination problematic in Julia. It’s possible it could be done for some cases though. In this case where there’s type instability it sounds like probably not.

1 Like

Julia doesn’t have tco currently, and is fairly unlikely to add it.

I understand. Well, then (deep?) recursion is probably to be regarded a bad habit here, as loops are in python, R, Matlab…

right; so, there’s this comment @StefanKarpinski that I myself have :+1: ed a long time ago tail call elimination · Issue #4964 · JuliaLang/julia · GitHub .

Except here I really need it :rofl: and I think this pattern is fairly common in geometry code (maybe some ray tracing code that allow layer-ed material)

No, there’s a closed issue on it since it’s not wanted (it’s sometimes bad), but it can be done with a macro, and has been. From the issue, what I wrote:

Isn’t the/a @tailrec macro good enough (it doesn’t support mutual recursion), which gets around the limitation?

(@v1.8) pkg> add https://github.com/TakekazuKATO/TailRec.jl

julia> using TailRec
julia> @tailrec fib(n, a=zero(n), b=one(n)) = n == 0 ? a : fib(n-1, b, a+b)

Maybe someone should register that package (or is there an even better one or something wrong with that one?). That user doesn’t seem very active, and I’m not sure what the policy is on registering packages from others. It has an OSI license, a requirement for registration, so anyone can fork it and use (registered or not), or just use from there.


I would say it’s almost trivial to re-write if there’s no mutual recursion.

There is A LOT OF mutual recursion here btw

Nothing wrong with recursion. The interesting applications of recursion aren’t in tail position anyway. With any non-tail recursion, in any language, you can make it plenty fast as long as the base cases are large enough. FFTW is recursive, for example.

Tail-call optimization is mostly important in functional languages where recursion is used instead of loops. In imperative languages with loops, you only really care about recursion in non-tail situations where TCO is irrelevant.


See also Yet another TCO thread and Does Julia have tail call optimization.

I don’t think there’s much new to be said here. If you think tail-call optimization is important for Julia, you probably don’t really understand what it’s for.

1 Like

Yeah I didn’t intend it to be a whole new thread, was just commenting about a post therein suggesting that recursion could be in principle slow, if compared to loops (which is generally not the case I’d say).

Anyway it was just that I feel like Julia has some nice affinity with the functional paradigm, so it could have been nice to have the possibility to use it to write pure functional stuff. But of course it’s not something to die for, if not liked (which is what I get skimming through the linked threads and the github issue).

I find its really nice in dynamic programming problems when you don’t know what the terminal state(s) will be to just write the recursion. In a pure functional language the calculation actually gets partially done at the level of the abstract syntax tree (the reachability analysis part). It’s NOT just a loop!

I think a general, Julian solution is to actually explicitly build out a state call tree in a graphical tree structure and work backwards from the terminal states (plural because in higher dimensional recursions there is more than one “tail”) then you can control what gets memoized etc. This is what I was forced to do in the DP problems I worked on.

P.S. Using Julia’s type system to solve recursion is a bit like using dispatch as a dictionary or a series of if-statements. Yes, it can be done, but recursion limit sort of indicates that it doesn’t scale well. You don’t want to be generating GB of call stack data. I’m not sure if there is a fundamental limitation in how function calls are represented, or it’s just too much inflexibility to guarantee this while trying to incrementally improve the compiler inference. I suspect a macro god could reproduce the explicit tree solution in magical form but that’s for a comp-sci phd to do while tripping acid, not for a stiff who wants to get a problem solved.

1 Like

That doesn’t sound like a tail call? (Almost by definition, if it’s not just a loop, it’s not a tail call.)

Maybe its no good calling it tco. But it looks the same to me. You run the calculation backward from the terminal states (x_0 , x_1) in fib case. In say a one armed bandit it’s all possible accesible payouts at the terminal time.

Even in the 1-D case sometimes, you don’t know what the terminal state will be for the function call. Then there’s a forward part (figuring out the terminal state) and a backward part to the problem. The recursion can still be unrolled as a loop once you figure out what the terminal state will be. If we define TCO as “loops that can be unrolled by hand because we already know the terminal state.” then of course there’s no problem: unroll the loop by hand! But it’d be nice if recursion in this slightly more general sense: “call tree gets built to terminal state and then collapsed back to the root” could be written using the same elegance that Julia shows for other problems, especially since there’s supposed to be a lispy bit somewhere in Julia’s DNA.

A tail call is when the result of the function is immediately known and returned by the function once the tail call returns, it can always be replaced by a go-to with no memory on the stack.

TCO isn’t in Lisp, i.e. the original, nor later Common Lisp (is though in Scheme, a Lisp off-shoot, the first (standardized) language to have it?), the only standardized Lisp. Some think it’s a mistake to have it and e.g. Clojure doesn’t have it (nor does Emacs Lisp), the most modern Lisp.


While the Common Lisp standard does not require implementations to eliminate tail calls, some do

[Yes, e.g. SBCL has it, and it’s the most common implementation, but if it’s not in the standard you can’t rely on it.]

In what follows, I distinguish between Tail Call Optimisation (TCO)—where all calls in tail position are optimised into jumps—and Tail Self-Call Optimisation, a special case of TCO where only self-calls are optimised (this is usually what people want when they refer to TCO as it allows for the recursive style). TCO is sometimes referred to as proper tail recursion.

See: Common Lisp was designed without requiring TCO because: \* various platforms don... | Hacker News

From Wikipedia:

Tail recursion modulo cons is a generalization of tail-recursion optimization introduced by David H. D. Warren[9] in the context of compilation of Prolog, seen as an explicitly set once language. It was described (though not named) by Daniel P. Friedman and David S. Wise in 1974[10] as a LISP compilation technique.

Top comments here: One thing I tend to miss in Clojure is lack of TCO. Most people(like me), come t... | Hacker News

One thing I tend to miss in Clojure is lack of TCO. Most people(like me), come to Lisp from SICP, The Little Schemer and PG’s books. Where recursion is not just emphasized, you are actually trained to think recursively.


Most mainline Lisps either never did support TCO or only in some restricted way. The reason for that is that it clashes with other widely used language features in typical Lisps. There is a cost to supporting TCO - also a less nice debugging experience. Many CL compilers tend to support it with some restrictions - CL interpreters often not. Lisp Machines did not support it. CL on the JVM does not support it. Tail Call Optimisation in Common Lisp Implementations

I started with Scheme and Lisp (Standard Lisp, ZetaLisp, CL, …) and wrote/used TCO a lot. I tend to view the use of TCO for basic iteration as a mistake, which makes code harder to read. I tend to like either higher-order iterative constructs or powerful iteration constructs like CL’s ITERATE: iterate, the extensible iteration construct

I find the Clojure hack ugly.

See gist: Improve TCO handling on Common Lisp easy cases · GitHub

Similar available for Emacs, and as I posted in Julia, and I think best to have it in a macro, i.e. opt in.

TCO is in FemtoLisp (since it’s a Scheme), and it IS in Julia (but will be going away), see: julia --lisp

1 Like

Knowing the terminal state has nothing to do with TCO, any more than a while loop needs to know the number of iterations in advance.

The “fib case”? Do you mean the classic fib(n) = n < 2 ? 1 : fib(n-1)+fib(n-2) recursion example? That is not tail recursive.

TCO does not mean “all recursion becomes fast”. It literally only applies to recursion that can be trivially rewritten as a loop, and hence is not needed in languages with imperative-style loops.

“call tree gets built to terminal state and then collapsed back to the root”

That sounds more like recursion unrolling, which is totally distinct from TCO and is still a research problem to make practical AFAIK.


I can’t argue about terminology, I’m always wrong anyway, so I’m just going to say exactly what I’m talking about. :sweat_smile:

Say you write fib(n) = n < 2 ? 1 : fib(n-1)+fib(n-2) then call

julia> fib(500)
# not gonna work

No big deal rewrite this as

function fib(n)
x = 1, xp1 =1
for ii in 2:n
(x,xp1) = (xp1, x+xp1)
return xp1

But what if you had gib(n) = mod(10,n) < 2 ? 1 : gib(n - 1) + gib(n - 2)
You have to think a little about what the terminal state for a particular call is going to be. You can still unroll the loop, but the general solution to unroll this and prevent 2^n calls is quite easy to explain to a computer

Graphically, we have to build and then unbuild a directed acyclic graph for our function. All finite-state solvable recursions can be represented this way.

Depth first construction of the DAG until all leaves are terminal

* terminal

gib(15)          gib(13)            gib(11)* 
  \---->------------^   \               /       
                         \             /       

gib(15) → gib(14) → gib(13) → gib(12) → gib(11)* gib(10)*
  \---->-----\------^   \     /   \     /       /
              \---->-----\---/     \   /       /
                          \----->---\-/       /

Breadth first evaluation

gib(15) <-gib(14) <-gib(13) <-gib(12) <-1        1
  \---->-----\------^   \     /   \     /       /
              \---->-----\---/     \   /       /
                          \-----<---\-/       /

gib(15) <-  5    <-   3    <-  2  <-    1       1
  \----<-----\------^   \     /   \     /       /
              \----<-----\---/     \   /       /
                          \-----<---\-/       /

8     <-   5    <-    3    <-  2  <-    1       1
  \----<-----\------^   \     /   \     /       /
              \----<-----\---/     \   /       /
                          \-----<---\-/       /

Unrolling this generally in a parallel way is research, but I think on one thread it’s a pretty straightforward problem if you are a good programmer, and would help showcase Julia macro power. Hopefully this nerdsnipes someone smarter than me into writing a beefy @recursion macro that lets me solve dynamic programming problems in one line although maybe GitHub - TakekazuKATO/TailRec.jl: A tail recursion optimization macro for julia. already does this?

Do you have an example of any language or compiler that performs the optimization you want? That would be more constructive.

(Or do you just want memoization? This is of course easy with a macro, but is not the sort of thing a compiler should do without the programmer explicitly requesting it.)

It’s not just a question of terminology. It’s a question of real vs. imaginary.

You have to understand that it’s pretty frustrating for us to get repeated threads where people are like “Julia doesn’t have TCO, it needs TCO to have fast recursion, why does it suck compared to functional languages, why don’t you want recursion to be fast”, and then it turns out inevitably that they have no idea what TCO is or why it is/isn’t useful, and the optimized recursion they are imagining is instead some entirely different and possibly fictional concept.


Just to be clear, this is not a tail call because after calling fib(n-1) the value must be remembered so that you can call fib(n-2) and then add them together. A tail call is where the value of the function is the value of the recursive call.

iseven(x) = x==0 ? true : x==1 ? false : iseven(x-2)

Is an example, the value of iseven(x) is either directly computed in the base cases, or it is iseven(x-2) and there is no need to do further processing of the recursive call.

That’s what a tail call is.