Tail-call recursion

And as always, a self tail call can be easily rewritten as a loop. In this case:

function iseven(x)
    while true
        x == 0 && return true
        x == 1 && return false
        x -= 2
    end
end

Moreover, the only kind of tail call that cannot be easily rewritten as a loop is non-self tail call where the function being called is different than the function running. But that kind of tail call isn’t even automatically optimized in most (edit: see Scheme discussion below) languages with TCO because it makes stack traces impossible to understand.

2 Likes

My understanding was that scheme requires all tail calls to be optimized?

@stevengj already mentioned it, but speeding up doubly recursive calls (for dynamic programming or other things) is the “hello world” of memoization examples. It really sounds like you want something like this:

julia> using Memoize

julia> fib(n) = n < 2 ? 1 : fib(n - 1) + fib(n - 2)
fib (generic function with 1 method)

julia> @memoize Dict fib1(n) = n < 2 ? 1 : fib1(n - 1) + fib1(n - 2)
fib1 (generic function with 1 method)

julia> fib(1)
1

julia> fib1(1)
1

julia> @time fib1(45)
  0.005526 seconds (3.64 k allocations: 196.794 KiB, 99.26% compilation time)
1836311903

julia> @time fib(45)
  3.782468 seconds
1836311903

There are two different versions of TCO mentioned in the link here, might explain the misunderstanding (it’s a very intriguing document at least for Lisp):

https://0branch.com/notes/tco-cl.html

Especially in the prototyping phase, it can be useful to know where this functional style can be employed and where it’s likely to cause stack overflow.

At some point I wanted TCO for Julia, or at least considered it one of the few things missing, but when I understood it better, the drawbacks, I changed my mind, and support it as opt-in at best. But given it may be better for prototyping, maybe it could be a non-default option?

That’s not entirely clear to me. The spec says that “Implementations of Scheme must be properly tail-recursive” and a bit later “A Scheme implementation is properly tail-recursive if it supports an unbounded number of active tail calls.” So the requirement isn’t that all tail calls be eliminated, but that enough of them be eliminated that you can have an unbounded number of them active at the same time. A straightforward way to do that is to eliminate every tail call, but that’s not the only way.

2 Likes

I’ve read a few of the discussions about TCO before, and read the statement that “it’s easy to re-write to a loop”. But I never, until now, understood that TCO only applies to the ‘easily-re-writable cases’. I’m glad I finally caught on now, but it makes me wonder if this was clearly enough stated in some of the previous discussions. Those discussions always left me scratching my head about how on earth I could accomplish that re-write.

3 Likes

In any case scheme requires an unbounded number of them even if they are not self tail calls. Mutual recursion could even be quite complicated with functions calling other functions in tail position based on the output of a RNG for example. In practice I think it means all tail calls are eliminated.

1 Like

Yeah, that’s simplest implementation, but not strictly required. An interesting question is why does Scheme have this requirement? It guarantees this because it’s necessary to have such a guarantee for a language that only has recursion so that it can simulate iteration and other control flow constructs without blowing up the stack.

Does it make sense for Julia to have a similar guarantee? It’s not necessary since the language already has iteration and other control flow constructs built in. What would be the down side to eliminating all tail calls? It would, as I mentioned, make stack traces very confusing. For example, if you defined f(x, y) = 2x + y and there was an error when evaluating the + in that definition (suppose e.g. x and y are matrices but they have mismatched shapes), then looking at the stack trace you’d have no idea that f was called at all because the + is in tail position so the stack frame for f would be gone by the time the error occurred and it would look like whatever code called f had just called + directly. Super confusing. And if the call to f was in tail position, that stack frame could also have been eliminated, etc.—it could look like the + occurred arbitrarily far up the call stack.

4 Likes

The conclusion I’ve come to, which I explained in the linked GitHub thread, is that doing “best effort” tail call elimination in Julia is pretty useless. If you need the stack frame to be replaced by another call, then it should be an error for it not to happen. Scheme takes the approach of guaranteeing TCO for recursive tail calls, but that’s not a reasonable approach for Julia. And without a guarantee, TCO isn’t really helpful. So for Julia it makes much more sense to have an explicit “replace this stack frame with this function call” construct. Something like @return f(x) or return @goto f(x). Not sure the best way to spell it. That way if you have some recursive code that needs tail call elimination, you can get it. It would also allow writing state machines with tail calls. And it makes it less confusing when the caller is missing from the stack trace. The danger is this:

fib(n) = n ≤ 1 ? n : @return fib(n-1) + fib(n-2)

Someone will write this and think that they’re tail call to fib when in fact the function that’s in tail position there is + (which will be inlined anyway).

4 Likes

Thanks, I find this example very illuminating about the dangers of TCO.

1 Like

Since you answered me, I’m not arguing for TOC (by default), at least not strongly, let alone pushing for it. There seems to be both arguments for (what I quoted), and as you say against.

IF it sometimes better for prototyping (using functional style), then the question is, is it better to apply the macro in each case (and can be left in). If you need to apply to a lot of places, I’m not sure you can do to a whole file or more, so a CLI option could to it (or some other tool?).

Note, Lisp has loops (and isn’t functional, is fully imperative), unlike Scheme, and still some CL implementations chose to use TCO. I’m trying to inform myself and others why.

1 Like

I guess what I’m saying is that TCO isn’t useless, but it’s less useful than people think, and if it’s not guaranteed then it is pretty much useless. Further, if you are going guarantee it then you want to be clear about when it’s going to happen. Scheme takes the approach of “any time the stack might blow up”. For Julia it would make sense to let the caller control when guaranteed TCO happens.

2 Likes

I don’t really know since I’m fairly new here and have taken just a quick look at the provided links (especially the very long issue in github). But I think it doesn’t help thet elsewhere (quora, a bunch of stack overflow threads, etc) TCO is quite often brought in when discussing “is recursion slower then loops?”.
Now I get why they bring it up, if you really want to compare loops and recursion you have to do it where there is really a choice, i.e. when recursion can be rewritten as a loop and yes, in that case a discussion over presence or absence of TCO is relevant to determine if you might choose recursion over loops, or not. But IMO little time is spent to clarify the boundaries of that kind of discussion and maybe that’s why many people (including me) take home a more or less distorted message…

TCO is at best as fast as loops (or I think should always be guaranteed). [However recursion instead of loops in languages without TCO is at least a bit slower, where a loop will do, why they are then preferred.] But recursion is more powerful, but the extra power comes where TCO doesn’t apply, and Julia supports that, as fast or faster than TCO languages. At least my understanding.

TCO is to allow you (critics my say to force you) to only think with the recursive (functional) paradigm.

2 Likes

I imagine that there exist people that really like to think in terms of recursion and, even in the trivial case where a loop would be sufficient, would prefer to go that way (and then TCO would be a nice commodity to avoid paying a performance penalty for your style preference).

I say “I imagine” cause I always struggle when reasoning about recursion, hence cannot really get the feeling myself.

Full disclosure: I brought up myself the TCO discussion this time and I’m sorry if it felt frustrating, I had no idea about your previous history with the topic (nor I checked anywhere since I was not opening a thread, just briefly commenting elsewhere). So, I’ve just recently read a bit about the thing since I was evaluating to include a nice modern fortran library bringing facilities for functional style, as a dependency for my latest project. As I said I find the library very nice in the sense that I like a lot the API and how it could have been used in my code, but soon I discovered that I would need to pay a significant performance penalty. Therein there are many “trivial loops” (not even while, many of them could just be plain for loops) written as recursions and in fortran you don’t have a guarantee about TCO. (there are other problems with automatic allocation in the library but I won’t enter that now). When I reached out to the author he answered that for him the library was an exercise in style and a full TCO capable compiler would make recursion totally equivalent to loops there, so he was happy with the design and intended to not change it. I failed to see the detail about which kind of recursion we were talking about and took home a bigger message than was intended, that’s just it.

1 Like

Might it be helpful to add the TCO macro to Julia Base (but TCO NOT the default)? If anything, then it would be documented and maybe stop such discussions?

Julia is imperative first then functional, and I like functional more and more as the default (I just don’t like to be pressured into it). Not having TCO [macro] of any kind, doesn’t rule out its use, just by Base itself, but people might feel functional/recursive less supported (same argument with pattern matching).

https://martintrojer.github.io/clojure/2011/11/20/tail-calls-in-f-clojure-and-scala

Update 2; A clever compiler can convert mutual recursion into while loops. The a/b example above can be transformed into something like this;
[…]
However, I don’t know of any compiler that does this!

Currently, the Wasm design explicitly forbids tail call optimisations

Want support to enable

  • the correct and efficient implementations of languages that require tail call elimination […]

Considered that the original comment I was replying to is

Could this statement:

be made reasonable by just replacing “(deep?)” with “tail”?

In other languages writing tail recursions in lieu of loops is perfectly fine, while here the loop should be preferred. Hence here it is a bad habit to choose recursion in the trivial cases where a loop could do fine. Right?

Sorry to contributing to the cloud of “Julia doesn’t have this or that”. That wasn’t my intention at all! I can solve everything I want to solve in Julia. You compiler Base people are high wizards. I just wanted to see what people think of the recursion DAG and dynamic programming and talk about a place where recursion > loops that came up for me, that there’s this DAG way of unrolling the recursion that looks like a generalized tco to me. I’ve found it to be useful when I was missing recursion, and it seems like the kind of thing that could be automated by someone, to some degree. Someone who knows how to write a Julia macro, which is not me. I agree 1000% it’s not a Base thing.

There isn’t much advantage of moving it to Base. Base doesn’t need it and it doesn’t need base to work. It would mean that it’s development would be linked to Base versions.
A link to recommended packages in the docs if someone searches for TCO would be fine.

2 Likes

Thinking back on the incident in question, I now recall that my first implementation was a naive recursive implementation. My second implementation was a dynamic programming approach. The two implementations were not equivalent in terms of algorithmic time complexity, so that example was not actually a good example for comparing the performance of recursion and loops in Julia.

Apologies that my misremembered anecdote started this whole discussion. :sweat_smile:

Details:
I was writing a function to count the number of paths between two nodes in a DAG. The first naive recursive implementation was doing a lot of extra work because every time it encountered a node it would recalculate the number of paths from that node to the destination node. The loop-based dynamic programming approach (which I gleaned from the internet, of course) was much smarter and avoided redundant calculations.

1 Like