The Python debate on Tail Calls

The Beginning

It started with an innocent enough blog post by Guido van Rossum commenting on Python’s lack of tail call elimination in Python’s history.  Okay, not even commenting – it was a parenthetical sentence fragment!

The internet went crazy with comments.  Guido attempted to clarify,

However, some technical inaccuracies – or rather, technical issues that the Python language and runtime currently don’t address – led to only more debate.  Largest oversights were in his discussion about how tail calls might be implemented in Python,

The first observation is that you can’t do it at compile time. I’ve seen at least one blog entry that used a bytecode hack to replace a CALL opcode immediately before a RETURN opcode with a jump to the top of the function body

Going on to imply that tail recursion was somehow thwarted by dynamic function lookup, that it could only be done when the compiler could statically transform all tail call sites into gotos. 

He corrected this in the followup, but then went on to say,

Now, apart from the observation that since TCO essentially is a GOTO, you write spaghetti code using TCO just as easily, Ian Bicking gave a solution that is as simple as it is elegant. …

Where he then presented a classic simple implementation technique for implementing tail call elimination – the trampoline.  Of course, you have to do this manually in Python unless you modify the bytecode compiler to do it for you. 

((Of course, I can’t help pointing out that non-tail-call procedure calls are also like GOTOs.  Guy Steele discussed this analogy/parallel in “Debunking the ‘Expensive Proceedure Call’ Myth, or Proceedure Call Implementations Considered Harmful, or Lambda: the Ultimate GOTO”, available on The Original ‘Lambda Papers’.  I’d summarize if I thought there was anything I could add, but there really isn’t))

Rebuttals

Joe Marshall has provided the most comprehensive replies to this discussion that I’ve seen, in his 5 part series:

Part 4 is the most focused piece I think, where he addresses the issues I mentioned above:

[example of dynamic tail cal] This will work in any version of Scheme. The compiler doesn’t need to analyze the target of the function call, it only needs to notice that the caller’s stack frame is no longer in use.

And the topic of debugging:

MIT-Scheme has an approach that seems promising. […] the history is a ring buffer with a limited size. When a tail recursive call is recorded in the history, each frame in the history moves down by one. If a sufficient number of calls are made, the oldest record is deleted

He also mentions in part 2 that debugging tail calls is just something that Schemers learn. 

But I have this to add:  Yes, it’s terribly convenient to have a stack history when debugging in Python – most of the time you don’t need ever stack frame, but once in a while its useful.  But the same thing happens with loops.  A failure is detected, but this loop iteration didn’t cause the failure, the cause was 2 iterations back and we don’t have that state any longer.  Darn.

A key observation in Scheme is that looping and tail recursion are two abstractions for doing the same thing – iteration over a regular data structure or sequence.  Because the problem is the same, the useful debugging features are the same – in both cases, a history is a useful debugging aid, but also bleeds runtime performance. 

Confusion about the abstraction

I also want to defend Guido though (not that he needs defending, he can stick up for himself, but for the sake of argument).  He doesn’t need tail recursion to do iteration, or visitor dispatch, or etc.  C programmers, those still around, will insist they don’t need C++ classes and templates to do objects, and rightly so.  And a Perl programmer will likely tell you they don’t need anything else to do anything.  (apologies to the Perl programmers, no offense intended).

And in all these cases it’s true.  Guido get things done without tail recursion.  And that’s what Python is about – getting things done.  And Schemers get things done too, in the minimalist way that Schemers like and are accustomed to.

I’m not going to pretend the various programming languages are equal, but I don’t think there’s a total order either.  Even if there is, even if X is a wholey better way-to-do-things than Y, or even mostly better, this whole debate has failed in one very essential way – this discourse was too much a debate about betters and worsers, and not enough a discourse about understanding the other.

I think commenter Martin demonstrates this best: “I fail to see why this is such an important feature that at least a significant amount of people have an opinion about it.” If, after so many words, this is the most common result, then I both sides have failed.

Kudos to Guido for correcting his initial errs, I think he’s done well to describe the Python Way and how Python has chose its alternatives to tail call elimination.

Kudos to Ian for providing trampoline Python code that functions as a working simplified model of tail call elimination – I think that may go a way in explaining TCE to Python developers. 

Kudos to Joe for demonstrating cool debugging features in MIT scheme – before that example, I hadn’t considered the case that TCE debugging tools could help in debugging my loop iterations as well.

And kudos to everyone else out there that also accomplished more listening than talking.

Join the Conversation

1 Comment

Leave a comment

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.