Tail call
In computer science, a tail call is a subroutine call performed as the final action of a procedure.
If the target of a tail call is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recursion is particularly useful, and is often easy to optimize in implementations.
Tail calls can be implemented without adding a new stack frame to the call stack. Most of the frame of the current procedure is no longer needed, and can be replaced by the frame of the tail call, modified as appropriate. The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail-call elimination or tail-call optimization. Tail-call elimination allows procedure calls in tail position to be implemented as efficiently as goto statements, thus allowing efficient structured programming. In the words of Guy L. Steele, "in general, procedure calls may be usefully thought of as GOTO statements which also pass parameters, and can be uniformly coded as JUMP instructions."
Not all programming languages require tail-call elimination. However, in functional programming languages, tail-call elimination is often guaranteed by the language standard, allowing tail recursion to use a similar amount of memory as an equivalent loop. The special case of tail-recursive calls, when a function calls itself, may be more amenable to call elimination than general tail calls. When the language semantics do not explicitly support general tail calls, a compiler can often still optimize sibling calls, or tail calls to functions which take and return the same types as the caller.
Description
When a function is called, the computer must "remember" the place it was called from, the return address, so that it can return to that location with the result once the call is complete. Typically, this information is saved on the call stack, a list of return locations in the order that the call locations were reached. In addition, compilers allocate memory for local variables of the called function and push register contents onto the stack. Typically, this is done by allocating a stack frame comprising the saved registers, space for non-register local variables, the return address and the call parameters. For tail calls, there is no need to remember the caller or preserve the contents of registers; instead, tail-call elimination avoids allocating a new stack frame, making only the minimum necessary changes to the existing stack frame before passing it on, and the tail-called function returns directly to the original caller. This, however, means the caller's stack frame is lost entirely, which is sometimes considered a hindrance when debugging. The tail call does not have to appear lexically after all other statements in the source code; it is only important that the calling function return immediately after the tail call, returning the tail call's result if any, since the calling function is bypassed when the optimization is performed.
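For illustration, here is a minimal C sketch (the function names process and step are invented for this example). With optimization enabled (e.g., GCC's -foptimize-sibling-calls, which is on at -O2), the final call in process can be compiled as a jump into step that reuses process's stack frame:

#include <stdio.h>

int step(int x)              /* some arbitrary computation */
{
    return x * 2 + 1;
}

int process(int x)
{
    int y = x + 40;          /* process's locals are dead after this point */
    return step(y);          /* tail call: may be compiled as a jump to step,
                                reusing process's stack frame */
}

int main(void)
{
    printf("%d\n", process(2));   /* prints 85 */
    return 0;
}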
For non-recursive function calls, this is usually an optimization that saves only a little time and space, since there are not that many different functions available to call. When dealing with recursive or mutually recursive functions where recursion happens through tail calls, however, the stack space and the number of returns saved can grow to be very significant, since a function can call itself, directly or indirectly, creating a new call stack frame each time. Tail-call elimination often reduces asymptotic stack space requirements from linear, or O(n), to constant, or O(1). Tail-call elimination is thus required by the standard definitions of some programming languages, such as Scheme, and languages in the ML family among others.
The Scheme language definition formalizes the intuitive notion of tail position exactly, by specifying which syntactic forms allow having results in tail context.
Implementations allowing an unlimited number of tail calls to be active at the same moment, thanks to tail-call elimination, can also be called 'properly tail recursive'.
Besides space and execution efficiency, tail-call elimination is important in the functional programming idiom known as continuation-passing style, which would otherwise quickly run out of stack space.
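To see why, consider a minimal continuation-passing sketch in C (the names add_cps, square_cps, and the helpers are invented for this illustration). In CPS a function never returns normally; it passes its result to a continuation, so every call is a tail call, and without tail-call elimination every step would consume a new stack frame:

#include <stdio.h>

typedef void (*cont)(int);           /* a continuation receiving the result */

void add_cps(int a, int b, cont k)
{
    k(a + b);                        /* tail call: hand the result onward */
}

void square_cps(int x, cont k)
{
    k(x * x);                        /* tail call */
}

void print_result(int r)
{
    printf("%d\n", r);
}

void add_one_then_print(int r)
{
    add_cps(r, 1, print_result);     /* tail call */
}

int main(void)
{
    /* compute (3 * 3) + 1 in continuation-passing style; prints 10 */
    square_cps(3, add_one_then_print);
    return 0;
}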
Syntactic form
A tail call can be located just before the syntactical end of a function:

int a(int x);
int b(int x);
int c(int x);

int foo(int data)
{
    a(data);
    return b(data);
}

Here, both a(data) and b(data) are calls, but b is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine:

int bar(int data)
{
    if (a(data)) {
        return b(data);
    }
    return c(data);
}
Here, both calls to b and c are in tail position, because each of them lies at the end of its respective branch of the if statement, even though the first one is not syntactically at the end of bar's body.

Consider this example:
int foo1(int data)
{
    return a(data) + 1;
}

int foo2(int data)
{
    int ret = a(data);
    return ret;
}

int foo3(int data)
{
    int ret = a(data);
    return (ret == 0) ? 1 : ret;
}

The call to a(data) is in tail position in foo2, but it is not in tail position in either foo1 or foo3, because control must return to the caller to allow it to inspect or modify the return value before returning it.

Example programs
The following program is an example in Scheme:

;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))
This is not written in a tail-recursive style, because the multiplication function ("*"), not the recursive call to factorial, is in the tail position. This can be compared to:
;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
  (fact-iter 1 n))
(define (fact-iter product n)
  (if (= n 0)
      product
      (fact-iter (* product n)
                 (- n 1))))
This program assumes applicative-order evaluation. The inner procedure fact-iter calls itself last in the control flow. This allows an interpreter or compiler to reorganize the execution which would ordinarily look like this:

  call factorial (4)
   call fact-iter (1 4)
    call fact-iter (4 3)
     call fact-iter (12 2)
      call fact-iter (24 1)
       call fact-iter (24 0)
       return 24
      return 24
     return 24
    return 24
   return 24
  return 24
into the more efficient variant, in terms of both space and time:

  call factorial (4)
   call fact-iter (1 4)
   replace arguments with (4 3)
   replace arguments with (12 2)
   replace arguments with (24 1)
   replace arguments with (24 0)
   return 24
  return 24
This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap, and the call stack frame for fact-iter is reused for storing the intermediate results. This also means that the programmer need not worry about running out of stack or heap space for extremely deep recursions. In typical implementations, the tail-recursive variant will be substantially faster than the other variant, but only by a constant factor.

Some programmers working in functional languages will rewrite recursive code to be tail recursive so they can take advantage of this feature. This often requires the addition of an "accumulator" argument to the function.
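For instance, a simple summing function in C might be rewritten this way (a minimal sketch; the names sum and sum_acc are invented for this illustration):

#include <stdio.h>

/* Plain recursion: the addition happens after the recursive call returns,
   so the call is not in tail position and the stack grows with n. */
int sum(int n)
{
    if (n == 0)
        return 0;
    return n + sum(n - 1);
}

/* Tail-recursive version: the running total is carried in an accumulator,
   so the recursive call is the last action and its frame can be reused. */
int sum_acc(int n, int acc)
{
    if (n == 0)
        return acc;
    return sum_acc(n - 1, acc + n);
}

int main(void)
{
    printf("%d\n", sum(10));         /* 55 */
    printf("%d\n", sum_acc(10, 0));  /* 55 */
    return 0;
}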
Tail recursion modulo cons
Tail recursion modulo cons is a generalization of tail-recursion optimization introduced by David H. D. Warren in the context of compilation of Prolog, seen as an explicitly set-once language. It was described by Daniel P. Friedman and David S. Wise in 1974 as a LISP compilation technique. As the name suggests, it applies when the only operation left to perform after a recursive call is to prepend a known value in front of the list returned from it. This call would thus be a tail call save for ("modulo") the said cons operation. But prefixing a value at the start of a list on exit from a recursive call is the same as appending this value at the end of the growing list on entry into the recursive call, thus building the list as a side effect, as if in an implicit accumulator parameter. The following Prolog fragment illustrates the concept:

% Prolog, tail recursive modulo cons:
partition([], _, [], []).
partition([X|Xs], Pivot, [X|Rest], Bigs) :-
    X @< Pivot, !,
    partition(Xs, Pivot, Rest, Bigs).
partition([X|Xs], Pivot, Smalls, [X|Rest]) :-
    partition(Xs, Pivot, Smalls, Rest).
Thus in a tail-recursive translation such a call is transformed into first creating a new list node and setting its first field, and then making the tail call with the pointer to the node's rest field as argument, to be filled recursively. The same effect is achieved when the recursion is guarded under a lazily evaluated data constructor, which is automatically achieved in lazy programming languages like Haskell.

C example
The following fragment defines a recursive function in C that duplicates a linked list:

#include <stdlib.h>   /* for malloc */

typedef struct list {
    void *value;
    struct list *next;
} list;

list *duplicate(const list *ls)
{
    list *head = NULL;

    if (ls != NULL) {
        list *p = duplicate(ls->next);
        head = malloc(sizeof *head);
        head->value = ls->value;
        head->next = p;
    }
    return head;
}

In this form the function is not tail recursive, because control returns to the caller after the recursive call duplicates the rest of the input list. Even if it were to allocate the head node before duplicating the rest, it would still need to plug the result of the recursive call into the next field after the call. So the function is almost tail recursive. Warren's method pushes the responsibility of filling the next field into the recursive call itself, which thus becomes a tail call. Using a sentinel head node to simplify the code, the callee now appends to the end of the growing list, rather than have the caller prepend to the beginning of the returned list. The work is now done on the way forward from the list's start, before the recursive call which then proceeds further, instead of backward from the list's end, after the recursive call has returned its result. It is thus similar to the accumulating parameter technique, turning a recursive computation into an iterative one.
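A sketch of the transformed function, continuing the list type above (the helper name duplicate_aux is an assumption of this illustration, not taken from the original fragment):

void duplicate_aux(const list *ls, list *end);

list *duplicate(const list *ls)
{
    list head;                        /* sentinel head node on the stack */
    duplicate_aux(ls, &head);
    return head.next;
}

void duplicate_aux(const list *ls, list *end)
{
    if (ls != NULL) {
        /* append a copy of the current node to the growing list first ... */
        end->next = malloc(sizeof *end);
        end->next->value = ls->value;
        /* ... then recurse; the recursive call is now a tail call */
        duplicate_aux(ls->next, end->next);
    } else {
        end->next = NULL;             /* terminate the list */
    }
}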
Characteristically for this technique, a parent frame is created on the execution call stack, which the tail-recursive callee can reuse as its own call frame if the tail-call optimization is present.
The tail-recursive implementation can now be converted into an explicitly iterative implementation, as an accumulating loop. A sketch of such a loop, continuing the definitions above, might look like this:
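/* A sketch continuing the list type above: the tail recursion has been
   replaced by a loop that keeps a pointer to the end of the growing list. */
list *duplicate(const list *ls)
{
    list head;                        /* sentinel head node */
    list *end = &head;

    while (ls != NULL) {
        end->next = malloc(sizeof *end);
        end->next->value = ls->value;
        ls = ls->next;
        end = end->next;
    }
    end->next = NULL;
    return head.next;
}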