Memoization
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs. It works by storing the results of expensive calls to pure functions, so that these results can be returned quickly should the same inputs occur again. It is a type of caching, normally implemented using a hash table, and a typical example of a space–time tradeoff, where the runtime of a program is reduced by increasing its memory usage. Memoization can be implemented in any programming language, though some languages have built-in support that make it easy for the programmer to memoize a function, and others memoize certain functions by default.
Memoization has also been used in other contexts, such as in simple mutually recursive descent parsing. In the context of some logic programming languages, memoization is also known as tabling.
Etymology
The term memoization was coined by Donald Michie in 1968 and is derived from the Latin word memorandum, usually truncated as memo in American English, and thus carries the meaning of 'turning a function into something to be remembered'. While memoization might be confused with memorization, memoization has a specialized meaning in computing.Overview
A memoized function "remembers" the results corresponding to some set of specific inputs. Subsequent calls with remembered inputs return the remembered result rather than recalculating it, thus eliminating the primary cost of a call with given parameters from all but the first call made to the function with those parameters. The set of remembered associations may be a fixed-size set controlled by a replacement algorithm or a fixed set, depending on the nature of the function and its use. A function can only be memoized if it is referentially transparent; that is, only if calling the function has exactly the same effect as replacing that function call with its return value. While related to lookup tables, since memoization often uses such tables in its implementation, memoization populates its cache of results transparently on the fly, as needed, rather than in advance.Memoized functions are optimized for speed in exchange for a higher use of computer memory space. The time/space "cost" of algorithms has a specific name in computing: computational complexity. All functions have a computational complexity in time and in space.
Although a space–time tradeoff occurs, this differs from some other optimizations that involve time-space trade-off, such as strength reduction, in that memoization is a run-time rather than compile-time optimization. Moreover, strength reduction potentially replaces a costly operation such as multiplication with a less costly operation such as addition, and the results in savings can be highly machine-dependent, whereas memoization is a more machine-independent, cross-platform strategy.
Consider the following pseudocode function to calculate the factorial of n:
function factorial
if n is 0 then
return 1
else
return factorial times n
end if
end function
For every integer n such that
n ≥ 0, the final result of the function factorial is invariant; if invoked as x = factorial, the result is such that x will always be assigned the value 6. The non-memoized implementation above, given the nature of the recursive algorithm involved, would require n + 1 invocations of factorial to arrive at a result, and each of these invocations, in turn, has an associated cost in the time it takes the function to return the value computed. Depending on the machine, this cost might be the sum of:- The cost to set up the functional call stack frame.
- The cost to compare n to 0.
- The cost to subtract 1 from n.
- The cost to set up the recursive call stack frame.
- The cost to multiply the result of the recursive call to
factorialby n. - The cost to store the return result so that it may be used by the calling context.
factorial includes the cumulative cost of steps 2 through 6 proportional to the initial value of n.A memoized version of the
factorial function follows:function factorial
if n is 0 then
return 1
else if n is in lookup-table then
return lookup-table-value-for-n
else
let x = factorial times n
store x in lookup-table in the nth slot
return x
end if
end function
In this particular example, if
factorial is first invoked with 5, and then invoked later with any value less than or equal to five, those return values will also have been memoized, since factorial will have been called recursively with the values 5, 4, 3, 2, 1, and 0, and the return values for each of those will have been stored. If it is then called with a number greater than 5, such as 7, only 2 recursive calls will be made, and the value for 5! will have been stored from the previous call. In this way, memoization allows a function to become more time-efficient the more often it is called, thus resulting in eventual overall speed-up.An extreme example of memoization is the Singleton pattern, specifically the implementation of its getter — a function that creates an object upon the first invocation, caches the instance, and returns the same object on all subsequent invocations.
Other considerations
Functional programming
Memoization is heavily used in compilers for functional programming languages, which often use call by name evaluation strategy. To avoid overhead with calculating argument values, compilers for these languages heavily use auxiliary functions called thunks to compute the argument values, and memoize these functions to avoid repeated calculations.Automatic memoization
While memoization may be added to functions internally and explicitly by a computer programmer in much the same way the above memoized version offactorial is implemented, referentially transparent functions may also be automatically memoized externally. The techniques employed by Peter Norvig have application not only in Common Lisp, but also in various other programming languages. Applications of automatic memoization have also been formally explored in the study of term rewriting and artificial intelligence.In programming languages where functions are first-class objects, automatic memoization can be implemented by replacing a function with its calculated value once a value has been calculated for a given set of parameters. The function that does this value-for-function-object replacement can generically wrap any referentially transparent function. Consider the following pseudocode :
function memoized-call
if F has no attached array values then
allocate an associative array called values;
attach values to F;
end if;
if F.''values is empty then
F''.values = F;
end if;
return F.values;
end function
In order to call an automatically memoized version of
factorial using the above strategy, rather than calling factorial directly, code invokes memoized-call. Each such call first checks to see if a holder array has been allocated to store results, and if not, attaches that array. If no entry exists at the position values, a real call is made to factorial with the supplied arguments. Finally, the entry in the array at the key position is returned to the caller.The above strategy requires explicit wrapping at each call to a function that is to be memoized. In those languages that allow closures, memoization can be effected implicitly via a functor factory that returns a wrapped memoized function object in a decorator pattern. In pseudocode, this can be expressed as follows:
function construct-memoized-functor
allocate a function object called memoized-version;
let memoized-version be
if self has no attached array values then
allocate an associative array called values;
attach values to self;
end if;
if self.values is empty then
self.values = F;
end if;
return self.values;
end let;
return memoized-version;
end function
Rather than call
factorial, a new function object memfact is created as follows:memfact = construct-memoized-functor
The above example assumes that the function
factorial has already been defined before the call to construct-memoized-functor is made. From this point forward, memfact is called whenever the factorial of n is desired. In languages such as Lua, more sophisticated techniques exist which allow a function to be replaced by a new function with the same name, which would permit:factorial = construct-memoized-functor
Essentially, such techniques involve attaching the original function object to the created functor and forwarding calls to the original function being memoized via an alias when a call to the actual function is required, as illustrated below:
function construct-memoized-functor
allocate a function object called memoized-version;
let memoized-version be
if self has no attached array values then
allocate an associative array called values;
attach values to self;
allocate a new function object called alias;
attach alias to self;
self.alias = F;
end if;
if self.values is empty then
self.values = self.alias;
end if;
return self.values;
end let;
return memoized-version;
end function