Lazy evaluation
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which avoids repeated evaluations.
The benefits of lazy evaluation include:
- The ability to define control flow as abstractions instead of primitives.
- The ability to define potentially infinite data structures. This allows for more straightforward implementation of some algorithms.
- The ability to define partly defined data structures where some elements are errors. This allows for rapid prototyping.
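The last point can be illustrated with a minimal Haskell sketch. The list name partial and its contents are hypothetical example data, not taken from any particular program; the middle element is an error placeholder that is never forced.

```haskell
-- A list whose middle element is an error placeholder (hypothetical data).
partial :: [Int]
partial = [1, error "not implemented yet", 3]

main :: IO ()
main = do
  print (length partial)  -- walks the list spine only; elements are not forced
  print (partial !! 0)    -- forces only the first element
  print (partial !! 2)    -- forces only the last element
```

Only an attempt to use the middle element itself, for example print (partial !! 1), would raise the error.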
Lazy evaluation is difficult to combine with imperative features such as exception handling and input/output, because the order of operations becomes indeterminate.
The opposite of lazy evaluation is eager evaluation, sometimes known as strict evaluation. Eager evaluation is the evaluation strategy employed in most programming languages.
History
Lazy evaluation was introduced for lambda calculus by Christopher Wadsworth. For programming languages, it was independently introduced by Peter Henderson and James H. Morris and by Daniel P. Friedman and David S. Wise.
Applications
Delayed evaluation is used particularly in functional programming languages. When using delayed evaluation, an expression is not evaluated as soon as it gets bound to a variable, but when the evaluator is forced to produce the expression's value. That is, a statement such as x = expression; clearly calls for the expression to be evaluated and the result to be placed in x, but what actually is in x is irrelevant until its value is needed via a reference to x in some later expression. The evaluation of that later expression could itself be deferred, though eventually the rapidly growing tree of dependencies is pruned to produce some symbol rather than another for the outside world to see.
Lazy evaluation is fundamental in big data frameworks such as Apache Spark, where computations on distributed datasets are delayed until results are explicitly needed, allowing for execution optimizations and a reduction of unnecessary processing.
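The difference between binding an expression and demanding its value can be made visible with a small Haskell sketch. The name expensive is a hypothetical stand-in for a costly computation, and Debug.Trace is used only to show when evaluation happens; the exact interleaving of the trace message (written to standard error) with normal output may depend on buffering and compiler optimizations.

```haskell
import Debug.Trace (trace)

-- 'expensive' is a hypothetical stand-in for a costly computation.
-- The trace message is emitted when the value is first evaluated.
expensive :: Int
expensive = trace "evaluating expensive" (sum [1 .. 1000])

main :: IO ()
main = do
  let x = expensive            -- binding only; nothing is computed here
  putStrLn "x has been bound"
  print (x + 1)                -- the value of x is demanded here
```

When run in a terminal, the message "x has been bound" typically appears before "evaluating expensive", because the sum is computed only once print needs it.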
Control structures
Lazy evaluation allows control structures to be defined normally, and not as primitives or compile-time techniques. For example, one can define if-then-else and short-circuit evaluation operators:
ifThenElse True b c = b
ifThenElse False b c = c
-- or
True || b = True
False || b = b
-- and
True && b = b
False && b = False
These have the usual semantics, i.e., ifThenElse a b c evaluates a, then if and only if a evaluates to true does it evaluate b, otherwise it evaluates c. That is, exactly one of b or c will be evaluated. Similarly, for EasilyComputed || LotsOfWork, if the easy part gives True the lots of work expression could be avoided. Finally, when evaluating SafeToTry && Expression, if SafeToTry is false there will be no attempt at evaluating the Expression.
Conversely, in an eager language the above definition for ifThenElse a b c would evaluate a, b, and c regardless of the value of a. This is not the desired behavior, as b or c may have side effects, take a long time to compute, or throw errors. It is usually possible to introduce user-defined lazy control structures in eager languages as functions, though they may depart from the language's syntax for eager evaluation: often the involved code bodies need to be wrapped in a function value, so that they are executed only when called.
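As a rough check of these semantics, the following Haskell sketch uses a user-defined ifThenElse with a branch that would fail if it were ever evaluated. The helper safeDiv and the thunk-taking variant ifThenElseThunk are hypothetical names introduced for illustration; the latter only mimics the shape such a function tends to take in an eager language, where each branch is wrapped in a function value.

```haskell
ifThenElse :: Bool -> a -> a -> a
ifThenElse True  b _ = b
ifThenElse False _ c = c

-- Hypothetical helper: yields 0 instead of failing on a zero divisor.
safeDiv :: Int -> Int -> Int
safeDiv x y = ifThenElse (y == 0) 0 (x `div` y)

-- Shape of the same abstraction with explicitly wrapped branches,
-- as it would typically have to be written in an eager language.
ifThenElseThunk :: Bool -> (() -> a) -> (() -> a) -> a
ifThenElseThunk cond thenBranch elseBranch =
  if cond then thenBranch () else elseBranch ()

main :: IO ()
main = do
  print (safeDiv 10 2)  -- the division branch is selected and evaluated
  print (safeDiv 10 0)  -- the division is never attempted
  putStrLn (ifThenElse True "kept" (error "never evaluated"))
  print (ifThenElseThunk False (\_ -> error "never called") (\_ -> 2 :: Int))
```

In each call only the selected branch is evaluated, so neither the division by zero nor the error calls are ever triggered.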
Working with infinite data structures
Delayed evaluation has the advantage of being able to create calculable infinite lists without infinite loops or size limits interfering with the computation. The actual values are only computed when needed. For example, one could create a function that creates an infinite list of Fibonacci numbers. The calculation of the n-th Fibonacci number would be merely the extraction of that element from the infinite list, forcing the evaluation of only the first n members of the list.
Take for example this trivial program in Haskell:
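numberFromInfiniteList :: Int -> Int
numberFromInfiniteList n = infinity !! (n - 1)  -- n-th element, counting from 1
  where infinity = [1 ..]  -- infinite range; only the demanded prefix is built
main :: IO ()
main = print $ numberFromInfiniteList 4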
In the function numberFromInfiniteList, the value of infinity is an infinite range, but until an actual value is needed, the list is not evaluated, and even then, it is only evaluated as needed. Provided the programmer is careful, the program completes normally. However, certain calculations may result in the program attempting to evaluate an infinite number of elements; for example, requesting the length of the list or trying to sum the elements of the list with a fold operation would result in the program either failing to terminate or running out of memory.
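The distinction between operations that consume a finite prefix and those that need the whole list can be sketched as follows. The list evens is a hypothetical example; take and takeWhile are standard Prelude functions.

```haskell
-- An infinite list of even numbers (hypothetical example data).
evens :: [Integer]
evens = [0, 2 ..]

main :: IO ()
main = do
  print (take 5 evens)                   -- only five elements are built
  print (takeWhile (< 12) evens)         -- stops at the first element >= 12
  print (sum (takeWhile (< 100) evens))  -- summing a finite prefix is safe
  -- print (length evens)                -- would never terminate
  -- print (sum evens)                   -- would never terminate
```

The commented-out lines correspond to the failure cases described above: both need every element of the list, so they cannot finish.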
As another example, the list of all Fibonacci numbers can be written in the programming language Haskell as:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
In Haskell syntax, ":" prepends an element to a list, tail returns a list without its first element, and zipWith uses a specified function to combine corresponding elements of two lists to produce a third.
List-of-successes pattern
Other uses
In computer windowing systems, the painting of information to the screen is driven by expose events which drive the display code at the last possible moment. By doing this, windowing systems avoid computing unnecessary display content updates.
Another example of laziness in modern computer systems is copy-on-write page allocation or demand paging, where memory is allocated only when a value stored in that memory is changed.
Laziness can be useful for high performance scenarios. An example is the Unix mmap function, which provides demand driven loading of pages from disk, so that only those pages actually touched are loaded into memory, and unneeded memory is not allocated.
MATLAB implements copy on edit, where arrays which are copied have their actual memory storage replicated only when their content is changed, possibly leading to an out of memory error when updating an element afterwards instead of during the copy operation.
Performance
The number of beta reductions to reduce a lambda term with call-by-need is not larger than the number needed by call-by-value or call-by-name reduction. With certain programs the number of steps may be much smaller: for example, a specific family of lambda terms using Church numerals takes an infinite number of steps with call-by-value, an exponential number of steps with call-by-name, but only a polynomial number with call-by-need. Call-by-need embodies two optimizations: never repeat work, and never perform unnecessary work.
Lazy evaluation can also lead to a reduction in memory footprint, since values are created only when needed.
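The "never repeat work" part can be observed in Haskell with a small sketch. The function double is a hypothetical example, and Debug.Trace is used only to reveal when the argument is evaluated.

```haskell
import Debug.Trace (trace)

-- Uses its argument twice; under call-by-need the argument is
-- evaluated at most once and the result is shared.
double :: Int -> Int
double y = y + y

main :: IO ()
main = print (double (trace "argument evaluated" (2 + 3)))
```

Under call-by-need the message "argument evaluated" is written once, even though y occurs twice in the body of double; a call-by-name strategy would evaluate the argument at each occurrence.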
In practice, lazy evaluation may cause significant performance issues compared to eager evaluation. For example, on modern computer architectures, delaying a computation and performing it later is slower than performing it immediately. This can be alleviated through strictness analysis. Lazy evaluation can also introduce memory leaks due to unevaluated expressions.
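A commonly cited illustration of such a leak is a left fold with a lazy accumulator, sketched below. The names lazySum and strictSum are hypothetical; foldl and foldl' are standard library functions. Whether the lazy version actually builds a long chain of unevaluated additions depends on the compiler and optimization level, since strictness analysis may remove the problem.

```haskell
import Data.List (foldl')

-- May accumulate a chain of unevaluated (+) applications before
-- the final result is demanded.
lazySum :: [Integer] -> Integer
lazySum = foldl (+) 0

-- Forces the accumulator at every step, so memory use stays constant.
strictSum :: [Integer] -> Integer
strictSum = foldl' (+) 0

main :: IO ()
main = do
  print (lazySum [1 .. 1000])
  print (strictSum [1 .. 1000000])
```

Both functions return the same results; the difference lies only in how much memory may be held by pending computations on long inputs.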
Implementation
Some programming languages delay evaluation of expressions by default, and some others provide functions or special syntax to delay evaluation. In KRC, Miranda, and Haskell, evaluation of function arguments is delayed by default. In many other languages, evaluation can be delayed by explicitly suspending the computation using special syntax or, more generally, by wrapping the expression in a thunk. The object representing such an explicitly delayed evaluation is called a lazy future. Raku uses lazy evaluation of lists, so one can assign infinite lists to variables and use them as arguments to functions, but unlike Haskell and Miranda, Raku does not use lazy evaluation of arithmetic operators and functions by default.
Laziness and eagerness
Controlling eagerness in lazy languages
In lazy programming languages such as Haskell, although the default is to evaluate expressions only when they are demanded, it is possible in some cases to make code more eager or, conversely, to make it more lazy again after it has been made more eager. This can be done by explicitly coding something which forces evaluation, or by avoiding such code. Strict evaluation usually implies eagerness, but they are technically different concepts.
However, there is an optimisation implemented in some compilers called strictness analysis, which, in some cases, allows the compiler to infer that a value will always be used. In such cases, the programmer's choice of whether or not to force that particular value may become irrelevant, because strictness analysis will force strict evaluation.
In Haskell, marking constructor fields strict means that their values will always be demanded immediately. The seq function can also be used to demand a value immediately and then pass it on, which is useful if a constructor field should generally be lazy. However, neither of these techniques implements recursive strictness; for that, a function called deepseq was invented.
Also, pattern matching in Haskell 98 is strict by default, so the ~ qualifier has to be used to make it lazy.
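The effect of strict fields, seq, and lazy patterns can be sketched in Haskell as follows. The types LazyBox and StrictBox, the functions strictConst and lazyConst, and the helper throws are hypothetical names introduced for this illustration; evaluate and catch come from Control.Exception in the base library.

```haskell
import Control.Exception (SomeException, catch, evaluate)

data LazyBox   = LazyBox Int     -- field may remain an unevaluated thunk
data StrictBox = StrictBox !Int  -- field is forced when the box is evaluated

-- Strict pattern: matching on the tuple forces the argument.
strictConst :: (Int, Int) -> Int
strictConst (_, _) = 0

-- Lazy pattern: the argument is never inspected.
lazyConst :: (Int, Int) -> Int
lazyConst ~(_, _) = 0

-- Hypothetical helper: True if evaluating x to weak head normal form throws.
throws :: a -> IO Bool
throws x = (evaluate x >> return False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = return True

main :: IO ()
main = do
  throws (LazyBox undefined)     >>= print  -- False: field is not forced
  throws (StrictBox undefined)   >>= print  -- True: strict field is forced
  throws (undefined `seq` ())    >>= print  -- True: seq forces its first argument
  throws (strictConst undefined) >>= print  -- True: pattern match forces the tuple
  throws (lazyConst undefined)   >>= print  -- False: lazy pattern defers the match
```

None of these forms evaluates nested structure recursively; they only force evaluation to the outermost constructor.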