Exception handling (programming)

In computer programming, several programming language mechanisms exist for exception handling. The term exception is typically used to denote a data structure storing information about an exceptional condition. One mechanism to transfer control, or raise an exception, is known as a throw; the exception is said to be thrown. Execution is transferred to a catch.

Usage

Programming languages differ substantially in their notion of what an exception is. Exceptions can be used to represent and handle abnormal, unpredictable, erroneous situations, but also as flow control structures to handle normal situations. For example, Python's iterators raise StopIteration exceptions to signal that the iterator has no further items to produce. There is disagreement within many languages as to what constitutes idiomatic usage of exceptions. For example, Joshua Bloch states that Java's exceptions should only be used for exceptional situations, but Kiniry observes that at least one of Java's built-in exception classes describes something that is not at all an exceptional event. Similarly, Bjarne Stroustrup, author of C++, states that C++ exceptions should only be used for error handling, as this is what they were designed for, but Kiniry observes that many modern languages such as Ada, C++, Modula-3, ML and OCaml, Python, and Ruby use exceptions for flow control. Some languages such as Eiffel, C#, Common Lisp, and Modula-2 have made a concerted effort to restrict their usage of exceptions, although this is done on a social rather than technical level.
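The use of StopIteration as ordinary flow control can be illustrated with a minimal Python sketch. The class Countdown and the helper drain are hypothetical names introduced only for this example; the iterator protocol itself (__iter__, __next__, StopIteration) is standard Python.

```python
class Countdown:
    """Iterator that produces n, n-1, ..., 1 (hypothetical example class)."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            # The normal end of iteration is signalled by raising an exception.
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


def drain(iterator):
    """Consume an iterator by hand, roughly as a for loop does internally."""
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            break  # not an error: the iterator is simply exhausted
    return items


collected_by_loop = list(Countdown(3))   # list() catches StopIteration itself
collected_by_hand = drain(Countdown(3))  # here the exception is caught explicitly
```

Both calls produce the same list; the exception never reaches the user of a for loop or of list(), which is why this use is considered normal rather than exceptional.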

History

The earliest IBM Fortran compilers had statements for testing exceptional conditions. These included the IF ACCUMULATOR OVERFLOW, IF QUOTIENT OVERFLOW, and IF DIVIDE CHECK statements. In the interest of machine independence, they were included in neither FORTRAN IV nor the Fortran 66 Standard. However, since Fortran 2003 it has been possible to test for numerical issues via calls to functions in the IEEE_EXCEPTIONS module.
Software exception handling continued to be developed in the 1960s and 1970s. LISP 1.5 allowed exceptions to be raised by the ERROR pseudo-function, similarly to errors raised by the interpreter or compiler. Exceptions were caught by the ERRORSET keyword, which returned NIL in case of an error, instead of terminating the program or entering the debugger.
PL/I introduced its own form of exception handling circa 1964, allowing interrupts to be handled with ON units.
MacLisp observed that ERRSET and ERR were used not only for error raising, but also for non-local control flow, and thus added two new keywords, CATCH and THROW. The cleanup behavior now generally called "finally" was introduced in NIL in the mid- to late-1970s as UNWIND-PROTECT. This was then adopted by Common Lisp. Contemporary with this was dynamic-wind in Scheme, which handled exceptions in closures. The first papers on structured exception handling appeared in the mid-1970s. Exception handling was subsequently widely adopted by many programming languages from the 1980s onward.

Syntax

Many computer languages have built-in syntactic support for exceptions and exception handling. This includes ActionScript, Ada, BlitzMax, C++, C#, Clojure, COBOL, D, ECMAScript, Eiffel, Java, ML, Object Pascal, PowerBuilder, Objective-C, OCaml, Perl, PHP, PL/I, PL/SQL, Prolog, Python, REALbasic, Ruby, Scala, Smalltalk, Tcl, Visual Prolog and most .NET languages.
Excluding minor syntactic differences, there are only a couple of exception handling styles in use. In the most popular style, an exception is initiated by a special statement with an exception object or a value of a special extendable enumerated type. The scope for exception handlers starts with a marker clause and ends at the start of the first handler clause. Several handler clauses can follow, and each can specify which exception types it handles and what name it uses for the exception object. As a minor variation, some languages use a single handler clause, which deals with the class of the exception internally.
Also common is a related clause that is executed whether an exception occurred or not, typically to release resources acquired within the body of the exception-handling block. Notably, C++ does not provide this construct, recommending instead the Resource Acquisition Is Initialization technique which frees resources using destructors. According to a 2008 paper by Westley Weimer and George Necula, the syntax of the try...finally blocks in Java is a contributing factor to software defects. When a method needs to handle the acquisition and release of 3–5 resources, programmers are apparently unwilling to nest enough blocks due to readability concerns, even when this would be a correct solution. It is possible to use a single try...finally block even when dealing with multiple resources, but that requires a correct use of sentinel values, which is another common source of bugs for this type of problem.
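The trade-off between nesting and sentinel values can be sketched as follows. The example is in Python rather than Java, and the Resource class, its log, and the two function names are hypothetical; the sketch is not taken from the paper cited above, only from the general description of the two styles.

```python
class Resource:
    """Hypothetical resource that records acquire/release events in a log."""

    def __init__(self, name, log, fail=False):
        if fail:
            raise RuntimeError(f"cannot acquire {name}")
        self.name = name
        self.log = log
        log.append(f"acquire {name}")

    def release(self):
        self.log.append(f"release {self.name}")


def nested_style(log, fail_second=False):
    """One try...finally per resource: correct, but nesting grows with each one."""
    a = Resource("a", log)
    try:
        b = Resource("b", log, fail=fail_second)
        try:
            log.append("work")
        finally:
            b.release()
    finally:
        a.release()


def sentinel_style(log, fail_second=False):
    """A single try...finally; None marks resources that were never acquired."""
    a = b = None
    try:
        a = Resource("a", log)
        b = Resource("b", log, fail=fail_second)
        log.append("work")
    finally:
        # Omitting one of these None checks is the kind of mistake
        # that makes the sentinel approach error-prone.
        if b is not None:
            b.release()
        if a is not None:
            a.release()
```

In both styles, if acquiring the second resource fails, the first one is still released. The nested version gets this from its structure, while the sentinel version depends on every check being written correctly.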
Python and Ruby also permit a clause that is used in case no exception occurred before the end of the handler's scope was reached.
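In outline, exception handling code in Java might look like this, with the bodies of the blocks elided:

import java.io.IOException;
import java.util.Scanner;
try { /* code that may throw */ }
catch (IOException e) { /* first handler; the exception type is illustrative */ }
catch (Exception e) { /* second, more general handler */ }
finally { /* runs whether or not an exception occurred */ }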
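The clauses described above, including the clause that Python runs only when no exception occurred, can be shown together in a short Python sketch. The function parse_port, the exception PortRangeError, and the log argument are hypothetical names used only to make the order of execution visible.

```python
class PortRangeError(Exception):
    """Hypothetical exception for a number outside the valid port range."""


def parse_port(text, log):
    """Parse a port number and record in `log` which clauses were executed."""
    try:
        port = int(text)  # raises ValueError for non-numeric text
        if not 0 < port < 65536:
            raise PortRangeError("port out of range")
    except ValueError:
        log.append("except ValueError")
        port = None
    except PortRangeError as exc:
        log.append(f"except PortRangeError: {exc}")
        port = None
    else:
        # Runs only if the try body finished without raising.
        log.append("else")
    finally:
        # Runs in every case, after the handler or the else clause.
        log.append("finally")
    return port
```

Calling parse_port("8080", log) takes the else path, while parse_port("abc", log) and parse_port("70000", log) each select a different handler; the finally clause runs in all three cases.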

C does not have try-catch exception handling, but uses return codes for error checking. The setjmp and longjmp standard library functions can be used to implement try-catch handling via macros.
Perl 5 uses die for throw, and an eval block followed by a check of the $@ variable for try-catch. It also has CPAN modules that offer try-catch semantics.

Termination and resumption semantics

When an exception is thrown, the program searches back through the stack of function calls until an exception handler is found. Some languages call for unwinding the stack as this search progresses. That is, if function f, containing a handler H for exception E, calls function g, which in turn calls function h, and an exception E occurs in h, then functions h and g may be terminated, and H in f will handle E. This is said to be termination semantics.
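Python uses termination semantics, so the scenario can be traced directly. In this sketch the names f, g, h, E, and the trace list are chosen for illustration: h raises, g sits in between, and f holds the handler. The finally clauses exist only to record the moment each frame is unwound.

```python
trace = []


class E(Exception):
    """Illustrative exception type."""


def h():
    try:
        trace.append("h: raising E")
        raise E("failure in h")
    finally:
        trace.append("h: terminated")  # runs while the stack is unwound


def g():
    try:
        h()
        trace.append("g: after h")  # skipped: g does not resume
    finally:
        trace.append("g: terminated")


def f():
    try:
        g()
    except E as exc:
        trace.append(f"f: handler got {exc}")
    trace.append("f: continues after the handler")


f()
```

After the call, trace shows h and g being terminated, in that order, before the handler in f runs. Execution then continues in f and never returns to the point of the error.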
Alternately, the exception handling mechanisms may not unwind the stack on entry to an exception handler, giving the exception handler the option to restart the computation, resume or unwind. This allows the program to continue the computation at exactly the same place where the error occurred or to implement notifications, logging, queries and fluid variables on top of the exception handling mechanism. Allowing the computation to resume where it left off is termed resumption semantics.
There are theoretical and design arguments in favor of either decision. C++ standardization discussions in 1989–1991 resulted in a definitive decision to use termination semantics in C++. Bjarne Stroustrup cites a presentation by Jim Mitchell as a key data point.
Exception-handling languages with resumption include Common Lisp with its Condition System, PL/I, Dylan, R, and Smalltalk. However, the majority of newer programming languages follow C++ and use termination semantics.
C++ provided std::uncaught_exception for detecting whether stack unwinding is occurring; it was deprecated in C++17 and removed in C++20. Its replacement, std::uncaught_exceptions, counts the number of exceptions in the current thread that have been thrown or rethrown and have not yet entered a matching catch block.

Exception handling implementation

The implementation of exception handling in programming languages typically involves a fair amount of support from both a code generator and the runtime system accompanying a compiler. Two schemes are most common. The first, dynamic registration, generates code that continually updates structures about the program state in terms of exception handling. Typically, this adds a new element to the stack frame layout that knows what handlers are available for the function or method associated with that frame; if an exception is thrown, a pointer in the layout directs the runtime to the appropriate handler code. This approach is compact in terms of space, but adds execution overhead on frame entry and exit. It was commonly used in many Ada implementations, for example, where complex generation and runtime support was already needed for many other language features. Microsoft's 32-bit Structured Exception Handling uses this approach with a separate exception stack. Dynamic registration, being fairly straightforward to define, is amenable to proof of correctness.
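The bookkeeping involved in the first scheme can be modelled in a few lines of Python. This is a toy model under simplifying assumptions, not how any particular compiler lays out its frames: handler_chain stands in for the per-thread chain of registration records, and call_with_handlers, find_handler, and bookkeeping_ops are hypothetical names.

```python
handler_chain = []    # models the chain of per-frame registration records
bookkeeping_ops = 0   # counts work done even when nothing is thrown


def call_with_handlers(func, handlers):
    """Run func() in a 'frame' that registers its handlers on entry
    and removes them on exit. `handlers` maps exception names to callables."""
    global bookkeeping_ops
    handler_chain.append(handlers)  # cost paid on every frame entry
    bookkeeping_ops += 1
    try:
        return func()
    finally:
        handler_chain.pop()         # cost paid on every frame exit
        bookkeeping_ops += 1


def find_handler(kind):
    """Walk the chain from the innermost frame outward, as a runtime
    would when an exception of the given kind is thrown."""
    for depth in range(len(handler_chain) - 1, -1, -1):
        handler = handler_chain[depth].get(kind)
        if handler is not None:
            return depth, handler
    return None


def inner():
    # Pretend an "IOError" is thrown here and ask who would handle it.
    return find_handler("IOError")


def outer():
    return call_with_handlers(inner, {"ValueError": lambda: "inner frame"})


found = call_with_handlers(outer, {"IOError": lambda: "outer frame"})
```

The lookup skips the inner frame, which only registered a ValueError handler, and finds the outer one. The counter shows the drawback named in the text: four bookkeeping operations were performed for two frames, whether or not anything was thrown.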
The second scheme, and the one implemented in many production-quality C++ compilers and 64-bit Microsoft SEH, is a table-driven approach. This creates static tables at compile time and link time that relate ranges of the program counter to the program state with respect to exception handling. Then, if an exception is thrown, the runtime system looks up the current instruction location in the tables and determines what handlers are in play and what needs to be done. This approach minimizes execution overhead for the case where an exception is not thrown. This happens at the cost of some space, but this space can be allocated into read-only, special-purpose data sections that are not loaded or relocated until an exception is actually thrown. The code for handling an exception need not be located within the region of memory where the rest of the function's code is stored. So if an exception is thrown, a performance hit (roughly comparable to a function call) may occur if the necessary exception handling code needs to be loaded or cached. However, this scheme has minimal performance cost if no exception is thrown. Since exceptions in C++ are supposed to be exceptional events, the phrase "zero-cost exceptions" is sometimes used to describe exception handling in C++. Like runtime type identification, exceptions might not adhere to C++'s zero-overhead principle, as implementing exception handling at run time requires a non-zero amount of memory for the lookup table. For this reason, exception handling can be disabled in many C++ compilers, which may be useful for systems with very limited memory. This second approach is also superior in terms of achieving thread safety.
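The lookup at the heart of the second scheme can be sketched in Python. The table contents, the addresses, and the names EXCEPTION_TABLE and lookup_handler are invented for illustration; real tables also encode nested regions and cleanup actions, and the unwinder repeats the lookup for each frame on the stack.

```python
from bisect import bisect_right

# Hypothetical table produced ahead of time: (start_pc, end_pc, handler),
# sorted by start_pc, with non-overlapping ranges and an exclusive end.
EXCEPTION_TABLE = [
    (0x10, 0x30, "handler_A"),
    (0x40, 0x58, "handler_B"),
    (0x80, 0xA0, "handler_C"),
]
_STARTS = [start for start, _, _ in EXCEPTION_TABLE]


def lookup_handler(pc):
    """Return the handler whose range covers program counter `pc`, or None.

    Nothing in this table is touched during normal execution; it is
    consulted only when an exception is actually thrown.
    """
    i = bisect_right(_STARTS, pc) - 1  # last range starting at or before pc
    if i >= 0:
        start, end, handler = EXCEPTION_TABLE[i]
        if start <= pc < end:
            return handler
    return None
```

Because the table is static data, code that never throws executes no extra instructions, in contrast to the entry and exit bookkeeping of dynamic registration. The cost is the memory held by the table and a binary search at throw time.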
In comparison to C++ where any type may be thrown and caught, in Java only types extending Throwable can be thrown and caught, and Throwable has two direct descendants: Error, and Exception. Error is typically reserved for extremely serious problems beyond the scope of the program, such as OutOfMemoryError, ThreadDeath, or VirtualMachineError.
Other definitional and implementation schemes have been proposed as well. For languages that support metaprogramming, approaches that involve no overhead at all have been advanced.