Exception handling (programming)


In computer programming, several programming language mechanisms exist for exception handling. The term exception is typically used to denote a data structure storing information about an exceptional condition. One mechanism to transfer control, or raise an exception, is known as a throw; the exception is said to be thrown. Execution is transferred to a catch.

Usage

Programming languages differ substantially in their notion of what an exception is. Exceptions can be used to represent and handle abnormal, unpredictable, erroneous situations, but also as flow control structures to handle normal situations. For example, Python's iterators throw StopIteration exceptions to signal that there are no further items produced by the iterator. There is disagreement within many languages as to what constitutes idiomatic usage of exceptions. For example, Joshua Bloch states that Java's exceptions should only be used for exceptional situations, but Kiniry observes that Java's built-in FileNotFoundException is not at all an exceptional event. Similarly, Bjarne Stroustrup, author of C++, states that C++ exceptions should only be used for error handling, as this is what they were designed for, but Kiniry observes that many modern languages such as Ada, C++, Modula-3, ML and OCaml, Python, and Ruby use exceptions for flow control. Some languages such as Eiffel, C#, Common Lisp, and Modula-2 have made a concerted effort to restrict their usage of exceptions, although this is done on a social rather than technical level.
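The use of StopIteration as ordinary flow control can be illustrated with a minimal Python sketch. The class Countdown and the function drain are hypothetical names introduced only for this illustration; drain consumes an iterator by hand, roughly as a for loop does internally.

```python
class Countdown:
    """Iterator yielding n, n-1, ..., 1 (hypothetical example class)."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            # The normal end of iteration is signalled by raising,
            # not by returning a special value.
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


def drain(iterator):
    """Collect all items, catching StopIteration to detect the end."""
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            break
    return items
```

Here the exception does not indicate an error at all; it is simply the protocol by which the iterator reports that it is exhausted.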

History

The earliest IBM Fortran compilers had statements for testing exceptional conditions. These included the IF ACCUMULATOR OVERFLOW, IF QUOTIENT OVERFLOW, and IF DIVIDE CHECK statements. In the interest of machine independence, they were not included in FORTRAN IV or the Fortran 66 Standard. However, since Fortran 2003 it has been possible to test for numerical issues via calls to functions in the IEEE_EXCEPTIONS module.
Software exception handling continued to be developed in the 1960s and 1970s. LISP 1.5 allowed exceptions to be raised by the ERROR pseudo-function, similarly to errors raised by the interpreter or compiler. Exceptions were caught by the ERRORSET keyword, which returned NIL in case of an error, instead of terminating the program or entering the debugger.
PL/I introduced its own form of exception handling circa 1964, allowing interrupts to be handled with ON units.
The designers of MacLisp observed that ERRSET and ERR were used not only for error raising, but also for non-local control flow, and thus added two new keywords, CATCH and THROW. The cleanup behavior now generally called "finally" was introduced in NIL in the mid- to late-1970s as UNWIND-PROTECT. This was then adopted by Common Lisp. Contemporary with this was dynamic-wind in Scheme, which handled exceptions in closures. The first papers on structured exception handling were published by John B. Goodenough in 1975. Exception handling was subsequently widely adopted by many programming languages from the 1980s onward.

Syntax

Many computer languages have built-in syntactic support for exceptions and exception handling. This includes ActionScript, Ada, BlitzMax, C++, C#, Clojure, COBOL, D, ECMAScript, Eiffel, Java, ML, Object Pascal, PowerBuilder, Objective-C, OCaml, Perl, PHP, PL/I, PL/SQL, Prolog, Python, REALbasic, Ruby, Scala, Smalltalk, Tcl, Visual Prolog and most .NET languages.
Excluding minor syntactic differences, there are only a couple of exception handling styles in use. In the most popular style, an exception is initiated by a special statement (throw or raise) with an exception object or a value of a special extendable enumerated type. The scope for exception handlers starts with a marker clause (such as try) and ends at the start of the first handler clause (catch, except, or rescue). Several handler clauses can follow, and each can specify which exception types it handles and what name it uses for the exception object. As a minor variation, some languages use a single handler clause, which deals with the class of the exception internally.
Also common is a related clause that is executed whether an exception occurred or not, typically to release resources acquired within the body of the exception-handling block. Notably, C++ does not provide this construct, recommending instead the Resource Acquisition Is Initialization technique which frees resources using destructors. According to a 2008 paper by Westley Weimer and George Necula, the syntax of the try...finally blocks in Java is a contributing factor to software defects. When a method needs to handle the acquisition and release of 3–5 resources, programmers are apparently unwilling to nest enough blocks due to readability concerns, even when this would be a correct solution. It is possible to use a single try...finally block even when dealing with multiple resources, but that requires a correct use of sentinel values, which is another common source of bugs for this type of problem.
Python and Ruby also permit a clause (else) that is executed only if no exception occurred before the end of the handler's scope was reached.
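Taken as a whole, exception handling code might look like this in Java (the try statement would sit inside a method body):

import java.io.IOException;
import java.util.Scanner;

try {
    Scanner stdin = new Scanner(System.in);
    String line = stdin.nextLine();
    if (line.isEmpty()) {
        throw new IOException("The line read from the console was empty");
    }
    System.out.printf("Hello %s!%n", line);
} catch (IOException e) {
    // Handles I/O failures, including the one thrown above
    System.out.println("Task failed: " + e.getMessage());
} catch (Exception e) {
    // Handles any other exception type
    System.out.println("Error: " + e.getMessage());
} finally {
    // Runs whether or not an exception occurred
    System.out.println("The program is now terminating.");
}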
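For comparison, the following Python sketch shows the additional clause that Python permits for the case where no exception occurred, alongside the handler and cleanup clauses. The function parse_positive and its log list are hypothetical, introduced only to record which clauses run.

```python
def parse_positive(text):
    """Return (value, log); log records which clauses were executed."""
    log = []
    value = None
    try:
        value = int(text)
        if value <= 0:
            raise ValueError("not positive")
    except ValueError as exc:
        # Runs only if the try body raised ValueError.
        log.append(f"except: {exc}")
        value = None
    else:
        # Runs only if the try body finished without an exception.
        log.append("else")
    finally:
        # Runs in both cases.
        log.append("finally")
    return value, log
```

Placing the success path in the else clause keeps it outside the protected region, so an exception raised there would not be caught by the preceding handlers by accident.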

C does not have try-catch exception handling, but uses return codes for error checking. The setjmp and longjmp standard library functions can be used to implement try-catch handling via macros.
Perl 5 uses die for throw and an eval block followed by a check of $@ for try-catch. It has CPAN modules that offer try-catch semantics.

Termination and resumption semantics

When an exception is thrown, the program searches back through the stack of function calls until an exception handler is found. Some languages call for unwinding the stack as this search progresses. That is, if function f, containing a handler H for exception E, calls function g, which in turn calls function h, and an exception E occurs in h, then functions h and g may be terminated, and H in f will handle E. This is said to be termination semantics.
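The unwinding described above can be traced with a small Python sketch. The functions f, g and h and the exception class E mirror the names in the text; the trace list is a hypothetical device for recording the order of events.

```python
trace = []


class E(Exception):
    pass


def h():
    try:
        raise E("raised in h")
    finally:
        trace.append("h unwound")


def g():
    try:
        h()
        trace.append("g continues")  # never reached: g is terminated
    finally:
        trace.append("g unwound")


def f():
    try:
        g()
    except E as exc:  # the handler H for exception E
        trace.append(f"f handles: {exc}")


f()
```

Under termination semantics, the frames of h and g are gone by the time the handler in f runs, so the handler cannot resume the computation inside h.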
Alternately, the exception handling mechanisms may not unwind the stack on entry to an exception handler, giving the exception handler the option to restart the computation, resume or unwind. This allows the program to continue the computation at exactly the same place where the error occurred or to implement notifications, logging, queries and fluid variables on top of the exception handling mechanism. Allowing the computation to resume where it left off is termed resumption semantics.
There are theoretical and design arguments in favor of either decision. C++ standardization discussions in 1989–1991 resulted in a definitive decision to use termination semantics in C++. Bjarne Stroustrup cites a presentation by Jim Mitchell as a key data point.
Exception-handling languages with resumption include Common Lisp with its Condition System, PL/I, Dylan, R, and Smalltalk. However, the majority of newer programming languages follow C++ and use termination semantics.
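C++ offers std::uncaught_exceptions, which counts the number of exceptions in the current thread that have been thrown or rethrown and have not yet entered a matching catch block; it can be used to detect whether stack unwinding is in progress. The older std::uncaught_exception, which only reported whether unwinding was occurring, was deprecated in C++17 and removed in C++20.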

Exception handling implementation

The implementation of exception handling in programming languages typically involves a fair amount of support from both a code generator and the runtime system accompanying a compiler. Two schemes are most common. The first, dynamic registration, generates code that continually updates structures about the program state in terms of exception handling. Typically, this adds a new element to the stack frame layout that knows what handlers are available for the function or method associated with that frame; if an exception is thrown, a pointer in the layout directs the runtime to the appropriate handler code. This approach is compact in terms of space, but adds execution overhead on frame entry and exit. It was commonly used in many Ada implementations, for example, where complex generation and runtime support was already needed for many other language features. Microsoft's 32-bit Structured Exception Handling uses this approach with a separate exception stack. Dynamic registration, being fairly straightforward to define, is amenable to proof of correctness.
The second scheme, and the one implemented in many production-quality C++ compilers and 64-bit Microsoft SEH, is a table-driven approach. This creates static tables at compile time and link time that relate ranges of the program counter to the program state with respect to exception handling. Then, if an exception is thrown, the runtime system looks up the current instruction location in the tables and determines what handlers are in play and what needs to be done. This approach minimizes execution overhead for the case where an exception is not thrown. This happens at the cost of some space, but this space can be allocated into read-only, special-purpose data sections that are not loaded or relocated until an exception is actually thrown. The code for handling an exception need not be located within the region of memory where the rest of the function's code is stored. So if an exception is thrown, a performance hit, roughly comparable to a function call, may occur if the necessary exception handling code needs to be loaded or cached. However, this scheme has minimal performance cost if no exception is thrown. Since exceptions in C++ are supposed to be exceptional events, the phrase "zero-cost exceptions" is sometimes used to describe exception handling in C++. Like runtime type identification, exceptions might not adhere to C++'s zero-overhead principle, as implementing exception handling at run time requires a non-zero amount of memory for the lookup table. For this reason, exception handling can be disabled in many C++ compilers, which may be useful for systems with very limited memory. This second approach is also superior in terms of achieving thread safety.
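The table lookup at the heart of this scheme can be sketched in Python. This is only a simulation under stated assumptions: the addresses, the handler names and the table layout are invented for illustration and do not correspond to any real compiler's format.

```python
import bisect

# Hypothetical static table: (start_pc, end_pc, handler), sorted by
# start_pc, with non-overlapping half-open ranges [start_pc, end_pc).
HANDLER_TABLE = [
    (0x100, 0x140, "handler_A"),
    (0x140, 0x180, "handler_B"),
    (0x200, 0x260, "handler_C"),
]
STARTS = [row[0] for row in HANDLER_TABLE]


def find_handler(pc):
    """Return the handler covering program counter pc, or None."""
    # Binary search for the last range starting at or before pc.
    i = bisect.bisect_right(STARTS, pc) - 1
    if i >= 0:
        start, end, handler = HANDLER_TABLE[i]
        if start <= pc < end:
            return handler
    return None
```

Nothing in this sketch runs unless an exception is actually thrown, which is the reason the non-throwing path carries no execution overhead; the cost is the memory occupied by the table.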
In comparison to C++ where any type may be thrown and caught, in Java only types extending Throwable can be thrown and caught, and Throwable has two direct descendants: Error, and Exception. Error is typically reserved for extremely serious problems beyond the scope of the program, such as OutOfMemoryError, ThreadDeath, or VirtualMachineError.
Other definitional and implementation schemes have been proposed as well. For languages that support metaprogramming, approaches that involve no overhead at all have been advanced.

Exception handling based on design by contract

A different view of exceptions is based on the principles of design by contract and is supported in particular by the Eiffel language. The idea is to provide a more rigorous basis for exception handling by defining precisely what is "normal" and "abnormal" behavior. Specifically, the approach is based on two concepts:
  • Failure: the inability of an operation to fulfill its contract. For example, an addition may produce an arithmetic overflow, or a routine may fail to meet its postcondition.
  • Exception: an abnormal event occurring during the execution of a routine. Such an abnormal event results from the failure of an operation called by the routine.
The "Safe Exception Handling principle" as introduced by Bertrand Meyer in Object-Oriented Software Construction then holds that there are only two meaningful ways a routine can react when an exception occurs:
  • Failure, or "organized panic": The routine fixes the object's state by re-establishing the invariant, and then fails, triggering an exception in its caller.
  • Retry: The routine tries the algorithm again, usually after changing some values so that the next attempt will have a better chance to succeed.
In particular, simply ignoring an exception is not permitted; a block must either be retried and successfully complete, or propagate the exception to its caller.
Here is an example expressed in Eiffel syntax. It assumes that a routine send_fast is normally the better way to send a message, but it may fail, triggering an exception; if so, the algorithm next uses send_slow, which will fail less often. If send_slow fails, the routine send as a whole should fail, causing the caller to get an exception.

send (m: MESSAGE) is
    -- Send m through fast link, if possible, otherwise through slow link.
local
    tried_fast, tried_slow: BOOLEAN
do
    if tried_fast then
        tried_slow := True
        send_slow (m)
    else
        tried_fast := True
        send_fast (m)
    end
rescue
    if not tried_slow then
        retry
    end
end

The boolean local variables are initialized to False at the start. If send_fast fails, the body (do clause) will be executed again, causing execution of send_slow. If this execution of send_slow fails, the rescue clause will execute to the end with no retry, causing the routine execution as a whole to fail.
This approach has the merit of defining clearly what "normal" and "abnormal" cases are: an abnormal case, causing an exception, is one in which the routine is unable to fulfill its contract. It defines a clear distribution of roles: the do clause is in charge of achieving, or attempting to achieve, the routine's contract; the rescue clause is in charge of reestablishing the context and restarting the process, if this has a chance of succeeding, but not of performing any actual computation.
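Python has no retry instruction, but the control flow of the routine above can be approximated with a loop. The following is a rough sketch, not a faithful translation of Eiffel semantics; the parameters send_fast and send_slow are assumed to be callables supplied by the caller.

```python
def send(m, send_fast, send_slow):
    """Send m through the fast link if possible, otherwise the slow link."""
    tried_fast = tried_slow = False  # both start as False, as in Eiffel
    while True:
        try:
            if tried_fast:
                tried_slow = True
                send_slow(m)
            else:
                tried_fast = True
                send_fast(m)
            return  # the contract was fulfilled
        except Exception:
            if tried_slow:
                # Both links failed: the routine as a whole fails and
                # the exception propagates to the caller.
                raise
            # Otherwise run the body again, which plays the role of retry.
```

As in the Eiffel version, the handler performs no real work of its own; it only decides whether another attempt has a chance of succeeding.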
Although exceptions in Eiffel have a fairly clear philosophy, Kiniry criticizes their implementation because "Exceptions that are part of the language definition are represented by INTEGER values, developer-defined exceptions by STRING values. Additionally, because they are basic values and not objects, they have no inherent semantics beyond that which is expressed in a helper routine which necessarily cannot be foolproof because of the representation overloading in effect."
C++26 adds support for contracts, which are used as follows.

int f(const int x)
    pre(x != 1) // a precondition assertion
    post(r : r == x && r != 2) // a postcondition assertion; r names the result object of f
{
    return x;
}

Uncaught exceptions

Contemporary applications face many design challenges when considering exception handling strategies. Particularly in modern enterprise level applications, exceptions must often cross process boundaries and machine boundaries. Part of designing a solid exception handling strategy is recognizing when a process has failed to the point where it cannot be economically handled by the software portion of the process.
If an exception is thrown and not caught, the uncaught exception is handled by the runtime; the routine that does this is called the uncaught exception handler. The most common default behavior is to terminate the program and print an error message to the console, usually including debug information such as a string representation of the exception and the stack trace. This is often avoided by having a top-level handler that catches exceptions before they reach the runtime.
Note that even though an uncaught exception may result in the program terminating abnormally, the process terminates normally, as the runtime can ensure orderly shutdown of the process.
In a multithreaded program, an uncaught exception in a thread may instead result in termination of just that thread, not the entire process. This is particularly important for servers, where for example a servlet can be terminated without the server overall being affected.
This default uncaught exception handler may be overridden, either globally or per-thread, for example to provide alternative logging or end-user reporting of uncaught exceptions, or to restart threads that terminate due to an uncaught exception. For example, in Java this is done for a single thread via Thread.setUncaughtExceptionHandler and globally via Thread.setDefaultUncaughtExceptionHandler; in Python this is done by modifying sys.excepthook.

Checked exceptions

Java introduced the notion of checked exceptions, which are special classes of exceptions. In Java, a checked exception specifically is any Exception that does not extend RuntimeException. The checked exceptions that a method may raise must be part of the method's signature. For instance, if a method might throw a java.io.IOException, it must declare this fact explicitly in its method signature. Failure to do so raises a compile-time error. This would be declared like so:

import java.io.File;
import java.io.IOException;
import java.util.zip.DataFormatException;

// Indicates that IOException and DataFormatException may be thrown
public void operateOnFile(File f) throws IOException, DataFormatException {
    // ...
}

According to Hanspeter Mössenböck, checked exceptions are less convenient but more robust. Checked exceptions can, at compile time, reduce the incidence of unhandled exceptions surfacing at runtime in a given application.
Kiniry writes that "As any Java programmer knows, the volume of try catch code in a typical Java application is sometimes larger than the comparable code necessary for explicit formal parameter and return value checking in other languages that do not have checked exceptions. In fact, the general consensus among in-the-trenches Java programmers is that dealing with checked exceptions is nearly as unpleasant a task as writing documentation. Thus, many programmers report that they “resent” checked exceptions." Martin Fowler has written "...on the whole I think that exceptions are good, but Java checked exceptions are more trouble than they are worth." As of 2006 no major programming language has followed Java in adding checked exceptions. For example, C# does not require or allow declaration of any exception specifications, a design decision discussed in a post by Eric Gunnerson.
Anders Hejlsberg describes two concerns with checked exceptions:
  • Versioning: A method may be declared to throw exceptions X and Y. In a later version of the code, one cannot throw exception Z from the method, because it would make the new code incompatible with the earlier uses. Checked exceptions require the method's callers to either add Z to their throws clause or handle the exception. Alternately, Z may be misrepresented as an X or a Y.
  • Scalability: In a hierarchical design, each system may have several subsystems. Each subsystem may throw several exceptions. Each parent system must deal with the exceptions of all subsystems below it, resulting in an exponential number of exceptions to be dealt with. Checked exceptions require all of these exceptions to be dealt with explicitly.
To work around these, Hejlsberg says programmers resort to circumventing the feature by using a throws Exception declaration. Another circumvention is to use a try { ... } catch (Exception e) {} handler. This is referred to as catch-all exception handling or Pokémon exception handling after the show's catchphrase "Gotta Catch 'Em All!". The Java Tutorials discourage catch-all exception handling as it may catch exceptions "for which the handler was not intended". Still another discouraged circumvention is to make all exceptions subclass RuntimeException, thus making the exception unchecked. An encouraged solution is to use a catch-all handler or throws clause but with a specific superclass of all potentially thrown exceptions rather than the general superclass Exception. Another encouraged solution is to define and declare exception types that are suitable for the level of abstraction of the called method and map lower level exceptions to these types by using exception chaining.
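Exception chaining, the last technique mentioned, can be sketched in Python, which has no checked exceptions but supports the same mapping through raise ... from. The names ConfigError and load_port are hypothetical and exist only for this illustration.

```python
class ConfigError(Exception):
    """Exception at the abstraction level of load_port's callers."""


def load_port(settings):
    """Return the configured port number as an int."""
    try:
        return int(settings["port"])
    except (KeyError, ValueError) as exc:
        # Map the low-level exception to one that matches the caller's
        # level of abstraction, keeping the original as the cause.
        raise ConfigError("invalid or missing port setting") from exc
```

Callers then need to know only about ConfigError, while the original KeyError or ValueError remains available through the __cause__ attribute for diagnosis.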

Similar mechanisms

The roots of checked exceptions go back to the CLU programming language's notion of exception specification. A function could raise only exceptions listed in its type, but any leaking exceptions from called functions would automatically be turned into the sole runtime exception, failure, instead of resulting in a compile-time error. Later, Modula-3 had a similar feature. These features do not include the compile-time checking that is central to the concept of checked exceptions.
Early versions of the C++ programming language included an optional mechanism similar to checked exceptions, called exception specifications. By default any function could throw any exception, but this could be limited by a throw clause added to the function signature that specified which exceptions the function may throw. For example, this code was valid C++03:

#include <stdexcept>
using std::domain_error;
using std::invalid_argument;
// this could be similar to the Java signature
// void performSomeOperation() throws InvalidArgumentException, ArithmeticException;
void performSomeOperation() throw(invalid_argument, domain_error);

C++ throw clauses could specify any number of any types, even primitives and classes that did not extend std::exception. If no type was named in the throw clause, it would indicate that the function would not throw at all.
Exception specifications were not enforced at compile time. Violations resulted in the global function std::unexpected being called. An empty exception specification could be given, which indicated that the function will throw no exception. This was not made the default when exception handling was added to the language because it would have required too much modification of existing code, would have impeded interaction with code written in other languages, and would have tempted programmers into writing too many handlers at the local level. Explicit use of empty exception specifications could, however, allow C++ compilers to perform significant code and stack layout optimizations that are precluded when exception handling may take place in a function. Some analysts viewed the proper use of exception specifications in C++ as difficult to achieve. This use of exception specifications was included in C++98 and C++03, deprecated in C++11, and removed from the language in C++17. Throw clauses were replaced by noexcept clauses. A function that will not throw any exceptions is now denoted by the noexcept keyword, while noexcept(false) specifies that a function may throw. Although throw clauses naming types were removed in C++17, the empty form throw() remained legal as a deprecated equivalent of noexcept until it was itself removed in C++20. For transitioning a codebase that uses throw clauses, the removed throw(...) syntax can be redefined as a macro. This is only a temporary fix to allow the code to compile, rather than an actual implementation of checked exceptions. Using this, one can similarly imitate Java throws clauses.

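#define THROW_IMPL_0() noexcept
#define THROW_IMPL_1(...) noexcept(false)
#define THROW_SELECT(_0, NAME, ...) NAME
// __VA_OPT__ requires C++20; it selects THROW_IMPL_1 when arguments are present
#define THROW_CHOOSE(...) THROW_SELECT(__VA_OPT__(~, THROW_IMPL_1,) ~, THROW_IMPL_0, ~)
// if throw() is empty, it expands to noexcept,
// otherwise it expands to noexcept(false)
#define throw(...) THROW_CHOOSE(__VA_ARGS__)(__VA_ARGS__)
#define throws(...) throw(__VA_ARGS__)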

One can also specify that a function is noexcept conditionally on another function being noexcept, like so:

void mightThrow();
// The first noexcept is the noexcept clause, the second is the noexcept operator which evaluates to a Boolean value
void f() noexcept(noexcept(mightThrow()));

Though C++ has no checked exceptions, one can propagate the thrown object up the stack when inside a catch block, by writing throw; with no operand. This re-throws the caught object. This allows operations to be done within the catch block that catches it, before choosing to allow the object to continue propagating upwards.
An uncaught exceptions analyzer exists for the OCaml programming language. The tool reports the set of raised exceptions as an extended type signature. But, unlike checked exceptions, the tool does not require any syntactic annotations and is external.
In C++, one can also perform "Pokémon exception handling". Like catch (Throwable t) in Java, C++ supports a catch (...) block, which will catch any thrown object. However, catch (...) has the disadvantage of not naming the caught object, which means it cannot be referred to. This is because in languages like Java, only classes which extend java.lang.Throwable may be thrown, while in C++ any type may be thrown, and thus there is no guaranteed safe way to name a reference to the caught object in a catch-all block.

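import std;
using std::exception;
try {
    // ...
} catch (const exception& e) {
    // Catching only exceptions derived from std::exception
} catch (...) {
    // Catching any other thrown object, which cannot be named here
}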

The Rust language, rather than using exceptions, represents recoverable errors as result types, written Result<T, E>. The advantage of result types over checked exceptions is that, while both force callers to handle errors explicitly, a result type is an ordinary return type within the language's type system, whereas with checked exceptions the declared exception is part of the function signature but not of its return type.
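The idea of a result type can be approximated in Python as a sketch. This is not Rust's Result and Python does not force the caller to inspect it; the classes Ok and Err and the functions safe_div and describe are hypothetical names used only to show errors travelling as ordinary return values.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


Result = Union[Ok[T], Err[E]]


def safe_div(a: float, b: float) -> "Result[float, str]":
    """Divide a by b, reporting failure in the return value."""
    if b == 0:
        return Err("division by zero")
    return Ok(a / b)


def describe(result) -> str:
    """Handle both variants explicitly, as a caller of safe_div would."""
    if isinstance(result, Ok):
        return f"ok: {result.value}"
    return f"error: {result.error}"
```

Because the error is part of the return type, it composes with ordinary functions and containers, which a declared-but-separate exception list does not.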

Dynamic checking of exceptions

The point of exception handling routines is to ensure that the code can handle error conditions. In order to establish that exception handling routines are sufficiently robust, it is necessary to present the code with a wide spectrum of invalid or unexpected inputs, such as can be created via software fault injection and mutation testing. One of the most difficult types of software for which to write exception handling routines is protocol software, since a robust protocol implementation must be prepared to receive input that does not comply with the relevant specification.
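Fault injection of the kind described can be sketched in Python with unittest.mock, which can make a dependency raise a chosen exception on demand. The function read_config and the helper inject_fault are hypothetical, as is the file name settings.ini.

```python
from unittest import mock


def read_config(opener, path):
    """Return the file contents, or "" if the file cannot be read."""
    try:
        with opener(path) as fh:
            return fh.read()
    except OSError:
        return ""


def inject_fault(error):
    """Run read_config with an opener that always raises error."""
    failing_opener = mock.Mock(side_effect=error)
    return read_config(failing_opener, "settings.ini")
```

A test suite can then call inject_fault with each error the handler is supposed to tolerate, and separately confirm that unrelated exceptions still propagate.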
In order to ensure that meaningful regression analysis can be conducted throughout a software development lifecycle process, any exception handling testing should be highly automated, and the test cases must be generated in a scientific, repeatable fashion. Several commercially available systems exist that perform such testing.
In runtime engine environments such as Java or .NET, there exist tools that attach to the runtime engine and every time that an exception of interest occurs, they record debugging information that existed in memory at the time the exception was thrown. These tools are called automated exception handling or error interception tools and provide 'root-cause' information for exceptions.

Asynchronous exceptions

Asynchronous exceptions are events raised by a separate thread or external process, such as pressing Ctrl-C to interrupt a program, receiving a signal, or sending a disruptive message such as "stop" or "suspend" from another thread of execution. Whereas synchronous exceptions happen at a specific throw statement, asynchronous exceptions can be raised at any time. It follows that asynchronous exception handling can't be optimized out by the compiler, as it cannot prove the absence of asynchronous exceptions. They are also difficult to program with correctly, as asynchronous exceptions must be blocked during cleanup operations to avoid resource leaks.
Programming languages typically avoid or restrict asynchronous exception handling. For example, C++ forbids raising exceptions from signal handlers, and Java deprecated its ThreadDeath error, which was used to allow one thread to stop another, in Java 20. Another feature is a semi-asynchronous mechanism that raises an asynchronous exception only during certain operations of the program. For example, Java's Thread.interrupt() only affects the thread when the thread calls an operation that throws InterruptedException. The similar POSIX pthread_cancel API has race conditions which make it impossible to use safely.
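The semi-asynchronous style can be imitated cooperatively in Python, as a rough sketch: another thread only requests an interruption, and the exception is raised inside the worker at points it chooses. The exception class Interrupted and the function worker are hypothetical names, and the short wait merely stands in for a blocking operation.

```python
import threading


class Interrupted(Exception):
    """Raised inside the worker at a safe point."""


def worker(stop_requested, results):
    try:
        for step in range(1_000_000):
            # Safe point: the request is observed only here, never in
            # the middle of an arbitrary statement.
            if stop_requested.is_set():
                raise Interrupted(step)
            stop_requested.wait(0.001)  # stands in for a blocking call
    except Interrupted:
        results.append("cleaned up")


stop = threading.Event()
results = []
t = threading.Thread(target=worker, args=(stop, results))
t.start()
stop.set()  # request the interruption from another thread
t.join()
```

Since the exception can only appear at the safe point, cleanup code elsewhere in the worker cannot be interrupted halfway through.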

Condition systems

Common Lisp, R, Dylan and Smalltalk have a condition system that encompasses the aforementioned exception handling systems. In those languages or environments the advent of a condition implies a function call, and the decision to unwind the stack may be taken only late in the exception handler.
Conditions are a generalization of exceptions. When a condition arises, an appropriate condition handler is searched for and selected, in stack order, to handle the condition. Conditions that do not represent errors may safely go unhandled entirely; their only purpose may be to propagate hints or warnings toward the user.

Continuable exceptions

This is related to the so-called resumption model of exception handling, in which some exceptions are said to be continuable: it is permitted to return to the expression that signaled an exception, after having taken corrective action in the handler. The condition system is generalized thus: within the handler of a non-serious condition, it is possible to jump to predefined restart points that lie between the signaling expression and the condition handler. Restarts are functions closed over some lexical environment, allowing the programmer to repair this environment before exiting the condition handler completely or unwinding the stack even partially.
An example is the ENDPAGE condition in PL/I; the ON unit might write page trailer lines and header lines for the next page, then fall through to resume execution of the interrupted code.

Restarts separate mechanism from policy

Condition handling moreover provides a separation of mechanism and policy. Restarts provide various possible mechanisms for recovering from error, but do not select which mechanism is appropriate in a given situation. That is the province of the condition handler, which has access to a broader view.
An example: Suppose there is a library function whose purpose is to parse a single syslog file entry. What should this function do if the entry is malformed? There is no one right answer, because the same library could be deployed in programs for many different purposes. In an interactive log-file browser, the right thing to do might be to return the entry unparsed, so the user can see it—but in an automated log-summarizing program, the right thing to do might be to supply null values for the unreadable fields, but abort with an error, if too many entries have been malformed.
That is to say, the question can only be answered in terms of the broader goals of the program, which are not known to the general-purpose library function. Nonetheless, exiting with an error message is only rarely the right answer. So instead of simply exiting with an error, the function may establish restarts offering various ways to continue—for instance, to skip the log entry, to supply default or null values for the unreadable fields, to ask the user for the missing values, or to unwind the stack and abort processing with an error message. The restarts offered constitute the mechanisms available for recovering from error; the selection of restart by the condition handler supplies the policy.
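The log-parsing scenario can be sketched in Python by emulating restarts with plain callables. Python has no condition system, so this only approximates the idea: parse_entry offers named restarts (the mechanism) and a caller-supplied function choose_restart selects one (the policy). All names and the key=value entry format are assumptions made for this illustration.

```python
class MalformedEntry(Exception):
    pass


def parse_entry(line, choose_restart):
    """Parse 'key=value'; on malformed input, let the caller pick a restart."""
    if "=" in line:
        key, value = line.split("=", 1)
        return (key, value)

    def abort():
        raise MalformedEntry(line)

    # Mechanisms offered by the library function.
    restarts = {
        "skip": lambda: None,
        "use_default": lambda: ("unparsed", line),
        "abort": abort,
    }
    # Policy: the handler chooses while parse_entry's frame still exists,
    # so no stack unwinding has happened yet.
    return restarts[choose_restart(line)]()


def parse_log(lines, choose_restart):
    """Parse every line, dropping entries for which a restart returned None."""
    parsed = (parse_entry(line, choose_restart) for line in lines)
    return [entry for entry in parsed if entry is not None]
```

An interactive browser might pass a policy that always returns "use_default", while a batch summarizer might return "skip" until some threshold is reached and then "abort".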

Criticism

Exception handling is often not handled correctly in software, especially when there are multiple sources of exceptions; data flow analysis of 5 million lines of Java code found over 1300 exception handling defects.
Citing multiple prior studies by others and their own results, Weimer and Necula wrote that a significant problem with exceptions is that they "create hidden control-flow paths that are difficult for programmers to reason about". "While try-catch-finally is conceptually simple, it has the most complicated execution description in the language specification and requires four levels of nested “if”s in its official English description. In short, it contains a large number of corner cases that programmers often overlook."
Exceptions, as unstructured flow, increase the risk of resource leaks or inconsistent state. There are various techniques for resource management in the presence of exceptions, most commonly combining the dispose pattern with some form of unwind protection, which automatically releases the resource when control exits a section of code.
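Unwind protection for resources can be sketched in Python with a context manager, which releases the resource when control leaves the block for any reason. The resource names and the events list are hypothetical and serve only to record the order of operations.

```python
from contextlib import contextmanager

events = []


@contextmanager
def acquired(name):
    """Acquire a named resource and guarantee its release."""
    events.append(f"acquire {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and when an exception unwinds the block.
        events.append(f"release {name}")


def risky():
    with acquired("db"), acquired("file"):
        raise RuntimeError("failure while holding both resources")


try:
    risky()
except RuntimeError:
    events.append("handled")
```

Both resources are released in reverse order of acquisition before the handler runs, without any nested try blocks in risky itself.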
Tony Hoare in 1980 described the Ada programming language as having "...a plethora of features and notational conventions, many of them unnecessary and some of them, like exception handling, even dangerous. Do not allow this language in its present state to be used in applications where reliability is critical. The next rocket to go astray as a result of a programming language error may not be an exploratory space rocket on a harmless trip to Venus: It may be a nuclear warhead exploding over one of our own cities."
The Go developers believe that the try-catch-finally idiom obfuscates control flow, and introduced the exception-like panic/recover mechanism. recover() differs from catch in that it can only be called from within a defer code block in a function, so the handler can only do clean-up and change the function's return values, and cannot return control to an arbitrary point within the function. The defer block itself functions similarly to a finally clause.
The Rust language does not have exceptions. It instead uses Result<T, E> for handling runtime errors, and for serious errors the panic! macro is used.
