Debugging

In engineering, debugging is the process of finding the root cause, workarounds, and possible fixes for bugs.
For software, debugging tactics can involve interactive debugging, control flow analysis, log file analysis, monitoring at the application or system level, memory dumps, and profiling. Many programming languages and software development tools also offer programs to aid in debugging, known as debuggers.

Etymology

The term bug, in the sense of defect, dates back at least to 1878, when Thomas Edison described "little faults and difficulties" in his inventions as "Bugs".
A popular story from the 1940s is from Admiral Grace Hopper. While she was working on a Mark II computer at Harvard University, her associates discovered a moth stuck in a relay that impeded operation and wrote in a log book "First actual case of a bug being found". Although probably a joke, conflating the two meanings of bug, the story indicates that the term was used in the computer field at that time.
Similarly, the term debugging was used in aeronautics before entering the world of computers. J. Robert Oppenheimer, director of the WWII atomic bomb Manhattan Project at Los Alamos, used the term in a letter to Dr. Ernest Lawrence at UC Berkeley, dated October 27, 1944, regarding the recruitment of additional technical staff.
The Oxford English Dictionary entry for debug uses the term debugging in reference to airplane engine testing in a 1945 article in the Journal of the Royal Aeronautical Society.
An article in "Airforce" refers to debugging aircraft cameras.
The seminal article by Gill in 1951 is the earliest in-depth discussion of programming errors, but it does not use the term bug or debugging.
In the ACM's digital library, the term debugging is first used in three papers from the 1952 ACM National Meetings. Two of the three use the term in quotation marks.
By 1963, debugging was a common enough term to be mentioned in passing without explanation on page 1 of the CTSS manual.

Scope

As software and electronic systems have become generally more complex, the common debugging techniques have expanded with more methods to detect anomalies, assess impact, and schedule software patches or full updates to a system. The words "anomaly" and "discrepancy" can be used as more neutral terms, to avoid the words "error", "defect", or "bug" where these might imply that all so-called errors, defects, or bugs must be fixed. Instead, an impact assessment can be made to determine whether changes to remove an anomaly would be cost-effective for the system, or whether a scheduled new release might render the change unnecessary. Not all issues are safety-critical or mission-critical in a system. It is also important to avoid the situation where a change might be more upsetting to users, long-term, than living with the known problem. Basing decisions on the acceptability of some anomalies can avoid a culture of a "zero-defects" mandate, where people might be tempted to deny the existence of problems so that the result would appear as zero defects. Once collateral issues such as the cost-versus-benefit impact assessment are considered, broader debugging techniques expand to determine the frequency of anomalies, which helps assess their impact on the overall system.

Tools

Debugging ranges in complexity from fixing simple errors to performing lengthy and tiresome tasks of data collection, analysis, and scheduling updates. The debugging skill of the programmer can be a major factor in the ability to debug a problem, but the difficulty of software debugging varies greatly with the complexity of the system, and also depends, to some extent, on the programming language used and the available tools, such as debuggers. Debuggers are software tools which enable the programmer to monitor the execution of a program, stop it, restart it, set breakpoints, and change values in memory. The term debugger can also refer to the person who is doing the debugging.
Generally, high-level programming languages, such as Java, make debugging easier, because they have features such as exception handling and type checking that make real sources of erratic behaviour easier to spot. In programming languages such as C or assembly, bugs may cause silent problems such as memory corruption, and it is often difficult to see where the initial problem happened. In those cases, memory debugger tools may be needed.
In certain situations, general purpose software tools that are language specific in nature can be very useful. These take the form of static code analysis tools. These tools look for a very specific set of known problems, some common and some rare, within the source code, concentrating more on the semantics than on the syntax that compilers and interpreters check.
Both commercial and free tools exist for various languages; some claim to be able to detect hundreds of different problems. These tools can be extremely useful when checking very large source trees, where it is impractical to do code walk-throughs. A typical example of a problem detected would be a variable dereference that occurs before the variable is assigned a value. As another example, some such tools perform strong type checking when the language does not require it. Thus, they are better at locating likely errors in code that is syntactically correct. However, these tools have a reputation for false positives, where correct code is flagged as dubious. The old Unix lint program is an early example.
For debugging electronic hardware as well as low-level software and firmware, instruments such as oscilloscopes, logic analyzers, or in-circuit emulators are often used, alone or in combination. An ICE may perform many of the typical software debugger's tasks on low-level software and firmware.

Debugging process

The debugging process normally begins with identifying the steps to reproduce the problem. This can be a non-trivial task, particularly with parallel processes and some Heisenbugs. The specific user environment and usage history can also make it difficult to reproduce the problem.
After the bug is reproduced, the input of the program may need to be simplified to make it easier to debug. For example, a bug in a compiler can make it crash when parsing a large source file. However, after simplification of the test case, only a few lines from the original source file can be sufficient to reproduce the same crash. Simplification may be done manually using a divide-and-conquer approach, in which the programmer attempts to remove some parts of the original test case and then checks whether the problem still occurs. When debugging in a GUI, the programmer can try skipping some user interaction from the original problem description to check whether the remaining actions are sufficient for causing the bug to occur.
After the test case is sufficiently simplified, a programmer can use a debugger tool to examine program states and track down the origin of the problem. Alternatively, tracing can be used. In simple cases, tracing is just a few print statements which output the values of variables at particular points during the execution of the program.

Techniques

Interactive debugging uses debugger tools which allow a program's execution to be processed one step at a time and to be paused to inspect or alter its state. Subroutines or function calls may typically be executed at full speed and paused again upon return to their caller, or themselves single stepped, or any mixture of these options. Breakpoints may be installed which permit full speed execution of code that is not suspected to be faulty, and then stop at a point that is. Putting a breakpoint immediately after the end of a program loop is a convenient way to evaluate repeating code. Watchpoints are commonly available, which let execution proceed until a particular variable changes, as are catchpoints, which cause the debugger to stop for certain kinds of program events, such as exceptions or the loading of a shared library.
Print debugging or tracing is the act of watching trace statements, or print statements, that indicate the flow of execution of a process and the data progression. Tracing can be done with specialized tools or by insertion of trace statements into the source code. The latter is sometimes called printf debugging, due to the use of the printf function in C. This kind of debugging was turned on by the command TRON in the original versions of the novice-oriented BASIC programming language. TRON stood for "Trace On". TRON caused the line number of each BASIC command line to print as the program ran.
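A rough analogue of TRON can be sketched in Python with the interpreter's sys.settrace hook, which reports each source line as it executes. This is only an illustration: the helper trace_lines and the sample function collatz_steps are hypothetical names introduced here, and the sketch records line offsets relative to the start of the function rather than printing BASIC-style line numbers.

```python
import sys

def trace_lines(func, *args):
    """Run func(*args); return (result, offsets of the lines that executed)."""
    executed = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only follow frames that run the target function's own code.
        if frame.f_code is not code:
            return None
        if event == "line":
            # Offset 0 is the "def" line; 1 is the first body line, and so on.
            executed.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always switch tracing off again
    return result, executed

def collatz_steps(n):
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps

result, lines = trace_lines(collatz_steps, 6)
print("result:", result)
print("line offsets executed:", lines)
```

The printed list shows the order in which lines ran, which makes it easy to see how often each branch of the loop was taken. Inserting plain print calls at chosen points gives the same information with less machinery.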
Activity tracing is like tracing, but rather than following program execution one instruction or function at a time, it follows program activity based on the overall amount of time spent by the processor executing particular segments of code. This is typically presented as a fraction of the program's execution time spent processing instructions within defined memory addresses or certain program modules. If the program being debugged is shown to be spending an inordinate fraction of its execution time within traced areas, this could indicate misallocation of processor time caused by faulty program logic, or at least inefficient allocation of processor time that could benefit from optimization efforts.
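A profiler produces this kind of per-segment time breakdown. The following is a minimal sketch using Python's standard cProfile and pstats modules. The functions slow_sum, fast_sum, and workload are hypothetical examples, and Stats.get_stats_profile assumes Python 3.9 or later.

```python
import cProfile
import pstats

def slow_sum(n):
    # Sums the squares 0^2 .. (n-1)^2 one term at a time.
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_sum(n):
    # Closed form for the same sum of squares.
    return (n - 1) * n * (2 * n - 1) // 6

def workload():
    return slow_sum(200_000), fast_sum(200_000)

profiler = cProfile.Profile()
profiler.enable()
results = workload()
profiler.disable()

stats = pstats.Stats(profiler)
# Map each function name to its profile (call count, own time, cumulative time).
by_name = stats.get_stats_profile().func_profiles

for name in ("slow_sum", "fast_sum"):
    print(name, "own time (s):", by_name[name].tottime)
```

Both functions return the same value, so a report in which slow_sum dominates the run time points to an inefficiency rather than a correctness bug.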
Remote debugging is the process of debugging a program running on a system different from the debugger. To start remote debugging, a debugger connects to a remote system over a communications link such as a local area network. The debugger can then control the execution of the program on the remote system and retrieve information about its state.
Post-mortem debugging is debugging of the program after it has already crashed. Related techniques often include various tracing techniques like examining log files, outputting a call stack on the crash, and analysis of a memory dump of the crashed process. The dump of the process could be obtained automatically by the system, or by a programmer-inserted instruction, or manually by the interactive user.
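As a small-scale illustration of outputting a call stack on a crash, the sketch below captures a Python exception and extracts the call stack plus the local variables of the frame that failed. It is a simplified stand-in for a real memory dump, and the names parse_ratio and post_mortem_report are hypothetical.

```python
import traceback

def parse_ratio(text):
    numerator, denominator = text.split("/")
    return int(numerator) / int(denominator)

def post_mortem_report(exc):
    """Summarize a crash: call stack plus locals of the failing frame."""
    tb = exc.__traceback__
    # Walk to the innermost traceback entry, where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    return {
        "error": repr(exc),
        "function": frame.f_code.co_name,
        "line": tb.tb_lineno,
        "locals": dict(frame.f_locals),
        "stack": traceback.format_exception(type(exc), exc, exc.__traceback__),
    }

try:
    parse_ratio("3/0")
except ZeroDivisionError as exc:
    report = post_mortem_report(exc)

print(report["function"], report["locals"])
print("".join(report["stack"]))
```

In interactive use, the standard pdb module offers pdb.post_mortem, which opens a debugger on the same traceback instead of building a report.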
"Wolf fence" algorithm: Edward Gauss described this simple but very useful and now famous algorithm in a 1982 article for Communications of the ACM as follows: "There's one wolf in Alaska; how do you find it? First build a fence down the middle of the state, wait for the wolf to howl, determine which side of the fence it is on. Repeat process on that side only, until you get to the point where you can see the wolf." This is implemented e.g. in the Git version control system as the command git bisect, which uses the above algorithm to determine which commit introduced a particular bug.
Record and replay debugging is the technique of creating a program execution recording, which can be replayed and interactively debugged. It is useful for remote debugging and for debugging intermittent, non-deterministic, and other hard-to-reproduce defects.
Time travel debugging is the process of stepping back in time through source code to understand what is happening during execution of a computer program; to allow users to interact with the program; to change the history if desired and to watch how the program responds.
Delta debugging is a technique of automating test case simplification.
Saff Squeeze is a technique of isolating failure within the test using progressive inlining of parts of the failing test.
Causality tracking: There are techniques to track the cause-effect chains in the computation. Those techniques can be tailored for specific bugs, such as null pointer dereferences.
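Automated test case simplification can be sketched as follows. This is a simplified variant of the ddmin procedure associated with delta debugging, not a reference implementation. The predicate fails is a hypothetical stand-in for "run the program on this input and check whether the bug still occurs".

```python
def ddmin(items, fails):
    """Shrink `items` to a smaller list for which fails() is still true."""
    assert fails(items), "the starting input must reproduce the failure"
    chunks = 2
    while len(items) >= 2:
        size = max(1, len(items) // chunks)
        subsets = [items[i:i + size] for i in range(0, len(items), size)]
        reduced = False
        for index, subset in enumerate(subsets):
            # Everything except the current subset.
            complement = [x for j, s in enumerate(subsets) if j != index for x in s]
            if fails(subset):
                # One chunk alone reproduces the bug: keep only that chunk.
                items, chunks, reduced = subset, 2, True
                break
            if fails(complement):
                # The chunk is irrelevant: drop it and keep the rest.
                items, chunks, reduced = complement, max(chunks - 1, 2), True
                break
        if not reduced:
            if chunks >= len(items):
                break  # already at single-element granularity; give up
            chunks = min(len(items), chunks * 2)  # try finer-grained chunks
    return items

# Hypothetical failure condition: the bug needs both 3 and 7 in the input.
minimal = ddmin(list(range(10)), lambda xs: 3 in xs and 7 in xs)
print(minimal)
```

The result is minimal only in the sense that removing any single remaining element makes the failure disappear. It is not guaranteed to be the smallest possible failing input.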
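The "wolf fence" search described earlier in this section amounts to a binary search over an ordered history. The sketch below is an illustration under stated assumptions, not Git's implementation: it assumes a linear list of revisions in which the first is known good, the last is known bad, and every revision after the faulty one is also bad. The names find_first_bad and is_bad are hypothetical.

```python
def find_first_bad(revisions, is_bad):
    """Return the index of the first revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and that once a
    revision is bad, all later ones are bad too.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        middle = (good + bad) // 2  # build the fence down the middle
        if is_bad(revisions[middle]):
            bad = middle   # the wolf is in the earlier half
        else:
            good = middle  # the wolf is in the later half
    return bad

revisions = [f"r{i}" for i in range(10)]
tested = []

def is_bad(revision):
    tested.append(revision)
    return int(revision[1:]) >= 6  # hypothetical: the bug arrived in r6

first_bad = find_first_bad(revisions, is_bad)
print(revisions[first_bad], "found after testing", tested)
```

Each test halves the remaining range, so the number of revisions that must be built and checked grows only logarithmically with the length of the history.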