Software safety
Software safety is an engineering discipline that aims to ensure that software used in safety-related systems does not contribute to any hazards such a system might pose.
Numerous standards govern how safety-related software should be developed and assured in various domains. Most of them classify software according to its criticality and propose techniques and measures to be employed during development and assurance:
- Software for generic electronic safety-related systems: IEC 61508
- Automotive software: ISO 26262
- Railway software: EN 50716
- Airborne software: DO-178C/ED-12C
- Air traffic management software: DO-278A/ED-109A
- Medical devices: IEC 62304
- Nuclear power plants: IEC 60880
Terminology
The goal of software safety is to make sure that software does not cause or contribute to any hazards in the system where it is used and that it can be assured and demonstrated that this is the case. This is typically achieved by the assignment of a "safety level" to the software and the selection of appropriate processes for the development and assurance of the software.
Assignment of safety levels
One of the first steps when creating safety-related software is to classify it according to its safety criticality. Various standards define different levels, e.g. Software Levels A-E in DO-178C, SIL 1-4 in IEC 61508 and ASIL A-D in ISO 26262. The assignment is typically done in the context of an overarching system, where the worst-case consequences of software failures are investigated. For example, the automotive standard ISO 26262 requires a hazard analysis and risk assessment at vehicle level to derive the ASIL of the software executed on a component.
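As an illustration of how such a classification can be mechanised, the sketch below maps a hazardous event to an ASIL. The three input classes (severity S1 to S3, exposure E1 to E4, controllability C1 to C3) and the additive shortcut are not described in the text above; they are assumptions based on the commonly cited ASIL determination table of ISO 26262-3, and the function name `determine_asil` is hypothetical. The table in the standard itself remains the authority.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return "QM" or "A".."D" for classes S1-S3, E1-E4 and C1-C3.

    Hypothetical helper: it assumes the commonly cited property of the
    ASIL determination table that the result depends only on the sum of
    the three class indices.
    """
    if severity not in (1, 2, 3):
        raise ValueError("severity class must be 1..3")
    if exposure not in (1, 2, 3, 4):
        raise ValueError("exposure class must be 1..4")
    if controllability not in (1, 2, 3):
        raise ValueError("controllability class must be 1..3")

    total = severity + exposure + controllability
    # Sums of 6 or less carry no ASIL and are handled by normal
    # quality management (QM).
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")


# Worst case on all three axes versus a benign case.
print(determine_asil(3, 4, 3))
print(determine_asil(1, 1, 1))
```

Under these assumptions, only the combination S3/E4/C3 yields ASIL D, which reflects the idea that the most demanding level is reserved for the worst-case consequences.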
Process adherence and assurance
It is essential to use an adequate development and assurance process, with appropriate methods and techniques, commensurate with the safety criticality of the software. Software safety standards recommend, and sometimes forbid, particular methods and techniques depending on the safety level. Most standards suggest a lifecycle model and prescribe activities to be carried out during the various phases of the software lifecycle. For example, IEC 61508 requires that software is specified adequately, that the software design is modular and testable, that adequate programming languages are used, that documented code reviews are performed, and that testing is performed at several levels to achieve adequately high test coverage.
The focus on the software development and assurance process stems from the fact that software quality is heavily influenced by the software process, as suggested by ISO/IEC 25010. The underlying claim is that the process influences internal software quality attributes, which in turn influence external software quality attributes.
The following activities and topics addressed in the development process contribute to safe software.
Documentation
Comprehensive documentation of the complete development and assurance process is required by virtually all software safety standards. Typically, this documentation is reviewed and endorsed by third parties and is therefore a prerequisite for the approval of safety-related software. The documentation includes planning documents, requirements specifications, software architecture and design documentation, test cases at various abstraction levels, tool qualification reports, review evidence, and verification and validation results. Figure C.2 in EN 50716 lists 32 documents that need to be created along the development lifecycle.
Traceability
Traceability is the practice of establishing relationships between different types of requirements and between requirements and design, implementation and testing artefacts. According to EN 50716, the objective “is to ensure that all requirements can be shown to have been properly met and that no untraceable material has been introduced”. By documenting and maintaining traceability, it becomes possible to follow, for example, a safety requirement into the design of a system, on into the software source code, and to an appropriate test case and test execution.
Software implementation
Safety standards can have requirements that directly affect the implementation of the software in source code, such as the selection of an appropriate programming language, the size and complexity of functions, the use of certain programming constructs and the need for coding standards. Part 3 of IEC 61508 contains the following requirements and recommendations:
- Use of a strongly typed programming language. Some languages are better suited than others for safety-related systems. Languages that support strong typing can detect more faults at compile time that would otherwise only be detected at runtime. Therefore, assembler is typically discouraged, whereas high-level languages geared towards the safety-related market are recommended.
- Use of an appropriate coding standard defining a “safe” language subset, e.g. MISRA C. MISRA C is a coding standard for the C programming language that aims to improve code quality and safety by disallowing error-prone constructs and features that are compiler-dependent.
- Limiting the use of recursion, pointers and interrupts.
- Disallowing “unstructured control flow in programs”, i.e. avoiding jumping in an unstructured way, e.g. by using “goto”-like statements.
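Test coverage
Standards typically tie the required structural test coverage to the safety level. The levels and quoted definitions below are those used in DO-178C: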
- Level C: Statement coverage is required - i.e. "every statement in the program has been invoked at least once" during testing.
- Level B: Decision coverage (often referred to as branch coverage) is required - i.e. "every point of entry and exit in the program has been invoked at least once and every decision in the program has taken on all possible outcomes at least once."
- Level A: Modified condition/decision coverage (MC/DC) is required - an extension of decision coverage, with the additional requirement that "each condition in a decision has been shown to independently affect that decision's outcome."
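The difference between the Level B and Level A criteria is easiest to see on a concrete decision. The following is a minimal sketch, not a qualified coverage tool: the helper `independently_shown` and the example decision `(a and b) or c` are hypothetical, and the check implements only the "unique cause" reading of the quoted criterion, in which two test cases differ in exactly one condition and produce different outcomes.

```python
from itertools import combinations
from typing import Callable, Sequence, Set, Tuple


def independently_shown(
    decision: Callable[..., bool],
    tests: Sequence[Tuple[bool, ...]],
) -> Set[int]:
    """Return the indices of conditions shown to independently affect
    the decision: some pair of tests differs only in that condition
    and yields different decision outcomes."""
    shown: Set[int] = set()
    for t1, t2 in combinations(tests, 2):
        differing = [i for i, (x, y) in enumerate(zip(t1, t2)) if x != y]
        if len(differing) == 1 and decision(*t1) != decision(*t2):
            shown.add(differing[0])
    return shown


def decision(a: bool, b: bool, c: bool) -> bool:
    # Example decision with three conditions (hypothetical).
    return (a and b) or c


# Two tests drive the decision to both outcomes, which satisfies the
# decision-outcome part of the Level B criterion, but no single
# condition is isolated.
branch_only = [(True, True, False), (False, False, False)]

# Four tests (number of conditions + 1) isolate every condition.
mcdc_set = [
    (True, True, False),
    (False, True, False),
    (True, False, False),
    (True, False, True),
]

print(independently_shown(decision, branch_only))  # no condition isolated
print(independently_shown(decision, mcdc_set))     # all three conditions
```

For this decision, the first test set takes the decision to both outcomes yet demonstrates nothing about the individual conditions, while the second set shows each of the three conditions flipping the outcome on its own.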
Independence
Several standards require that certain assurance roles and activities be independent of the people who develop the software. For example, EN 50716 Figure 2 requires the roles “implementer”, “tester” and “verifier” to be held by different people, the role “validator” to be held by a person with a different reporting line, and the role “assessor” to be held by a person from a different organizational unit.
DO-178C and DO-278A require several activities to be executed “with independence”, with independence being defined as “separation of responsibilities which ensures the accomplishment of objective evaluation”.
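The EN 50716 role constraints, as paraphrased in this section, can be expressed as a simple check on a project's role assignment. The sketch below is illustrative only: the `Person` record, its fields and the function `independence_violations` are hypothetical, and it assumes that "different reporting line" and "different organizational unit" are measured against the implementer. It encodes only the rules stated here, not the full figure from the standard.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Person:
    name: str
    reporting_line: str
    org_unit: str


def independence_violations(roles: Dict[str, Person]) -> List[str]:
    """Return a list of violated independence rules (empty if none)."""
    problems: List[str] = []

    # Implementer, tester and verifier must be three different people.
    names = [roles[r].name for r in ("implementer", "tester", "verifier")]
    if len(set(names)) < len(names):
        problems.append("implementer, tester and verifier must be different people")

    # Assumption: independence is measured relative to the implementer.
    implementer = roles["implementer"]
    if roles["validator"].reporting_line == implementer.reporting_line:
        problems.append("validator shares the implementer's reporting line")
    if roles["assessor"].org_unit == implementer.org_unit:
        problems.append("assessor belongs to the implementer's organizational unit")
    return problems


alice = Person("Alice", "software development", "engineering")
bob = Person("Bob", "software development", "engineering")
carol = Person("Carol", "software development", "engineering")
dave = Person("Dave", "quality management", "engineering")
erin = Person("Erin", "assessment", "independent safety assessment")

compliant = {
    "implementer": alice,
    "tester": bob,
    "verifier": carol,
    "validator": dave,
    "assessor": erin,
}
# Alice now also verifies and validates her own work.
flawed = dict(compliant, verifier=alice, validator=alice)

print(independence_violations(compliant))
print(independence_violations(flawed))
```

A check of this kind could run whenever the project organisation changes, so that a violated independence rule is noticed before review evidence is produced under it.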
Open questions and issues
Software failure rates
In system safety engineering, it is common to allocate upper bounds for failure rates of subsystems or components. It must then be shown that these subsystems or components do not exceed their allocated failure rates; otherwise redundancy or other fault tolerance mechanisms must be employed. This approach is not practicable for software.
Software failure rates cannot be predicted with any confidence. Although significant research in the field of software reliability has been conducted, current software safety standards do not require any of these methods to be used, and some even discourage their use. For example, DO-178C states: “Many methods for predicting software reliability based on developmental metrics have been published, for example, software structure, defect detection rate, etc. This document does not provide guidance for those types of methods, because at the time of writing, currently available methods did not provide results in which confidence can be placed.” ARP 4761 clause 4.1.2 states that software design errors “are not the same as hardware failures. Unlike hardware failures, probabilities of such errors cannot be quantified.”