Availability
In reliability engineering, the term availability has the following meanings:
- The degree to which a system, subsystem or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown (i.e. random) time.
- The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment.
Representation
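The simplest representation of availability (A) is the ratio of the expected value of the uptime of a system to the sum of the expected values of its uptime and downtime:
$$A = \frac{E[\mathrm{uptime}]}{E[\mathrm{uptime}] + E[\mathrm{downtime}]}$$
Another equation for availability is the ratio of the Mean Time To Failure (MTTF) to the Mean Time Between Failures (MTBF), where MTBF is taken here as the sum of the MTTF and the Mean Time To Repair (MTTR):
$$A = \frac{\mathrm{MTTF}}{\mathrm{MTBF}} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$$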
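If we define the status function as
$$X(t) = \begin{cases} 1, & \text{if the system functions at time } t \\ 0, & \text{otherwise} \end{cases}$$
then the availability A(t) at time t > 0 is represented by
$$A(t) = \Pr[X(t) = 1] = E[X(t)].$$
Average availability must be defined on an interval of the real line. If we consider an arbitrary constant c > 0, then average availability is represented as
$$A_c = \frac{1}{c} \int_0^c A(t)\,dt.$$
Limiting (steady-state) availability is represented by
$$A = \lim_{t \to \infty} A(t).$$
Limiting average availability is also defined on an interval [0, c] as
$$A_\infty = \lim_{c \to \infty} A_c = \lim_{c \to \infty} \frac{1}{c} \int_0^c A(t)\,dt, \quad c > 0.$$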
Availability is the probability that an item will be in an operable and committable state at the start of a mission when the mission is called for at a random time, and is generally defined as uptime divided by total time.
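As a minimal sketch of these definitions, the following Python functions compute steady-state availability from uptime and downtime, or from MTTF and MTTR. The function names are illustrative and do not come from any library. The two time-dependent functions additionally assume a two-state model with exponentially distributed times to failure and repair; that model is an assumption made here only for illustration. Under it, instantaneous and average availability have closed forms that both converge to MTTF/(MTTF + MTTR).

```python
import math


def availability_from_times(uptime: float, downtime: float) -> float:
    """Uptime divided by total time (uptime + downtime)."""
    total = uptime + downtime
    if total <= 0:
        raise ValueError("total time must be positive")
    return uptime / total


def availability_from_mttf(mttf: float, mttr: float) -> float:
    """Steady-state availability MTTF / (MTTF + MTTR)."""
    return availability_from_times(mttf, mttr)


def instantaneous_availability(t: float, mttf: float, mttr: float) -> float:
    """A(t) for a two-state model with the system up at t = 0.

    Assumes exponential failure and repair times (illustrative model choice).
    """
    lam, mu = 1.0 / mttf, 1.0 / mttr  # failure rate, repair rate
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)


def average_availability(c: float, mttf: float, mttr: float) -> float:
    """(1/c) * integral of A(t) over [0, c] for the same two-state model."""
    if c <= 0:
        raise ValueError("c must be positive")
    lam, mu = 1.0 / mttf, 1.0 / mttr
    s = lam + mu
    return mu / s + (lam / (s * s * c)) * (1.0 - math.exp(-s * c))


if __name__ == "__main__":
    mttf, mttr = 1000.0, 10.0  # hours, illustrative values only
    print(availability_from_mttf(mttf, mttr))
    print(instantaneous_availability(0.0, mttf, mttr))
    print(average_availability(1e6, mttf, mttr))
```

In this model the system starts in the up state, so A(0) = 1 and the average availability over any finite interval is slightly higher than the limiting value.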
Series vs Parallel components
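Suppose a series system is composed of components A, B and C, with availabilities $A_A$, $A_B$ and $A_C$. Then the following formula applies:
$$\text{Availability of series components} = A_A \times A_B \times A_C$$
Therefore, the combined availability of multiple components in series is never higher than the availability of any individual component, and is strictly lower whenever the other components are less than 100% available.
On the other hand, the following formula applies to parallel components, assuming they fail independently and any one of them is enough to keep the system up:
$$\text{Availability of parallel components} = 1 - (1 - A_A) \times (1 - A_B) \times (1 - A_C)$$
As a corollary, if you have N parallel components, each with availability X, then:
$$\text{Availability of parallel components} = 1 - (1 - X)^N$$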
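Using parallel components can greatly increase the availability of the overall system, because the combined unavailability shrinks exponentially with the number of components. For example, if each of your hosts has only 50% availability, then by using 10 hosts in parallel you can achieve $1 - 0.5^{10} \approx 99.9023\%$ availability.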
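The series and parallel rules can be sketched in a few lines of Python. The function names are illustrative, and both functions assume that component failures are independent.

```python
from math import prod
from typing import Iterable


def series_availability(availabilities: Iterable[float]) -> float:
    """All components must be up: product of the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """At least one component must be up: 1 - product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in availabilities)


if __name__ == "__main__":
    print(series_availability([0.99, 0.99, 0.99]))  # about 0.9703
    print(parallel_availability([0.5] * 10))        # 0.9990234375
```

Three components at 99% in series give roughly 97.03%, while ten hosts at 50% in parallel reproduce the 99.9023% figure from the example.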
Note that redundancy doesn’t always lead to higher availability. In fact, redundancy increases complexity, which in turn can reduce availability. According to Marc Brooker, to take advantage of redundancy, ensure that:
- You achieve a net-positive improvement in the overall availability of your system
- Your redundant components fail independently
- Your system can reliably detect healthy redundant components
- Your system can reliably scale out and scale in redundant components
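One way to see why redundancy is not automatically a net improvement is a toy model in which the mechanism that detects failures and routes around them is itself a component in series with the redundant group. This model and the function names are assumptions made for illustration; they are not taken from Brooker's text.

```python
def redundant_group_availability(component: float, n: int, failover: float) -> float:
    """Availability of n independent replicas behind a failover mechanism.

    Simplifying assumption (illustrative only): the replicas are in parallel,
    and the failover mechanism is a single component in series with them.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    parallel = 1.0 - (1.0 - component) ** n
    return failover * parallel


def is_net_positive(component: float, n: int, failover: float) -> bool:
    """True if the redundant design beats a single component on its own."""
    return redundant_group_availability(component, n, failover) > component


if __name__ == "__main__":
    # Two 99% replicas behind a very reliable failover mechanism: improvement.
    print(is_net_positive(0.99, 2, 0.9999))
    # Same replicas behind a 98% failover mechanism: worse than one replica.
    print(is_net_positive(0.99, 2, 0.98))
```

Under these assumptions, the added failover component has to be considerably more available than a single replica for the redundant design to pay off.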
Methods and techniques to model availability
- Reliability models
- Maintainability models
- Maintenance concepts
- Redundancy
- Common cause failure
- Diagnostics
- Level of repair
- Repair status
- Dormant failures
- Test coverage
- Active operational times / missions / sub system states
- Logistical aspects such as spare part levels at different depots, transport times, repair times at different repair lines, manpower availability and more
- Uncertainty in parameters
Definitions within systems engineering
Availability, inherent
The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment. It excludes logistics time, waiting or administrative downtime, and preventive maintenance downtime. It includes corrective maintenance downtime.
Inherent availability is generally derived from analysis of an engineering design:
- The impact of a repairable element on the availability of the system in which it operates equals $\mathrm{MTBF}/(\mathrm{MTBF} + \mathrm{MTTR})$, where MTBF is the mean time between failures and MTTR is the mean time to repair.
- The impact of a one-off/non-repairable element on the availability of the system in which it operates equals $\mathrm{MTTF}/(\mathrm{MTTF} + \mathrm{MTTR})$, where MTTF is the mean time to failure.
Availability, achieved
The probability that an item will operate satisfactorily at a given point in time when used under stated conditions in an ideal support environment. It excludes logistics time and waiting or administrative downtime. It includes active preventive and corrective maintenance downtime.
Availability, operational
The probability that an item will operate satisfactorily at a given point in time when used in an actual or realistic operating and support environment. It includes logistics time, ready time, and waiting or administrative downtime, and both preventive and corrective maintenance downtime. This value is equal to the mean time between failures divided by the sum of the mean time between failures and the mean downtime, that is, $\mathrm{MTBF}/(\mathrm{MTBF} + \mathrm{MDT})$. This measure extends the definition of availability to elements controlled by the logisticians and mission planners, such as the quantity and proximity of spares, tools and manpower to the hardware item.
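The difference between the three definitions lies in which downtime categories are counted. A rough sketch, assuming each availability can be treated as a simple ratio of uptime to uptime plus the included downtime over one observation period (a simplification; the class and function names are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class ObservedTimes:
    """Hours accumulated over an observation period (illustrative fields)."""
    uptime: float
    corrective: float       # active corrective maintenance downtime
    preventive: float       # active preventive maintenance downtime
    logistics_admin: float  # logistics, waiting and administrative downtime


def _ratio(uptime: float, downtime: float) -> float:
    return uptime / (uptime + downtime)


def inherent_availability(t: ObservedTimes) -> float:
    """Counts corrective maintenance downtime only."""
    return _ratio(t.uptime, t.corrective)


def achieved_availability(t: ObservedTimes) -> float:
    """Counts corrective and preventive maintenance downtime."""
    return _ratio(t.uptime, t.corrective + t.preventive)


def operational_availability(t: ObservedTimes) -> float:
    """Counts all downtime, including logistics and administrative delays."""
    return _ratio(t.uptime, t.corrective + t.preventive + t.logistics_admin)


if __name__ == "__main__":
    times = ObservedTimes(uptime=9000.0, corrective=10.0,
                          preventive=40.0, logistics_admin=50.0)
    print(inherent_availability(times))
    print(achieved_availability(times))
    print(operational_availability(times))
```

Because each definition adds downtime categories to the previous one, the results are ordered: inherent is at least as high as achieved, which is at least as high as operational.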
Refer to Systems engineering for more details
Basic example
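If we are using equipment which has a mean time to failure (MTTF) of 81.5 years and a mean time to repair (MTTR) of 1 hour:
MTTF in hours = 81.5 × 365 × 24 = 713,940
Inherent availability = 713,940 / (713,940 + 1) ≈ 99.99986%
Inherent unavailability = 1 / 713,941 ≈ 0.00014%
Outage due to equipment in hours per year = failure rate × MTTR = (1/MTTF) × MTTR = (1/81.5 per year) × 1 hour ≈ 0.01227 hours per year.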
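The arithmetic of this example can be checked with a short script. It is a sketch under two assumptions: a year is taken as 365 days, and the yearly outage is computed as the inherent unavailability multiplied by the hours in a year. The variable names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

mttf_years = 81.5  # mean time to failure, from the example
mttr_hours = 1.0   # mean time to repair, from the example

mttf_hours = mttf_years * HOURS_PER_YEAR
availability = mttf_hours / (mttf_hours + mttr_hours)
unavailability = mttr_hours / (mttf_hours + mttr_hours)
outage_hours_per_year = unavailability * HOURS_PER_YEAR

print(f"MTTF in hours: {mttf_hours:.0f}")
print(f"Inherent availability: {availability:.6%}")
print(f"Inherent unavailability: {unavailability:.6%}")
print(f"Outage per year: {outage_hours_per_year:.5f} hours")
```

Because the MTTR is tiny compared with the MTTF, the yearly outage is very close to MTTR divided by the MTTF expressed in years.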
Literature
Availability is well established in the literature of stochastic modeling and optimal maintenance. Barlow and Proschan define availability of a repairable system as "the probability that the system is operating at a specified time t." Blanchard gives a qualitative definition of availability as "a measure of the degree of a system which is in the operable and committable state at the start of mission when the mission is called for at an unknown random point in time." This definition comes from MIL-STD-721. Lie, Hwang, and Tillman developed a complete survey along with a systematic classification of availability.
Availability measures are classified by either the time interval of interest or the mechanisms for the system downtime. If the time interval of interest is the primary concern, we consider instantaneous, limiting, average, and limiting average availability. These definitions are developed in Barlow and Proschan; Lie, Hwang, and Tillman; and Nachlas. The second primary classification for availability is contingent on the various mechanisms for downtime, such as inherent availability, achieved availability, and operational availability. Mi gives some comparison results of availability considering inherent availability.
Availability considered in maintenance modeling can be found in Barlow and Proschan for replacement models, Fawzi and Hawkes for an R-out-of-N system with spares and repairs, Fawzi and Hawkes for a series system with replacement and repair, Iyer for imperfect repair models, Murdock for age replacement preventive maintenance models, Nachlas for preventive maintenance models, and Wang and Pham for imperfect maintenance models. A very comprehensive recent book is by Trivedi and Bobbio.