Fault Tolerance Essay

479 Words2 Pages
DATA STRUCTURES What is Fault-tolerant system Fault-tolerance or graceful degradation is the property that enables a system (often computer-based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. Fault-tolerance is particularly sought-after in high-availability or life-critical systems. Fault tolerance requirements: The basic characteristics of fault tolerance require: • No single point of repair • Fault isolation to the failing component • Fault containment to prevent propagation of the failure • Availability of reversion modes Fault-tolerance by replication Spare components address the first fundamental characteristic of fault-tolerance in three ways: • Replication: Providing multiple identical instances of the same system or subsystem, directing tasks or requests to all of them in parallel, and choosing the correct result on the basis of a quorum; • Redundancy: Providing multiple identical instances of the same system and switching to one of the remaining instances in case of a failure. • Diversity: Providing multiple different implementations of the same specification, and using them like replicated systems to cope with errors in a specific implementation. Fault-tolerant design In engineering, fault-tolerant design, also known as fail-safe design, is a design that enables a system to continue operation, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails. When to use Providing fault-tolerant design for every component is normally not an option. In such cases the following criteria may be used to determine which components should be fault-tolerant: • How critical is the component? In a car, the radio is not critical, so this component has less need for fault-tolerance. • How likely is the
Open Document