Stratus VOS - Fault Tolerance

Fault tolerance is built into VOS from the bottom up. At the hardware level, major devices run in lock-stepped duplex mode: two identical devices perform the same action at the same time. (Each device, or board, is also duplexed internally so that board failures can be identified in hardware, which is why Stratus hardware is described as "lock-stepped".) The operating system actively monitors these boards and can correct minor inconsistencies, such as bad disk reads or writes. Any board that reports an unacceptable number of faults is removed from service by the system, and its duplexed partner continues operating until the problem is resolved via a hot-fix. This applies to CPUs, disk drives, and any other device that can logically be duplexed (which, by definition, excludes communications devices). The system continues processing as normal and automatically raises a fault ticket with Stratus Customer Service via the Remote Service Network (RSN). Stratus Customer Service then dials into the system over RSN to investigate the problem and dispatch replacement parts.
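The retire-on-too-many-faults behaviour described above can be illustrated with a small model. This is only a sketch under stated assumptions, not VOS code: the class names, the threshold of three faults, and the `tickets` list (standing in for reports sent over RSN) are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Board:
    """One half of a duplexed pair (hypothetical model, not a VOS structure)."""
    name: str
    faults: int = 0
    in_service: bool = True


class DuplexedPair:
    """Two boards doing the same work; a partner with too many faults is retired."""

    def __init__(self, a, b, max_faults=3):
        self.boards = [a, b]
        self.max_faults = max_faults  # illustrative threshold (an assumption)
        self.tickets = []             # stands in for fault tickets raised via RSN

    def report_fault(self, board):
        """Count a fault; retire the board once it exceeds the threshold."""
        board.faults += 1
        if board.in_service and board.faults > self.max_faults:
            board.in_service = False
            self.tickets.append(f"{board.name} removed from service")

    def is_simplexed(self):
        """True when exactly one partner is still carrying the work."""
        return sum(b.in_service for b in self.boards) == 1

    def is_running(self):
        """The pair keeps working as long as either partner is in service."""
        return any(b.in_service for b in self.boards)


pair = DuplexedPair(Board("cpu-a"), Board("cpu-b"))
for _ in range(4):
    pair.report_fault(pair.boards[0])
print(pair.is_running(), pair.is_simplexed(), pair.tickets)
```

In this model the pair keeps running after one board is retired, but it is then simplexed: a further failure of the surviving partner would no longer be masked until the retired board is replaced.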

Typically, a modern Stratus ftServer or Continuum system can expect 99.999% uptime. This does not mean that applications running on these systems will achieve that level of uptime, only that the operating system will not crash because of the failure of a single hardware component.
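To put the 99.999% figure in perspective, an availability percentage can be converted into an annual downtime budget. The snippet below is plain arithmetic rather than anything specific to Stratus; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Return the minutes of downtime per year implied by an availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


print(round(downtime_minutes_per_year(99.999), 2))  # about 5.26 minutes
```

In other words, 99.999% uptime leaves a little over five minutes of downtime per year.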
