Fault tolerance: A means to provide reliable computing system

Lutful Karim, Mohammad Shorfuzzaman

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Fault-tolerant computer architecture is essential to achieving high reliability in today's computing systems. This paper focuses on implementations of fault tolerance in different computing systems, specifically disk arrays, routing, group communications, and fail-stop processors. It begins by describing the types and phases of fault tolerance and by introducing redundancy, a central concern in fault tolerance. The ideas described relate to techniques used in different systems to achieve reliable, fault-free operation. The main goal of the paper is to present what is currently known about fault-tolerant computing systems and their recent advancements, and to provide an impetus for research into new fault-tolerant architectures.

Original language: English
Title of host publication: WMSCI 2005 - The 9th World Multi-Conference on Systemics, Cybernetics and Informatics, Proceedings
Pages: 35-40
Number of pages: 6
State: Published - 2005
Externally published: Yes
Event: 9th World Multi-Conference on Systemics, Cybernetics and Informatics, WMSCI 2005 - Orlando, FL, United States
Duration: 10 Jul 2005 to 13 Jul 2005

Publication series

Name: WMSCI 2005 - The 9th World Multi-Conference on Systemics, Cybernetics and Informatics, Proceedings
Volume: 4

Conference

Conference: 9th World Multi-Conference on Systemics, Cybernetics and Informatics, WMSCI 2005
Country/Territory: United States
City: Orlando, FL
Period: 10/07/05 to 13/07/05

Keywords

  • Compressionless
  • Error recovery
  • Fault tolerance
  • Pre-fetching
  • Redundancy
  • Routing
