Failure Detectors

Failure detectors are components of distributed systems that detect node failures or crashes. They were introduced in 1996 by Chandra and Toueg as a tool for making consensus and atomic broadcast more reliable. Once a detector suspects that a process has failed, the system can exclude that process to prevent further crashes or errors.
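
As a rough illustration of the detection step described above, here is a minimal sketch of a heartbeat-based failure detector. The timeout mechanism, the class name HeartbeatFailureDetector, and its method names are all assumptions made for this example; the description above does not say how detection is implemented.

```python
class HeartbeatFailureDetector:
    """Hypothetical sketch: suspect a process when its heartbeat is overdue."""

    def __init__(self, processes, timeout, now=0.0):
        # `timeout` is the longest silence (in seconds) tolerated per process.
        self.timeout = timeout
        # Assume every process was heard from at the start time.
        self.last_heartbeat = {p: now for p in processes}

    def heartbeat(self, process, now):
        # Record the time at which `process` last reported in.
        self.last_heartbeat[process] = now

    def suspected(self, now):
        # A process is suspected if it has been silent for longer than `timeout`.
        return {
            p for p, seen in self.last_heartbeat.items()
            if now - seen > self.timeout
        }

    def alive(self, now):
        # Processes the system would keep; suspected ones are excluded.
        return set(self.last_heartbeat) - self.suspected(now)


detector = HeartbeatFailureDetector(["a", "b", "c"], timeout=3.0)
detector.heartbeat("a", now=2.0)
detector.heartbeat("b", now=2.5)
print(sorted(detector.suspected(now=4.0)))  # prints ['c']
```

A suspicion in this sketch is only a guess: a slow process that sends a late heartbeat is removed from the suspected set on the next query, so a detector built this way can make mistakes and later correct them.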

1 course covers this concept

CS 294-91 Distributed Computing

UC Berkeley

Winter 2013

This course provides the basic theoretical and practical foundations of distributed systems. Students learn about system models, the safety and liveness of protocols, different failure models, reliable group communication abstractions, and more. It draws on a textbook, supplemented by lectures based on research papers.
