Failure detectors are components of distributed computing systems that detect node failures or crashes. Introduced by Chandra and Toueg in 1996, they are used to improve reliability and to support protocols such as atomic broadcast. Once a process is suspected of having failed, the system can exclude it so that the fault does not lead to further errors or crashes.
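As a rough illustration of the detection step, the sketch below shows one common approach: a heartbeat detector that suspects any node that has been silent for longer than a timeout. The class name, method names, and timeout value are hypothetical and chosen only for this example. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatFailureDetector:
    """Timeout-based failure detector sketch (all names are illustrative)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a node is suspected
        self.last_seen = {}      # node id -> time of its most recent heartbeat

    def heartbeat(self, node, now):
        """Record that `node` was heard from at time `now`."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Return the set of nodes silent for longer than the timeout."""
        return {
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        }


detector = HeartbeatFailureDetector(timeout=3.0)
detector.heartbeat("a", now=0.0)
detector.heartbeat("b", now=0.0)
detector.heartbeat("a", now=4.0)   # "a" keeps reporting, "b" goes quiet

suspects = detector.suspected(now=5.0)
```

Here `suspects` contains only `"b"`. A detector of this kind can be wrong: a slow node is indistinguishable from a crashed one, so a later heartbeat from `"b"` would clear the suspicion. How long to wait before suspecting a node is a trade-off between detection speed and false suspicions.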
UC Berkeley
Winter 2013
This course provides the basic theoretical and practical foundations of distributed systems. Students learn about system models, the safety and liveness of protocols, different failure models, reliable group communication abstractions, and more. The course uses a textbook, supplemented by lectures based on research papers.