Chandy–Lamport algorithm
The Chandy–Lamport algorithm is a snapshot algorithm that is used in distributed systems for recording a consistent global state of an asynchronous system. It was developed by and named after Leslie Lamport and K. Mani Chandy.
History
According to Leslie Lamport's website, the snapshot algorithm was described when he visited Chandy, who was at the University of Texas (Austin). Chandy posed the problem over dinner, but they had had too much wine to think about it. The next morning, while Lamport was in the shower, he came up with the solution. When he arrived at Chandy's office, he was waiting for him with the same solution. Lamport considers the algorithm to be a straightforward application of the basic ideas in his article Time, Clocks and the Ordering of Events in a Distributed System.Definition
The assumptions of the algorithm are as follows:- There are no failures and all messages arrive intact and only once
- The communication channels are unidirectional and FIFO ordered
- There is a communication path between any two processes in the system
- Any process may initiate the snapshot algorithm
- The snapshot algorithm does not interfere with the normal execution of the processes
- Each process in the system records its local state and the state of its incoming channels
Some of the assumptions of the algorithm can be facilitated using a more reliable communication protocol such as TCP/IP. The algorithm can be adapted so that there could be multiple snapshots occurring simultaneously.
Algorithm
The Chandy–Lamport algorithm works like this:- The observer process :
- # Saves its own local state
- # Sends a snapshot request message bearing a snapshot token to all other processes
- A process receiving the snapshot token for the first time on any message:
- # Sends the observer process its own saved state
- # Attaches the snapshot token to all subsequent messages
- When a process that has already received the snapshot token receives a message that does not bear the snapshot token, this process will forward that message to the observer process. This message was obviously sent before the snapshot “cut off” and needs to be included in the snapshot.