QuorumBasedReplication
Assumption
We want to have multiple, cooperating ZEOStorageServer(s), with clients able to initiate writes at each one.
Protocol
- The ZEOClientApplication begins a transaction with its ZEOClientStorage, calling 'tpc_begin'. We will refer to the ZEOStorageServer invoked here as the ReplicationCoordinator, and to the other ZEOStorageServer(s) as ReplicationPeer(s).
- The ZEOClientApplication then issues multiple 'store' messages to its storage (one for each object modified by the transaction). For each 'store', the ReplicationCoordinator solicits a vote from each ReplicationPeer, passing the server ID, the object ID, and the new GenerationNumber.
- Each ReplicationPeer votes, replying:
  - "Abort" if the new GenerationNumber <= its own CommittedGenerationNumber.
  - "Yes" if it has no current TentativeGenerationNumber and the new GenerationNumber > its own CommittedGenerationNumber. (The peer copies the supplied GenerationNumber as its own TentativeGenerationNumber.)
  - "Yes" if it has a current TentativeGenerationNumber and the new GenerationNumber > its own TentativeGenerationNumber. (The peer copies the supplied GenerationNumber as its own TentativeGenerationNumber.)
  - "No" if it has a current TentativeGenerationNumber and the new GenerationNumber <= its own TentativeGenerationNumber.
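The voting rules above can be sketched as a small function. This is only an illustration: the `PeerObjectState` class and the `vote` function are hypothetical names, not part of ZEO, and checking the "Abort" rule before the others is an assumption, since the rules do not say which reply wins when more than one applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerObjectState:
    """Hypothetical per-object bookkeeping held by one ReplicationPeer."""
    committed: int = 0               # CommittedGenerationNumber
    tentative: Optional[int] = None  # TentativeGenerationNumber, None if unset


def vote(state: PeerObjectState, new_generation: int) -> str:
    """Return "Abort", "Yes" or "No" for a proposed GenerationNumber."""
    # Assumption: "Abort" takes precedence over the other replies.
    if new_generation <= state.committed:
        return "Abort"
    # No tentative value yet, or the proposal beats the current one:
    # record it as the new tentative generation and vote "Yes".
    if state.tentative is None or new_generation > state.tentative:
        state.tentative = new_generation
        return "Yes"
    # A tentative value exists and the proposal does not exceed it.
    return "No"
```

A peer that has committed generation 5 would therefore answer "Abort" to a proposal of 5, "Yes" to 7, and then "No" to 6 while 7 is still tentative.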
- The ReplicationCoordinator tallies these votes:
  - Upon receiving any "Abort" reply, it:
    - broadcasts an "Abort Tentative" message to all peers, supplying its server ID, the object ID, and the now-cancelled TentativeGenerationNumber;
    - clears its own TentativeGenerationNumber;
    - enqueues a read request for that object from the replying peer;
    - raises a ConflictError.
  - Upon failing to achieve a majority, either through the receipt of explicit "No" votes or through timeout, it:
    - broadcasts an "Abort Tentative" message to all peers, supplying its server ID, the object ID, and the now-cancelled TentativeGenerationNumber;
    - clears its own TentativeGenerationNumber;
    - raises a ConflictError.
  - Upon achieving a majority of "Yes" votes, it returns normally to the client.
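The tally can be sketched as follows. The `tally` function and its return values are hypothetical and only describe the follow-up actions as strings. Two assumptions are made that the text does not settle: a timed-out peer is represented by `None`, and "majority" means a strict majority of the peers polled (the coordinator does not count itself).

```python
from typing import Dict, List, Optional, Tuple


def tally(replies: Dict[str, Optional[str]]) -> Tuple[str, List[str]]:
    """Map peer replies to an outcome and the coordinator's follow-up actions.

    `replies` maps a peer ID to "Yes", "No", "Abort", or None (timed out).
    """
    aborting = [peer for peer, reply in replies.items() if reply == "Abort"]
    if aborting:
        # A peer has already committed this generation or a newer one.
        return "conflict", [
            "broadcast Abort Tentative",
            "clear own TentativeGenerationNumber",
            "enqueue read from " + aborting[0],
        ]
    yes_votes = sum(1 for reply in replies.values() if reply == "Yes")
    # Assumption: a strict majority of the polled peers is required.
    if 2 * yes_votes > len(replies):
        return "ok", []
    # Too many "No" votes or timeouts: cancel the tentative write.
    return "conflict", [
        "broadcast Abort Tentative",
        "clear own TentativeGenerationNumber",
    ]
```

In a real coordinator the "conflict" outcome would end with raising a ConflictError to the client; the sketch returns a label instead so that it stays free of ZODB imports.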
- After successfully completing all 'store' requests, the client invokes 'tpc_vote' on the ReplicationCoordinator, which then marks all stored objects as committed (i.e., TentativeGenerationNumber -> CommittedGenerationNumber) and enqueues updates for each object.
  - Alternate scenario: the client chooses to abort the transaction, invoking 'tpc_abort'; the ReplicationCoordinator then broadcasts "Abort Tentative" messages for each object, clearing each TentativeGenerationNumber.
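The commit and abort paths might look like this on the coordinator side. `CoordinatorTransaction` is a toy class invented for illustration; it borrows the method names 'tpc_vote' and 'tpc_abort' from the text but is not ZEO's storage API, and the `outbox` list merely stands in for whatever queue carries messages to the peers.

```python
from typing import Dict, List, Tuple


class CoordinatorTransaction:
    """Toy model of the coordinator's state for one transaction (not ZEO's API)."""

    def __init__(self, committed: Dict[str, int]) -> None:
        self.committed = committed                    # oid -> CommittedGenerationNumber
        self.tentative: Dict[str, int] = {}           # oid -> TentativeGenerationNumber
        self.outbox: List[Tuple[str, str, int]] = []  # messages queued for peers

    def store(self, oid: str, generation: int) -> None:
        # Assumes the vote for this object already reached a "Yes" majority.
        self.tentative[oid] = generation

    def tpc_vote(self) -> None:
        # Promote every tentative generation and queue an update for the peers.
        for oid, generation in self.tentative.items():
            self.committed[oid] = generation
            self.outbox.append(("Update Object", oid, generation))
        self.tentative.clear()

    def tpc_abort(self) -> None:
        # Cancel every tentative generation and tell the peers to do the same.
        for oid, generation in self.tentative.items():
            self.outbox.append(("Abort Tentative", oid, generation))
        self.tentative.clear()
```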
- As each ReplicationPeer receives "Update Object" messages, it:
  - updates its CommittedGenerationNumber, and the object itself, IFF the new GenerationNumber > its existing CommittedGenerationNumber.
  - clears its TentativeGenerationNumber, IFF the new GenerationNumber > its TentativeGenerationNumber.
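A peer's handling of an "Update Object" message can be sketched directly from the two rules above. The `PeerObject` class and `apply_update` function are hypothetical names; both comparisons are kept strict, exactly as stated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerObject:
    """Hypothetical record a ReplicationPeer keeps for one object."""
    committed: int = 0               # CommittedGenerationNumber
    tentative: Optional[int] = None  # TentativeGenerationNumber, None if unset
    data: bytes = b""                # the object's current state


def apply_update(obj: PeerObject, new_generation: int, new_data: bytes) -> None:
    """Apply one "Update Object" message to a peer's copy of the object."""
    # Only a strictly newer generation replaces the committed state.
    if new_generation > obj.committed:
        obj.committed = new_generation
        obj.data = new_data
    # Only a strictly newer generation clears a pending tentative value.
    if obj.tentative is not None and new_generation > obj.tentative:
        obj.tentative = None
```

Under these strict comparisons, an update whose generation equals the peer's TentativeGenerationNumber updates the committed state but leaves the tentative value in place.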