Multiple data centers, or Zones, allow customers to take advantage of the fault isolation provided by separate physical data centers to improve the availability of a store, because each Zone holds a copy of the complete store, including a copy of all the shards. With this configuration, when a zone fails, the ability to write is re-established automatically as long as quorum is maintained, because a normal master election can still be held. However, if quorum is lost because of either total zone failure or planned shutdown, the Failover and Switchover feature can be used.
A Failover is typically performed when the primary zone fails or has become unreachable; one of the secondary zones is then transitioned to take over the primary role.
A Switchover can be used after performing a failover (to restore the original configuration) or for planned maintenance of the primary system. It can be thought of as a role reversal between a primary zone and one of the secondary zones of the store. Switchover requires quorum and guarantees no data loss.
For example, suppose a store consists of two primary zones, "Manhattan" and "JerseyCity", each deployed in its own physical data center. Additionally, suppose that the "Manhattan" zone fails, resulting in the failure of all of the associated Storage Nodes and a loss of quorum. In this case, if the host hardware of "Manhattan" is irreparably damaged, or the problem will take too long to repair, you may choose to initiate a failover. Refer to our admin guide to learn how to perform a failover to the JerseyCity data center and, once the fault is repaired, how to switch over to the Manhattan data center.
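To make the loss of quorum in this example concrete, here is a minimal Python sketch. It assumes that quorum means a simple majority of the replicas hosted in primary zones, and the per-zone replication factors (2 for "Manhattan", 1 for "JerseyCity") are hypothetical values chosen for illustration; neither the function names nor the numbers come from the product itself.

```python
# Hypothetical replication factors per primary zone (illustration only).
PRIMARY_ZONES = {"Manhattan": 2, "JerseyCity": 1}


def quorum_size(primary_rf_by_zone):
    """Return the simple-majority quorum over all primary-zone replicas."""
    total_replicas = sum(primary_rf_by_zone.values())
    return total_replicas // 2 + 1


def has_quorum(primary_rf_by_zone, failed_zones):
    """Check whether the replicas in surviving primary zones still form a quorum."""
    surviving = sum(
        rf for zone, rf in primary_rf_by_zone.items() if zone not in failed_zones
    )
    return surviving >= quorum_size(primary_rf_by_zone)


print(quorum_size(PRIMARY_ZONES))                 # 2
print(has_quorum(PRIMARY_ZONES, {"Manhattan"}))   # False: failover is needed
print(has_quorum(PRIMARY_ZONES, {"JerseyCity"}))  # True: writes continue
```

Under these assumed values, losing "Manhattan" leaves one replica out of three, which is below the majority of two, so no master election can succeed until a failover redefines the set of primary zones.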
Failover interfaces:
Diagnose failure: ping or verify configuration
Disable failed zones: disable-services
Repair admin: repair-admin-quorum
Failover to remaining zones: plan failover
Switchover interfaces:
Repair topology (after failover): plan repair-topology
Wait for consistency (optional): await-consistency
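The two interface lists above can be read as ordered checklists. The following Python sketch simply encodes them as data and prints numbered steps. The step descriptions and command names are copied from the lists; the data layout and the `render_checklist` helper are hypothetical, and command arguments are deliberately omitted because the exact syntax is documented in the admin guide, not here.

```python
# Step descriptions and command names are taken from the lists above.
# Arguments are intentionally left out; consult the admin guide for syntax.
FAILOVER_STEPS = [
    ("Diagnose failure", "ping or verify configuration"),
    ("Disable failed zones", "disable-services"),
    ("Repair admin", "repair-admin-quorum"),
    ("Failover to remaining zones", "plan failover"),
]

SWITCHOVER_STEPS = [
    ("Repair topology (after failover)", "plan repair-topology"),
    ("Wait for consistency (optional)", "await-consistency"),
]


def render_checklist(title, steps):
    """Return the checklist as a list of text lines, numbered in order."""
    lines = [title]
    for number, (description, command) in enumerate(steps, start=1):
        lines.append(f"  {number}. {description}: {command}")
    return lines


for line in render_checklist("Failover", FAILOVER_STEPS):
    print(line)
for line in render_checklist("Switchover", SWITCHOVER_STEPS):
    print(line)
```

Keeping the steps in ordered lists rather than a mapping preserves the sequence, which matters here: for instance, failed zones are disabled before the failover plan is run.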