Remote disaster recovery with multiple Cassandra data centers

Cassandra is deployed as a cluster of multiple nodes holding multiple replicas of the data. If one or two nodes go down, data is generally not lost: once the failed node is removed and the remaining nodes are repaired, the data is automatically rebalanced back to the full number of replicas.

However, if a large group of nodes loses power at once, or the server room catches fire, data can be lost for good. A business that uses Cassandra for storage is usually a large one with a large volume of data, so disaster recovery across server rooms is essential.

Cassandra offers several snitches for multi-data-center, rack-aware deployments. This article covers the most common one: GossipingPropertyFileSnitch.

GossipingPropertyFileSnitch supports multiple data centers and multiple racks with simple configuration.

Step 1:

Select this snitch in the cassandra.yaml configuration file:

endpoint_snitch: GossipingPropertyFileSnitch

Step 2:

List seed nodes from both data centers in the seeds setting of cassandra.yaml:

- seeds: "192.168.22.101,192.168.22.102,192.168.23.101,192.168.23.102"

Step 3:

Configure cassandra-rackdc.properties. Each machine sets its own data center name and rack name:

dc=DC1
rack=RAC1

The purpose of configuring racks is to avoid losing data when an entire rack loses power: Cassandra tries to ensure that replicas of the same data are not placed on the same rack.

For this to work well, the number of racks should be roughly equal to the number of replicas, and each rack should hold as close to the same number of nodes as possible. Otherwise some nodes end up with too much data and the distribution becomes uneven.
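
To see why uneven racks skew the load, here is a rough back-of-the-envelope sketch in Python. It assumes the number of racks equals the replication factor, so that each rack ends up holding one full copy of the data, and that data is spread evenly among the nodes inside a rack. The function name and the rack layouts are made up for illustration.

```python
from fractions import Fraction


def per_node_share(nodes_per_rack):
    """Approximate fraction of the data set stored by each node in each rack.

    Assumption: number of racks == replication factor, so every rack holds
    one full copy of the data, split evenly among that rack's nodes.
    """
    return {rack: Fraction(1, count) for rack, count in nodes_per_rack.items()}


# Hypothetical layouts: three racks each, replication factor 3.
balanced = per_node_share({"RAC1": 3, "RAC2": 3, "RAC3": 3})
skewed = per_node_share({"RAC1": 1, "RAC2": 4, "RAC3": 4})

for name, shares in (("balanced", balanced), ("skewed", skewed)):
    for rack, share in shares.items():
        print(f"{name:8} {rack}: each node stores about {float(share):.0%} of the data")
```

Under these assumptions, the single node in RAC1 of the skewed layout carries a full copy of the data, four times what each node in the other two racks stores.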

Step 4:

Create the keyspace with NetworkTopologyStrategy and specify the number of replicas per data center:

CREATE KEYSPACE mykeyspace WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': '3',
  'DC2': '3'
};

Step 5:

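On the client, change the consistency level from QUORUM to LOCAL_QUORUM. Requests are then served by replicas in the LOCAL data center without waiting on the REMOTE one. Whether the client falls back to the REMOTE data center when LOCAL nodes are unavailable depends on the driver's load-balancing policy rather than on the consistency level itself.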
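
The difference between the two levels is easiest to see with the numbers from the keyspace in step 4. The sketch below is plain Python arithmetic, not driver code; the helper names are made up for illustration. It uses the usual quorum formula, floor(n / 2) + 1, applied either to all replicas or to the local data center's replicas only.

```python
def quorum(replicas):
    """Replicas that must respond for a quorum: floor(n / 2) + 1."""
    return replicas // 2 + 1


# Replication factors taken from the CREATE KEYSPACE statement in step 4.
replication = {"DC1": 3, "DC2": 3}


def required_acks(level, local_dc):
    """Responses the coordinator waits for at the given consistency level."""
    if level == "QUORUM":
        # Counted across the replicas of all data centers.
        return quorum(sum(replication.values()))
    if level == "LOCAL_QUORUM":
        # Counted among the replicas of the local data center only.
        return quorum(replication[local_dc])
    raise ValueError(f"unsupported level: {level}")


print("QUORUM needs", required_acks("QUORUM", "DC1"), "of 6 replicas")
print("LOCAL_QUORUM needs", required_acks("LOCAL_QUORUM", "DC1"), "of 3 local replicas")

# If DC2 is lost entirely, only the 3 replicas in DC1 can answer.
alive = replication["DC1"]
print("QUORUM still possible:", alive >= required_acks("QUORUM", "DC1"))
print("LOCAL_QUORUM still possible:", alive >= required_acks("LOCAL_QUORUM", "DC1"))
```

With three replicas in each data center, QUORUM needs four responses and so cannot be met once a whole data center is gone, while LOCAL_QUORUM needs only two local responses.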

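By default, the official Cassandra Java driver treats the data center of the first node it connects to as the LOCAL data center, so do not put the IP addresses of nodes in the disaster-recovery data center into your client code.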
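
One way to keep disaster-recovery addresses out of the client is to derive the contact points from an inventory instead of hard-coding them. The following is a minimal Python sketch of that idea, not driver code. The inventory, the mapping of the 192.168.22.x addresses to DC1 and the 192.168.23.x addresses to DC2, and the function name are all assumptions made for illustration.

```python
# Hypothetical inventory: node address -> data center name.
# Which subnet belongs to which data center is assumed here.
NODES = {
    "192.168.22.101": "DC1",
    "192.168.22.102": "DC1",
    "192.168.23.101": "DC2",
    "192.168.23.102": "DC2",
}


def contact_points(nodes, local_dc):
    """Return only the addresses that belong to the application's local data center."""
    points = sorted(addr for addr, dc in nodes.items() if dc == local_dc)
    if not points:
        raise ValueError(f"no nodes found in data center {local_dc!r}")
    return points


# An application running next to DC1 would hand only these addresses to its driver.
print(contact_points(NODES, "DC1"))
```

The resulting list is what would be passed to the driver as its initial contact points, so the first node it reaches is always in the intended local data center.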
