The server crashes halfway through a core process: how to deal with it

This raises three problems at once:

1. Troubleshooting and fast recovery
2. Repairing abnormal data
3. Keeping the service highly available to avoid downtime

 

Restore the business first

  When a server is found to be down, the most important thing is to restore the business, not to repair the server. This requires an emergency plan prepared in advance. It is best to keep two web servers that store the same content but have different IP addresses and sit in machine rooms in different geographical locations. That way, as soon as the outage is discovered, the domain name record can be updated to point to the server that is still working. The chance of both hosts going down at the same time is also much lower.
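
The failover decision described above can be sketched roughly as follows. This is a minimal illustration, not a complete emergency plan: the two IP addresses come from the reserved documentation ranges, `pick_healthy`, `tcp_probe` and `fake_probe` are hypothetical names, and the actual DNS update is left out because it depends on the DNS provider.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_probe(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_healthy(servers: Sequence[str],
                 is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server that passes the health check, or None."""
    for server in servers:
        try:
            if is_healthy(server):
                return server
        except Exception:
            # A probe that raises is treated as a failed check.
            continue
    return None


# Hypothetical addresses (documentation ranges), primary first.
PRIMARY = "192.0.2.10"
STANDBY = "198.51.100.20"


def fake_probe(host: str) -> bool:
    """Stand-in probe for the example: pretend the primary is down."""
    return host != PRIMARY


# In production, tcp_probe (or an HTTP check) would replace fake_probe.
target = pick_healthy([PRIMARY, STANDBY], fake_probe)
```

Here `target` is the address the domain name record should point to. A result of `None` means neither server answered, and the problem has to be escalated instead.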

 

Locating the cause of a server crash

1. Memory overflow or exhausted disk space
2. Resource exhaustion caused by thread deadlock, too many processes, or processes being created continuously
3. Slow database queries, too many connections, insufficient temporary table space, or deadlocks in the program
4. Inconsistent data between primary and standby
5. Application exceptions
6. Excessive traffic load
7. DDoS attack
8. Heat dissipation problems
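
Some of the causes above, such as a full disk or a runaway thread count, can be caught before they bring the server down. The sketch below checks them with the Python standard library only. The function name `resource_warnings` and the thresholds are illustrative assumptions, and the thread count covers only the current Python process.

```python
import shutil
import threading
from typing import List


def resource_warnings(path: str = ".",
                      max_disk_used: float = 0.90,
                      max_threads: int = 500) -> List[str]:
    """Return human-readable warnings for resources that exceed a threshold."""
    warnings = []

    # Fraction of the disk holding `path` that is already in use.
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    if used_fraction > max_disk_used:
        warnings.append(f"disk usage at {used_fraction:.0%}")

    # Number of live threads in this Python process.
    thread_count = threading.active_count()
    if thread_count > max_threads:
        warnings.append(f"{thread_count} threads alive")

    return warnings


current = resource_warnings()
```

Running such a check on a schedule and writing the result to the log gives a starting point when the cause of a crash has to be located afterwards.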

 

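Repairing abnormal data

1. Wrap data writes in transactions so that a crash cannot leave partial writes behind.
2. Back up data to disk and restore it when the service restarts.
3. Log the key steps so that interrupted operations can be identified and repaired.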
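
To illustrate the first point, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` function and the `fail_midway` flag are invented for the example. The exception stands in for a crash in the middle of the core process: because both updates run inside one transaction, the first update is rolled back and no half-finished state is left in the database.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int,
             fail_midway: bool = False) -> None:
    """Move `amount` from src to dst inside a single transaction."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        if fail_midway:
            # Simulated failure halfway through the process.
            raise RuntimeError("simulated crash halfway through")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

try:
    transfer(conn, "alice", "bob", 30, fail_midway=True)
except RuntimeError:
    pass  # the transaction has already been rolled back

# Balances are unchanged: {"alice": 100, "bob": 0}
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

A real server crash kills the process rather than raising an exception, but the principle is the same: a transaction that was never committed does not survive, so the data is either fully updated or not updated at all.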

 

High service availability

1. Deploy multiple service instances as a cluster behind a load balancer, and put service degradation and rate limiting in place.
2. Use database read-write separation, and split (shard) databases and tables.
3. Run performance tests and stress tests on the service. (How to avoid the risk of server downtime: https://wetest.qq.com/lab/view/310.html?from=content_SegmentFault )
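
Rate limiting, mentioned in the first point, is often implemented with a token bucket. The sketch below is one possible version, not the only approach: the class name `TokenBucket` and its parameters are assumptions for illustration, and it is not thread-safe as written.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill according to the time elapsed, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: at most 100 requests per second, with bursts of up to 20.
limiter = TokenBucket(rate=100, capacity=20)
first_request_allowed = limiter.allow()
```

Requests for which `allow()` returns `False` can be rejected immediately or routed to a degraded response, so excess traffic is turned away instead of exhausting the server.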


Origin blog.csdn.net/Anenan/article/details/114263726