Detailed explanation of high concurrency (1)

High concurrency

High concurrency means that many users are hitting the same URLs at the same moment, as happens during Taobao's Double 11 and Double 12 sales. There is also malicious high concurrency, such as the request floods that once brought Tieba down, which is a DDoS attack. Or, in gamer terms: it is like playing League of Legends and getting focus-fired by the enemy ADC. It hurts, you know?

Consequences of high concurrency

  • Server side:
The site server and DB server resources are exhausted and the servers crash, or the stored and updated data differs from the intended design: duplicate records appear, a user's points get added multiple times, and so on.
  • User's point of view:
"Ugh, it's so laggy. I came for the event, refreshed, and nothing changed. I'm never visiting this junk site again."
  • My experience:
While building the company's product websites, such requirements come up constantly: a special event, a lottery, sign-in, a points auction, and so on. If data handling under high concurrency is not considered, it is game over: too many lottery prizes get drawn, a single sign-in leaves multiple records, one sign-in awards points several times, and all kinds of other behavior beyond normal logic. Product websites must consider this problem, because they face a large number of users, unlike ERP or OA systems that serve only employees.

Below I walk through some examples: simple, blunt, and drawn purely from my personal experience. If I am wrong, or you have better suggestions or opinions, please leave a comment and we can all grow together.

Data processing under concurrency:

  • Through table design, e.g. adding unique constraints to the record table
  • By wrapping the data-processing logic in transactions to prevent data corruption under concurrency
  • By locking on the server side to serialize the critical section

This section focuses on data-logic interfaces under concurrent requests: how to guarantee data consistency and integrity. The concurrency here may come from a large number of real users, or from concurrent requests launched by attackers with stress tools.


Example 1: preventing data confusion under concurrency through table design

  • Requirement:
    [Sign-in function] A user may sign in only once per day,
    and earns one point for each successful sign-in
  • Known tables:
    the user table, which contains a points field
  • High-concurrency analysis (pre-development guess):
    Under high concurrency, a user may end up with multiple check-in records, or gain more than one point per check-in.
  • My design:
    Following the requirements, I first add a check-in record table. The key point is that this table uses the user's unique identifier (ID or token) plus the check-in date as a unique constraint or unique index, so that concurrent requests cannot insert duplicate check-in records for the same user. In the program logic, insert the check-in record first (the unique constraint rejects the duplicates that concurrency produces), and add the points only after that insert succeeds, which prevents points from being added repeatedly. Finally, I still recommend wrapping all of these data operations in a single SQL transaction, so that everything can be rolled back if the insert or the points update fails.
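
The sign-in design above can be sketched with Python's built-in sqlite3 module. The table and column names (sign_in_log, user_points) are my own inventions for illustration, not from the original system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_points (user_id INTEGER PRIMARY KEY, points INTEGER NOT NULL);
    CREATE TABLE sign_in_log (
        user_id   INTEGER NOT NULL,
        sign_date TEXT NOT NULL,
        UNIQUE (user_id, sign_date)  -- the unique constraint that blocks duplicates
    );
    INSERT INTO user_points VALUES (1, 0);
""")

def sign_in(user_id, day):
    """Insert the check-in record first; add the point only if the insert succeeds.
    Both statements run in one transaction so a failure rolls everything back."""
    try:
        with conn:  # 'with' commits on success, rolls back on exception
            conn.execute("INSERT INTO sign_in_log VALUES (?, ?)", (user_id, day))
            conn.execute("UPDATE user_points SET points = points + 1 WHERE user_id = ?",
                         (user_id,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate sign-in rejected by the unique constraint

sign_in(1, "2024-01-01")
sign_in(1, "2024-01-01")  # second attempt on the same day fails
```

Even if two concurrent requests pass the application-level check at the same time, the database constraint guarantees only one insert (and therefore only one point) succeeds.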

Example 2: transaction + update lock to prevent data confusion under concurrency

  • Requirement:
    [Lottery function] Each draw costs one point, and after a winning draw the remaining prize count is decremented. A draw cannot proceed when the remaining prize count is 0 or the user's points are 0.
  • Known tables:
    the user table, with a points field; the prize table, with a remaining-count field
  • High-concurrency analysis (pre-development guess):
    Under high concurrency, users' points could still be deducted for draws even after all the prizes have already been given out.
  • My design:
Inside a transaction, lock the prize row with WITH (UPDLOCK), or lock the data row by first UPDATE-ing the remaining count and a last-edited-time field; then deduct the user's points; commit the transaction on success and roll back on failure. This guarantees that only one operation touches the stock of this prize at a time, and other transactions on the same prize row proceed only after this one commits.
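
A rough sketch of the lottery flow. WITH (UPDLOCK) is SQL Server syntax; as a portable stand-in, this sketch uses a conditional UPDATE whose affected-row count decides whether the draw proceeds, which gives the same guarantee that stock can never go negative. All table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prize (id INTEGER PRIMARY KEY, remaining INTEGER NOT NULL);
    CREATE TABLE user_points (user_id INTEGER PRIMARY KEY, points INTEGER NOT NULL);
    INSERT INTO prize VALUES (1, 1);   -- only one prize left in stock
    INSERT INTO user_points VALUES (1, 5);
""")

def draw(user_id, prize_id):
    try:
        with conn:  # one transaction: commit on success, rollback on exception
            # Decrement stock only if any remains; under concurrency, at most
            # `remaining` such updates can ever succeed.
            cur = conn.execute(
                "UPDATE prize SET remaining = remaining - 1 "
                "WHERE id = ? AND remaining > 0", (prize_id,))
            if cur.rowcount == 0:
                return False  # out of stock; nothing was changed
            cur = conn.execute(
                "UPDATE user_points SET points = points - 1 "
                "WHERE user_id = ? AND points > 0", (user_id,))
            if cur.rowcount == 0:
                raise sqlite3.IntegrityError("no points")  # triggers rollback
            return True
    except sqlite3.IntegrityError:
        return False  # stock decrement was rolled back along with it

draw(1, 1)  # succeeds: stock 1 -> 0, points 5 -> 4
```

The row count check replaces the explicit lock: the database serializes the two competing UPDATEs, so whichever request loses simply sees zero affected rows.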

Example 3: preventing data confusion under concurrency through program code

  • Requirement:
    [Cached data] When the data is not in the cache, fetch it from the database and store it in the cache. The cache must be refreshed once a day at 10:00, and every two hours at other times. At 10 o'clock, any open page automatically refreshes itself.
  • Problem:
    The cache refresh is triggered by user requests. When a user refreshes the page and the cache exists, the code checks the last cache-update time; if the current time is past ten o'clock and the last update was before ten, it re-fetches the data from the database and stores it in the cache. On top of that, the client page uses js to refresh itself at 10 o'clock. Because of this logic, many concurrent requests arrive at exactly 10:00 and each of them fires the expensive SQL query, so the database server's load spikes at that moment. The ideal behavior is that only one request goes to the database while all the others read from the cache.
  • Solution:
    In C#, put a lock around the code that reads from the database into the cache, so that under concurrency only one request fetches from the database and the rest read from the cache.
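
A sketch of the same idea in Python (the original uses C#'s lock): double-checked locking around the cache fill, so that only one concurrent request runs the expensive query. query_db is a stub standing in for the real SQL query:

```python
import threading
import time

cache = {}
cache_lock = threading.Lock()
db_hits = 0  # counts how many requests actually reached the "database"

def query_db():
    global db_hits
    db_hits += 1
    time.sleep(0.05)  # simulate a slow SQL query
    return "report-data"

def get_data(key="report"):
    if key in cache:           # fast path: no lock once the cache is warm
        return cache[key]
    with cache_lock:           # only one thread at a time past this point
        if key not in cache:   # re-check: another thread may have filled it
            cache[key] = query_db()
        return cache[key]

# 20 "concurrent requests" arriving at the same moment
threads = [threading.Thread(target=get_data) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The second check inside the lock is the crucial part: without it, every thread that queued up on the lock would still run the query one after another.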

Data statistics interface with a large number of visits

  • Requirement: a user-behavior statistics interface that records product impressions and the actions (clicking pictures, links, or otherwise) through which users enter product details
  • Problem:
    This interface is called by front-end ajax and its traffic is huge. A single page shows dozens of products; whenever scrolling brings products into view, the interface is called to record the impressions, and dozens more records are sent on every page turn.
  • Back-of-the-envelope analysis:
    Imagine 10K users visiting the page at once, each scrolling 10 products into view one by one: that is 100K requests whose data the server must write to the database. In a real production environment the request volume may well exceed this. Without a design for high concurrency, the server is on its knees in minutes.
  • Solution:
    We wrote a data-processing interface in nodejs that first stores the statistics in a redis list. (The advantage of writing the interface in nodejs is its single-threaded, asynchronous event model, which handles concurrency well and does not let data-processing logic tie up server resources and bring the server down.) We then wrote a nodejs script whose job is to dequeue data from redis and save it to the mysql database. The script runs continuously, sleeping when redis has no data to synchronize, and wakes to carry out the synchronization work.
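
A minimal sketch of that pipeline. Since this is only an illustration, Python's queue.Queue stands in for the redis list (LPUSH on write, RPOP in the worker) and a plain list stands in for the mysql table; the nodejs version would have the same shape:

```python
import queue
import threading

events = queue.Queue()  # stands in for the redis list
db = []                 # stands in for the mysql stats table

def track_impression(product_id):
    """Called per ajax request: O(1) enqueue, no DB work on the hot path."""
    events.put(product_id)

def sync_worker():
    """Long-running script: pop events and persist them, one by one."""
    while True:
        item = events.get()   # blocks (i.e. "sleeps") when the queue is empty
        if item is None:      # shutdown sentinel, for this demo only
            break
        db.append(item)       # real code: INSERT INTO stats ...

worker = threading.Thread(target=sync_worker)
worker.start()

for pid in range(100):        # a burst of tracking requests
    track_impression(pid)

events.put(None)              # tell the demo worker to stop
worker.join()
```

The web-facing call only enqueues, so a traffic burst fills the queue instead of overwhelming the database; the worker drains it at whatever rate the database can sustain.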

Server load balancing under high concurrency; sensible site construction and DB deployment

Here's what I know:

  1. Use nginx as a reverse proxy to balance load across multiple servers
  2. Deploy clustered mysql, redis, or mongodb servers, and move frequently queried, rarely changing data onto the NoSQL DB servers to reduce pressure on the main database and speed up data responses.
  3. Cache data
  4. Develop high-concurrency interfaces in languages well suited to concurrency, e.g. nodejs for web interfaces
  5. Split the deployment: a separate image server, static files on a CDN
  6. Have a DBA optimize queries and indexes
  7. Use a message-storage mechanism: append data to a message queue (a redis list), then write a tool to persist it to the database
  8. Control requests sensibly in script, e.g. prevent redundant ajax requests from users clicking repeatedly, and so on.

Concurrency testing artifact recommendation

  1. Apache JMeter
  2. Microsoft Web Application Stress Tool
  3. Visual Studio performance load


High concurrency often occurs in business scenarios with large numbers of active users and high user concentration, such as seckill activities and scheduled red-envelope grabs.
To keep the business running smoothly and give users a good interactive experience, we need to design a high-concurrency handling plan suited to our business scenario, based on factors such as the estimated concurrency.

In years of e-commerce-related product development, I have been fortunate enough to hit all kinds of pitfalls, with plenty of blood and tears along the way. This summary serves as my own archive, shared with everyone.


Server architecture

As a business grows from early development to maturity, the server architecture evolves from relatively single-machine to clustered, and then to distributed services.
A service that can support high concurrency needs a good server architecture: load balancing, master-slave clusters for the database and for the NoSQL cache, and a CDN for static files.

Most of the server work has to be built in cooperation with operations staff, so I will not go into specifics here.
The server architecture that needs to be roughly used is as follows:

  • server
    • Load balance (eg: nginx, Alibaba Cloud SLB)
    • Resource monitoring
    • distributed
  • database
    • Master-slave separation, clustering
    • DBA table optimization, index optimization, etc.
    • distributed
  • nosql
    • redis
      • Master-slave separation, clustering
    • mongodb
      • Master-slave separation, clustering
    • memcache
      • Master-slave separation, clustering
  • cdn
    • html
    • css
    • js
    • image

Concurrency testing

Businesses related to high concurrency require concurrency testing, and the amount of concurrency that the entire architecture can support can be evaluated through a large amount of data analysis.

To test high concurrency, use a third-party service or your own test servers, fire concurrent requests with testing tools, and analyze the results to estimate how much concurrency the system can support. This can serve as an early-warning reference.

Third Party Services:

  • Alibaba Cloud Performance Test

Concurrency testing tools:

  • Apache JMeter
  • Visual Studio Performance Load Test
  • Microsoft Web Application Stress Tool

Practical plan

General solution

Daily user traffic is large but relatively scattered, with occasional high concentrations of users.

Scenario: User check-in, user center, user order, etc.
Server architecture diagram: Universal

Explanation:

The services in this scenario are basically operated after the user enters the APP. Outside of event days (618, Double 11, etc.), their user load is not highly concentrated, but the tables behind them are all big-data tables and the workload is mostly queries. So we need to keep user queries from hitting the DB directly: query the cache first, and only on a cache miss query the DB and cache the result.

Caches of user-related data need distributed storage: for example, hash users by user ID into groups and spread them across different caches, so no single cache collection grows too large and query efficiency is unaffected.

Schemes such as:

  • Users sign in to earn points
    • Compute the user's distribution key and look up today's check-in info in the redis hash
    • If the check-in info is found, return it
    • If not, query the DB for today's check-in; if the user has checked in, sync the info into the redis cache
    • If the DB has no check-in record for today either, run the sign-in logic: insert today's check-in record and add the points (this whole DB operation is one transaction)
    • Cache the check-in info in redis and return it
    • Note: there are logical pitfalls under concurrency, e.g. checking in multiple times in one day and issuing points multiple times.
    • My blog post [ High Concurrency in the Eyes of Big Talk Programmers ] has related solutions.
  • User orders
    • We cache only the first page of the user's orders, 40 items per page, since users generally look only at the first page
    • When the user opens the order list, read the cache for the first page, and the DB otherwise
    • Compute the user's distribution key and look up the order info in the redis hash
    • If the order info is found, return it
    • If not, query the first page of orders from the DB, cache it in redis, and return it
  • User Center
    • Compute the user's distribution key and look up the user info in the redis hash
    • If the user info is found, return it
    • If not, query the user from the DB, cache the result in redis, and return it
  • Other business
    • The examples above mostly cache per-user data. Public (shared) cache data needs extra care, as follows
    • Note: shared cache data must account for the possibility of a stampede of DB queries under concurrency on a cache miss. You can update the cache from an admin backend, or put a lock around the DB query.
    • My blog post [ Dahua Redis Advanced ] shares cache-update problems and recommended solutions.
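
The cache-aside lookup with hash-grouped user keys described above can be sketched like this; a list of plain dicts stands in for the redis hashes, and all names (buckets, db_signins) are illustrative:

```python
NUM_BUCKETS = 16
buckets = [dict() for _ in range(NUM_BUCKETS)]  # one dict per "redis hash"
db_signins = {}  # pretend DB table: user_id -> today's check-in info

def bucket_for(user_id):
    """Hash-group users so no single cache collection grows too large."""
    return buckets[user_id % NUM_BUCKETS]

def get_signin(user_id):
    cache = bucket_for(user_id)
    if user_id in cache:             # 1. query the cache first
        return cache[user_id]
    info = db_signins.get(user_id)   # 2. cache miss: fall back to the DB
    if info is not None:
        cache[user_id] = info        # 3. back-fill the cache for next time
    return info

db_signins[42] = {"date": "2024-01-01", "points": 1}
get_signin(42)  # first call reads the DB and fills the cache
```

After the first call, repeated lookups for user 42 are served entirely from the cache bucket; only users with no record at all keep falling through to the DB.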

The above is a relatively simple high-concurrency architecture, and it holds up well while concurrency is moderate. As the business and user concurrency grow, the architecture must be continuously optimized and evolved: split the business into services, each with its own concurrency architecture, its own load-balanced servers, distributed database, and nosql master-slave cluster, e.g. a user service and an order service.


message queue

For activities such as seckills and flash grabs, users flood in at one instant and generate a burst of high-concurrency requests.

Scenario: receive red envelopes regularly, etc.
Server architecture diagram: message queue

Explanation:

The scheduled red-envelope grab in this scenario is a high-concurrency business: users flood in at the appointed time, the DB takes a critical hit in an instant, and if it cannot hold, it crashes and takes the whole business down with it.

For business like this, which is not just queries but also high-concurrency inserts and updates, the general solution above is not enough; concurrent requests would hit the DB directly.

When designing such a business, use a message queue: append the participating users' information to the queue, then have a multi-threaded program consume the queue and issue red envelopes to the users in it.

Schemes such as:

  • Receive red envelopes at a scheduled time
    • A redis list is commonly used
    • When a user joins the activity, push their participation info onto the queue
    • A multi-threaded program pops the data and performs the red-envelope issuing
    • This lets a highly concurrent crowd participate normally while avoiding the danger of a database server crash.

Additional:
Many services can be built on message queues.
For example, a scheduled SMS service: use a zset (sorted set) with the send timestamp as the score, so the SMS queue is ordered by time ascending. A program then periodically reads the first item of the zset and checks whether the current time has passed the send time; if so, it sends the SMS.
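
The timed-SMS idea can be sketched with a min-heap standing in for the redis sorted set (timestamp as score); send_sms is a stub, and tick plays the role of the periodically running reader:

```python
import heapq

pending = []  # min-heap of (send_at, message); stands in for the redis zset
sent = []

def schedule_sms(send_at, message):
    heapq.heappush(pending, (send_at, message))  # like ZADD send_at message

def send_sms(message):
    sent.append(message)  # real code would call the SMS gateway here

def tick(now):
    """Run periodically: drain every message whose send time has passed."""
    while pending and pending[0][0] <= now:   # peek at the head of the queue
        _, message = heapq.heappop(pending)   # like ZREM after a successful send
        send_sms(message)

schedule_sms(100, "hello")
schedule_sms(50, "early bird")
tick(60)  # only "early bird" is due at time 60
```

Because the queue is ordered by send time, the worker only ever has to look at the head: if the head is not due yet, nothing else is either.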


L1 cache

When high-concurrency requests to the cache server exceed the number of connections it can accept, some users hit connection timeouts and cannot read their data.

So we need a way to reduce hits on the cache server under high concurrency.

This is where a first-level cache comes in: use the site server's own memory to cache data. Note that it should hold only the small subset of data with the highest request volume, and the cached amount must be controlled so the site server's memory is not over-used and the site application keeps running normally. The first-level cache needs an expiration time measured in seconds, tuned to the business scenario. The goal is that under high concurrency, data reads hit the first-level cache instead of connecting to the nosql cache server, reducing the pressure on it.

For example, an APP's first-screen product data interface: the data is public rather than user-specific, and it is not updated frequently. Given the large request volume on such an interface, it is a good candidate for the first-level cache.
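
A minimal sketch of such a first-level cache: an in-process dict with a TTL of a few seconds sitting in front of the cache server. fetch_from_nosql is a stub for the real cache-server call, and the TTL value is illustrative:

```python
import time

L1_TTL = 3.0          # seconds; tune to the business scenario
l1_cache = {}         # key -> (expires_at, value)
nosql_hits = 0        # counts calls that reached the "cache server"

def fetch_from_nosql(key):
    global nosql_hits
    nosql_hits += 1
    return f"value-for-{key}"  # real code: a redis/memcache GET

def get(key):
    now = time.monotonic()
    entry = l1_cache.get(key)
    if entry and entry[0] > now:       # fresh L1 hit: no network call at all
        return entry[1]
    value = fetch_from_nosql(key)      # miss or expired: go to the cache server
    l1_cache[key] = (now + L1_TTL, value)
    return value

get("home_products")
get("home_products")  # second call is served from the L1 cache
```

Within each TTL window, only the first request per key reaches the cache server; everything else is answered from local memory, which is what shields the nosql tier during a burst.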

Server Architecture Diagram:Universal

With the nosql cache database properly specified and used, and the cache database cluster split by business, this basically supports the business well. After all, the first-level cache borrows the site server's own memory, so it must be used carefully.


static data

If the data behind high-concurrency requests does not change, and clients can avoid requesting it from our own servers, the servers' resource pressure drops.

If the update frequency is low and a short delay is acceptable, the data can be rendered statically into JSON, XML, HTML, or other files and uploaded to a CDN. Clients pull from the CDN first, falling back to the cache and database. When an administrator edits the data in the backend, the static file is regenerated and re-uploaded to the CDN, so during high concurrency the data is served from the CDN's servers.

CDN node synchronization has some delay, so choosing a reliable CDN provider is also important.


Other options

  • For data that is not updated frequently, the APP or PC browser can cache it locally and send the version number of its cached copy with each request. The server compares that version number with the latest one: if they differ, it queries and returns the latest data along with the new version number; if they match, it returns a status code saying the data is already up to date. This reduces server pressure on both resources and bandwidth.
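
A sketch of that version-number handshake (the same idea as HTTP ETags); the payload, version number, and function names are all invented for illustration:

```python
# Server-side state: the current data and its version number.
server_data = {"version": 3, "payload": ["item-a", "item-b"]}

def get_data(client_version):
    """Return fresh data only when the client's cached version is stale."""
    if client_version == server_data["version"]:
        # Client's local copy is current: send only a tiny status response.
        return {"status": "not_modified"}
    return {"status": "ok",
            "version": server_data["version"],
            "payload": server_data["payload"]}

first = get_data(client_version=None)               # cold client: full payload
second = get_data(client_version=first["version"])  # warm client: no payload sent
```

The savings come from the second case: for unchanged data, the response carries no payload at all, sparing both server resources and bandwidth.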
