A programmer's gauntlet: starting from 6,000 write requests per second

Background

Video sites keep your viewing history for every scene of every film: they remember in detail which parts you watched several times and which long stretches you skipped. Reportedly, this data can be analyzed to work out what you like (Japanese films, say) so that recommendations can be targeted at you...

Although the feature looks simple, the amount of data involved is very large: in the extreme case it is the number of users multiplied by the number of videos.

So, with only two web servers and a single SQL Server machine, how do we handle write requests for this much data? Why are they write requests? Because every second of video a user watches has to be recorded, for example: the user has watched the tenth second of the video. To build this feature, a few things need to be defined first:

  1. The data definition for recording what a user has watched
  2. The data protocol for interacting with the client
  3. The data format stored in the database
  4. How to relieve write pressure on the server (after all, the number of requests hitting a single server is quite large)

Solution

Defining the user's viewing progress

If a video is one hour long, that is 3,600 seconds and therefore 3,600 watched/not-watched states. Each second has only two possible states, watched and not watched, so one bit is enough. A byte has 8 bits, so one byte can represent the viewing state of 8 seconds. On this basis, the higher the base used for encoding, the more states the same number of characters can represent.

Each time the client uploads new data, it has to be combined with the data already on the server using a bitwise OR. For example: 01000 means the 2nd second has been watched; the client then uploads 00011, meaning the 4th and 5th seconds have been watched; the merged result 01011 means the user has watched seconds 2, 4 and 5 of the video. It is a simple operation, but at high volume the CPU cost should not be underestimated.

         first byte       second byte
index:   0 1 2 3 4 5 6 7  0 1 2 3 4 5 6 7
bit:     1 0 0 0 1 0 0 0  0 1 0 0 0 0 0 0
hex:     0x88             0x40
string:  8840
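
As a minimal sketch of the layout in the diagram above, the following helper keeps one bit per second of video and encodes the result as a hexadecimal string. The class and method names (ProgressBitmap, Create, MarkWatched, IsWatched, ToHex) are illustrative assumptions, not taken from the original code; the only thing borrowed from the diagram is that index 0 of each byte is the most significant bit.

```csharp
using System;

// Hypothetical helper; names are illustrative and not part of the original project.
public static class ProgressBitmap
{
    // One bit per second of video, rounded up to whole bytes.
    public static byte[] Create(int videoLengthSeconds)
    {
        return new byte[(videoLengthSeconds + 7) / 8];
    }

    // Marks one second as watched. Index 0 of each byte is the most
    // significant bit, matching the diagram (second 0 -> 0x80).
    public static void MarkWatched(byte[] bitmap, int second)
    {
        bitmap[second / 8] |= (byte)(0x80 >> (second % 8));
    }

    // Reports whether a given second has been marked as watched.
    public static bool IsWatched(byte[] bitmap, int second)
    {
        return (bitmap[second / 8] & (0x80 >> (second % 8))) != 0;
    }

    // Encodes the bitmap as two hexadecimal digits per byte.
    public static string ToHex(byte[] bitmap)
    {
        return BitConverter.ToString(bitmap).Replace("-", "");
    }
}
```

Marking seconds 0, 4 and 9 in a 16-second bitmap gives the string 8840 from the diagram. Under this layout a one-hour video needs 450 bytes, or 900 hexadecimal characters.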

Client interaction protocol

Only the client knows the user's real-time viewing progress, so the client has to upload the progress data to the server. For exchanging this binary data with the server, hexadecimal is a good choice because it is so widely supported. Of course, you could choose base 100 and it would not matter, as long as both sides support it and can parse it correctly.

Database data format

The data types supported differ from database to database, so I will not go into detail here. Whatever the format, the less space it takes the better, but the amount of computation the business requires also has to be taken into account.

Solving the problems

CPU performance issues

After all, every upload means merging the user's latest viewing data with the old data, and with a large number of users this cost should not be underestimated. After weighing the various conditions, the merge is done on decimal integers: the hexadecimal data uploaded by the client is converted to decimal and then merged with the stored viewing history (also decimal). This part of the CPU cost cannot be avoided. The conversion code is:

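    // Requires: using System; using System.Collections.Concurrent; using System.Collections.Generic;

    // Records waiting to be inserted
    ConcurrentQueue<UserVideoInfo> AddQueue = new ConcurrentQueue<UserVideoInfo>();
    // Records waiting to be updated (referenced by the batch loop; this declaration is assumed)
    ConcurrentQueue<UserVideoInfo> UpdateQueue = new ConcurrentQueue<UserVideoInfo>();

    // Splits the hexadecimal string into two-character groups and converts each group to a decimal value
    protected List<int> ConvertToProgressArray(string progressString)
    {
        if (string.IsNullOrWhiteSpace(progressString))
        {
            return null;
        }
        // The length must be a multiple of 2 (two hex digits per byte)
        if (progressString.Length % 2 != 0)
        {
            return null;
        }
        var proStrSpan = progressString.AsSpan();
        List<int> ret = new List<int>(proStrSpan.Length / 2);

        for (int i = 0; i < proStrSpan.Length; i += 2)
        {
            // Note: int.Parse throws FormatException if the pair is not valid hexadecimal
            ret.Add(int.Parse(proStrSpan.Slice(i, 2).ToString(), System.Globalization.NumberStyles.HexNumber));
        }
        return ret;
    }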
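
The merge step itself is not shown in the original. A minimal sketch, assuming both the stored history and the upload have already been parsed into per-byte values (0 to 255) by something like ConvertToProgressArray, might look like this. ProgressMerger, Merge and ToHexString are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; not part of the original project.
public static class ProgressMerger
{
    // Bitwise-ORs two per-byte progress arrays. The shorter array is treated
    // as if it were padded with zeros, i.e. "not watched".
    public static List<int> Merge(IReadOnlyList<int> stored, IReadOnlyList<int> uploaded)
    {
        int length = Math.Max(stored.Count, uploaded.Count);
        var merged = new List<int>(length);
        for (int i = 0; i < length; i++)
        {
            int oldValue = i < stored.Count ? stored[i] : 0;
            int newValue = i < uploaded.Count ? uploaded[i] : 0;
            merged.Add(oldValue | newValue);
        }
        return merged;
    }

    // Converts the merged values back to two hexadecimal digits per byte.
    public static string ToHexString(IReadOnlyList<int> progress)
    {
        var sb = new StringBuilder(progress.Count * 2);
        foreach (int value in progress)
        {
            sb.Append(value.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

For example, merging a stored history of 0x88, 0x40 with an upload of 0x03 gives 0x8B, 0x40, which encodes back to the string 8B40.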

The number of client requests

If ten thousand users are watching videos at the same time and the upload interval is 2 seconds, that means 5,000 requests per second. Since this is log-type business data, some data loss can be tolerated. For data like this, the client can record to a local buffer first; there is no need to upload a record every second. For example, say the client is agreed to upload once every 30 seconds; if the user closes the client, any records that were not uploaded successfully can be sent the next time it starts.
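
To make the effect of the interval concrete: 10,000 users uploading every 2 seconds is 10,000 / 2 = 5,000 requests per second, while the same users uploading every 30 seconds is roughly 333 requests per second. The sketch below shows one way a client could buffer locally and only produce an upload once the agreed interval has passed. It is an assumption for illustration only; ClientProgressBuffer and its members are hypothetical names, and the real client is not shown in the original.

```csharp
using System;

// Hypothetical client-side buffer; not part of the original project.
public sealed class ClientProgressBuffer
{
    private readonly byte[] _bitmap;
    private readonly TimeSpan _uploadInterval;
    private DateTime _lastUpload;
    private bool _hasUnsentData;

    public ClientProgressBuffer(int videoLengthSeconds, TimeSpan uploadInterval, DateTime startedAt)
    {
        _bitmap = new byte[(videoLengthSeconds + 7) / 8];
        _uploadInterval = uploadInterval;
        _lastUpload = startedAt;
    }

    // Called for each second of playback; only touches local memory.
    public void Record(int second)
    {
        _bitmap[second / 8] |= (byte)(0x80 >> (second % 8));
        _hasUnsentData = true;
    }

    // Returns the hexadecimal payload to upload when the interval has elapsed
    // and something new was recorded; otherwise returns null (no request).
    public string TryGetUpload(DateTime now)
    {
        if (!_hasUnsentData || now - _lastUpload < _uploadInterval)
        {
            return null;
        }
        _lastUpload = now;
        _hasUnsentData = false;
        return BitConverter.ToString(_bitmap).Replace("-", "");
    }
}
```

Because the payload is the whole bitmap and the server merges with a bitwise OR, sending the same seconds twice is harmless in this sketch.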

Database pressure

If the database were updated on every single request, then by the calculation in the previous section there would be up to 5,000 update requests per second. Each user's viewing record is loaded into the in-memory cache when they watch a video. On careful analysis, since this is log-type data, there is no need to update the database on every request: update the cache first, then write to the database on a timer.

Because of the data volume, all updates are sent to a job queue, and the queue's executor writes to the database in batches according to its configuration, which performs far better than updating the database one record at a time. This approach is used in many log-type workloads, since batch updates put much less pressure on the database. The code looks something like this:
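
The cache layer is not shown in the original, so the following is only a sketch of the cache-first idea under stated assumptions: progress is held as a byte bitmap per user and video, the ids are 64-bit integers, and a timer calls TakeDirty periodically to get the entries to write in one batch. ProgressCache, Merge and TakeDirty are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical write-behind cache; not part of the original project.
public sealed class ProgressCache
{
    // Key: (userId, videoId). Value: merged progress bitmap.
    private readonly ConcurrentDictionary<(long, long), byte[]> _entries =
        new ConcurrentDictionary<(long, long), byte[]>();

    // Keys that changed since the last flush.
    private readonly ConcurrentDictionary<(long, long), bool> _dirty =
        new ConcurrentDictionary<(long, long), bool>();

    // Called on every request: merges into memory only, no database access.
    public void Merge(long userId, long videoId, byte[] uploaded)
    {
        var key = (userId, videoId);
        _entries.AddOrUpdate(
            key,
            _ => (byte[])uploaded.Clone(),
            (_, existing) => Or(existing, uploaded));
        _dirty[key] = true;
    }

    // Called by a timer: returns the entries changed since the previous call
    // so that they can be written to the database in one batch.
    public Dictionary<(long, long), byte[]> TakeDirty()
    {
        var batch = new Dictionary<(long, long), byte[]>();
        foreach (var key in _dirty.Keys)
        {
            if (_dirty.TryRemove(key, out _) && _entries.TryGetValue(key, out var bitmap))
            {
                batch[key] = bitmap;
            }
        }
        return batch;
    }

    // Returns a new array holding the bitwise OR of both inputs.
    private static byte[] Or(byte[] a, byte[] b)
    {
        var result = new byte[Math.Max(a.Length, b.Length)];
        for (int i = 0; i < result.Length; i++)
        {
            byte x = i < a.Length ? a[i] : (byte)0;
            byte y = i < b.Length ? b[i] : (byte)0;
            result[i] = (byte)(x | y);
        }
        return result;
    }
}
```

With this shape, many requests for the same user and video between two flushes collapse into a single database write.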

    // Requires: using System; using System.Collections.Generic; using System.Threading.Tasks;

    public Task<int> AddUserVideoData(UserVideoInfo data, DBProcessEnum processType = DBProcessEnum.Update)
    {
        // Requests only enqueue; the background loop writes to the database in batches.
        // Only the Add path is shown in this excerpt.
        if (processType == DBProcessEnum.Add)
        {
            AddQueue.Enqueue(data);
        }

        return Task.FromResult(1);
    }

    void MulProcessData()
    {
        // Maximum number of records written per batch
        const int maxNumber = 50;
        List<UserVideoInfo> data = new List<UserVideoInfo>();
        while (true)
        {
            try
            {
                if (AddQueue.IsEmpty && UpdateQueue.IsEmpty)
                {
                    // Nothing queued: sleep briefly instead of spinning
                    System.Threading.Thread.Sleep(500);
                }
                else
                {
                    // Build the next batch from the add queue
                    // (draining UpdateQueue is not shown in this excerpt)
                    data.Clear();
                    while (data.Count < maxNumber && AddQueue.TryDequeue(out UserVideoInfo value))
                    {
                        // If the batch already holds a record for this user and video,
                        // replace it so that only the latest one is written
                        int index = data.FindIndex(s => s.UserId == value.UserId && s.VideoId == value.VideoId);
                        if (index >= 0)
                        {
                            data[index] = value;
                        }
                        else
                        {
                            data.Add(value);
                        }
                    }
                    if (data.Count > 0)
                    {
                        UserVideoProgressProxy.Add(data);
                    }
                }
            }
            catch (Exception err)
            {
                // Do not swallow errors silently; report them and keep the loop alive
                Console.Error.WriteLine(err);
            }
        }
    }

Final words

In fact, a relational database like SQL Server is not a good fit for this kind of high-IO workload; in a simple scenario like this, NoSQL can handle much higher IO. I may switch to Redis and try it another day, and I expect it to do much better than SQL Server.

Origin www.cnblogs.com/zhanlang/p/12446675.html