Summary of Weibo Technology Architecture

Sina Weibo:

User splitting, data splitting, asynchronous processing, interface monitoring, remote distribution across multiple data centers, real-time push, platform security

The first version used a LAMP architecture, whose advantage was that we could build the system very quickly. Viewed from an architectural perspective, the core problem Weibo has to solve is publish and subscribe. Our first version used a push model: if one of our star users has 100,000 followers, then when that user posts a Weibo, we store the message as 100,000 copies.
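The push model described above can be sketched in a few lines. This is only an illustration under assumptions: the in-memory dictionaries and the names `follow`, `publish_push`, and `read_timeline` are hypothetical stand-ins, not the actual Weibo code or schema.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follower table and per-user inboxes.
followers = defaultdict(set)   # author uid -> set of follower uids
inboxes = defaultdict(list)    # follower uid -> list of (author uid, text) copies


def follow(follower_uid, author_uid):
    """Record that follower_uid follows author_uid."""
    followers[author_uid].add(follower_uid)


def publish_push(author_uid, text):
    """Push model: write one copy of the post per follower at publish time."""
    for follower_uid in followers[author_uid]:
        inboxes[follower_uid].append((author_uid, text))
    return len(followers[author_uid])  # number of copies written


def read_timeline(uid):
    """Reading is cheap because the inbox already holds every pushed post."""
    return list(inboxes[uid])
```

The cost of publishing grows with the number of followers, which is why a star user with 100,000 followers produces 100,000 writes for a single post.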

As for the technical details of the first version, it was a typical LAMP architecture using the MyISAM storage engine, whose advantage is that it is very fast.

Another technique is MPSS, that is, running multiple ports on the same server. Why use MPSS? Suppose we build an Internet application made up of three units. We can deploy it in two ways. In mode 1, the three units are deployed on three separate servers. In mode 2, all three units are deployed on every server. I recommend mode 2. It solves two problems: load balancing, because each unit is served by multiple nodes, and single points of failure. In mode 1, the failure of any node affects our service; in mode 2, the failure of any single node does not affect the system as a whole.
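The difference between the two deployment modes can be checked with a small sketch. The server names `s1` to `s3` and unit names `A` to `C` are hypothetical, and the helper only models which units remain reachable after one server fails.

```python
# Mode 1: one unit per server. Mode 2: every unit on every server.
MODE_1 = {"s1": {"A"}, "s2": {"B"}, "s3": {"C"}}
MODE_2 = {"s1": {"A", "B", "C"}, "s2": {"A", "B", "C"}, "s3": {"A", "B", "C"}}


def available_units(deployment, failed_server):
    """Return the set of units still served after one server fails."""
    alive = set()
    for server, units in deployment.items():
        if server != failed_server:
            alive.update(units)
    return alive
```

With `MODE_1`, losing `s2` removes unit `B` entirely; with `MODE_2`, all three units stay available.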

This version ran into several technical problems. The first was publishing delay: a star user has many followers, so the system takes a long time to process each post.

Consider how the system can be improved. First, the push model is clearly the main cause of the delay, so we needed to deal with it. Second, we have more and more users. A database table can grow from one million to one hundred million rows, and data at different scales has to be handled differently. Our first version used a single-database, single-table model; once the number of users grows beyond what it can handle, the data has to be split. Third is the table-locking problem, for which we considered changing the storage engine. Finally, publishing was too slow, so we considered an asynchronous mode.

In the second version, we modularized the system. We first introduced layers. The bottom layer is the base layer, where we first split the data; on the far right of the figure, publishing was made asynchronous. In the second layer, the service layer, we designed the basic units of Weibo as individual service modules.

The biggest improvement was to the push model. Suppose we divide users into valid and invalid users. For example, one of our users has 100 followers. When that user posts a Weibo, we do not need to push it to all 100 followers, because perhaps 50 of them will not read it right away, and pushing to them synchronously accomplishes nothing. Once users are divided into valid and invalid, we treat them differently. For example, if we define the users who have logged in that day as valid users, we only need to push synchronously to the followers who have logged in that day. The pressure is relieved immediately, and the delivery delay also decreases.
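A minimal sketch of the improved push, assuming "valid" means "logged in today" as in the example above. The function names and the `last_login` mapping are hypothetical; how invalid followers later catch up is not described in the text and is left out.

```python
from datetime import date


def split_followers(follower_uids, last_login, today):
    """Return (valid, invalid): valid followers logged in today, the rest did not."""
    valid, invalid = [], []
    for uid in follower_uids:
        if last_login.get(uid) == today:
            valid.append(uid)
        else:
            invalid.append(uid)
    return valid, invalid


def publish_to_valid(author_uid, text, follower_uids, last_login, today, inboxes):
    """Push synchronously only to valid followers; return how many copies were written."""
    valid, _ = split_followers(follower_uids, last_login, today)
    for uid in valid:
        inboxes.setdefault(uid, []).append((author_uid, text))
    return len(valid)
```

The synchronous work now scales with the number of followers who logged in that day rather than with the total follower count.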

There are many ways to split data. The most common one in Internet products is to split by the user's UID. However, one characteristic of Weibo is that users mostly access the most recent data, so we split Weibo data by time, for example one table per month, which lets us split different time ranges in different ways. The second consideration is to separate content from index. For a microblog, the publisher's uid and the microblog id are the index data, and the 140-character text is the content data. Once they are separated, the content becomes a simple key-value store, and key-value data is the easiest to scale. Splitting the index data is more challenging. For example, a user has published a thousand microblogs and our front-end interface pages through them; if the user asks for the fifth page, we need to locate those records quickly.
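The two ideas in this paragraph, monthly tables and separating index from content, can be sketched as follows. The table name pattern `weibo_index_YYYYMM`, the in-memory stores, and the page size of 20 are assumptions made for illustration only.

```python
from datetime import datetime

content_store = {}  # key-value content: weibo id -> text
index_rows = []     # index data: (uid, weibo_id, created_at)


def index_table_for(created_at):
    """Route a post to a monthly table such as weibo_index_202401 (hypothetical name)."""
    return f"weibo_index_{created_at:%Y%m}"


def save(uid, weibo_id, text, created_at):
    """Store content and index separately; return the monthly index table chosen."""
    content_store[weibo_id] = text
    index_rows.append((uid, weibo_id, created_at))
    return index_table_for(created_at)


def page(uid, page_no, page_size=20):
    """Return one page of a user's posts, newest first, via the index then the content."""
    newest_first = sorted(index_rows, key=lambda row: row[2], reverse=True)
    ids = [weibo_id for (owner, weibo_id, _) in newest_first if owner == uid]
    start = (page_no - 1) * page_size
    return [content_store[weibo_id] for weibo_id in ids[start:start + page_size]]
```

Only the index lookup needs ordering and paging logic; fetching the text is a plain key lookup, which is what makes the content side easy to scale.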

Asynchronous processing. Publishing is a very heavy operation: the post has to be stored, indexed, and passed on to back-end processing. If all of this were done synchronously, the user would wait a long time at the front end. And if one step failed, the user would be told that publishing failed even though the post had already been stored, which leads to data inconsistency. So we made publishing asynchronous: once the post has been accepted, we report success to the user, and the remaining work is processed from a message queue in the background.
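A minimal sketch of this asynchronous publish path, using Python's standard `queue` and `threading` modules as a stand-in for a real message queue. Which step stays synchronous is an assumption here (storing the post); the names `publish_async`, `worker`, `store`, and `search_index` are hypothetical.

```python
import queue
import threading

store = {}             # stand-in for durable storage
search_index = []      # stand-in for the index built in the background
tasks = queue.Queue()  # stand-in for the message queue


def publish_async(weibo_id, text):
    """Do only the cheap step synchronously, then report success to the user."""
    store[weibo_id] = text
    tasks.put(weibo_id)  # the heavy work is deferred to the background
    return "ok"


def worker():
    """Background consumer: drains the queue and performs the slow steps."""
    while True:
        weibo_id = tasks.get()
        if weibo_id is None:  # sentinel value: stop the worker
            tasks.task_done()
            break
        search_index.append(weibo_id)  # e.g. indexing
        tasks.task_done()
```

A caller would start `worker` in a thread, call `publish_async` for each post, and use `tasks.join()` when it needs to wait for the background work to finish.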

For static content, the first step is to use a CDN for acceleration. In addition, because of data pressure and traffic peaks, we need to split data, functions, and deployment as much as possible, and do capacity planning in advance.

In the third version, we first turned the underlying components into basic services. The basic services include distributed storage, and we did some work on decentralization and automation. On top of the basic services are the platform services, where we built small services for the functions most commonly used in Weibo. Above those are the application services, which are designed around the needs of the various applications on the platform.

The platform services and application services are separated, which provides module isolation: even if the traffic to an application service becomes too high, the platform services are not affected. In addition, we improved the Weibo engine and implemented a hierarchical relationship. We changed the users' follow relationships to a multi-dimensional index structure, which greatly improved performance. The fourth aspect is the improvement of the counters.

 

Distributed storage needs to solve the problem of many-to-many data replication.

There are three strategies for replication. The first is Master/Slave, which has two disadvantages. One is that the Master is centralized: if the Master is in Beijing, access from Guangzhou will be very slow. The other is the single point of risk: if the Master in Beijing fails, can it be moved to Guangzhou immediately? Doing so loses the data in a time window and requires manual intervention, and in daily use Guangzhou users accessing the Master in Beijing see large delays. So in general the first scheme is not considered.

The second is the Multi-Master scheme, which requires the application to avoid conflicts, that is, the same data must not be modified in multiple places at once. This is not particularly difficult for Weibo. Our users usually publish from only one place; a user will not publish or modify their own information in Guangzhou and Beijing at the same time. So our application already avoids this situation.

The third is Paxos, which can achieve strongly consistent writes: if a write succeeds, it must have succeeded in multiple data centers, which obviously means a very large delay.

To sum up, Multi-Master is the most mature strategy, but there is really no mature product for it at present.

The front-end application writes the data to the database, and the data then goes through a message broker, a technology we developed ourselves, which in effect broadcasts it to multiple data centers. This works not only for two data centers but also for three or four. Concretely, the data is distributed to multiple points through message broadcasting: our data is submitted to a broker, and the broker synchronizes it to the other data centers for us, so our application does not need to care how the data gets there.
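The broker idea can be sketched with a small in-memory model. This is not the in-house technology mentioned above, whose design the text does not describe; the `Broker` class, its ordered log, and the per-data-center offsets are assumptions chosen to show the "submit once, broker replicates" contract.

```python
class Broker:
    """Hypothetical broker: the application submits once, the broker fans out."""

    def __init__(self, data_centers):
        self.replicas = {name: {} for name in data_centers}  # one store per data center
        self.offsets = {name: 0 for name in data_centers}    # messages applied so far
        self.log = []                                        # accepted messages, in order

    def submit(self, key, value):
        """Accept a write; the caller does not wait for any data center."""
        self.log.append((key, value))
        return len(self.log)  # sequence number of the accepted message

    def deliver(self, name):
        """Apply the messages this data center has not yet seen; return how many."""
        pending = self.log[self.offsets[name]:]
        for key, value in pending:
            self.replicas[name][key] = value
        self.offsets[name] = len(self.log)
        return len(pending)
```

Because each data center tracks its own offset into the log, a slow or distant site can catch up later without the application being involved.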

What is the benefit of using such a message broker? Consider how Yahoo did it. First, once data has been submitted, it will not disappear before it is written to the db; I only need to submit the data successfully and do not need to care how it reaches each data center. Second, Yahoo's YMB is a message broker product, and its one remarkable point is that it is designed for wide area networks. It handles the multi-data-center concerns internally, so our application does not need to pay attention to them. Its principle is similar to the technology we are currently developing ourselves.

How does the push architecture achieve real-time performance at the lowest level of the architecture? Starting from the upper left corner of the figure, after a microblog is published in our system, we put it into a message queue; a message queue handler then takes it, processes it, and writes it to the db. If we did not persist the data, then because pushed data must not be lost, we would have to write a very complicated program to store it asynchronously, which would make the system complex and introduce instability. We have also tested the approach with persistence: the whole push process takes between 100 and 200 milliseconds, meaning that data can be pushed out within that time.

 

Platform security. Since our interface is completely open, we need to prevent many kinds of malicious behavior. Many people worry that, with an open interface, someone could use it to send spam advertisements or inflate follower counts. How does our technical architecture prevent this? Our security architecture does three things. At the top is real-time processing, which judges whether a post is an advertisement or spam based on, for example, posting frequency and content similarity. In the middle is a log processor, which judges based on behavior: some behaviors are hard to stop with real-time interception alone, so we built an offline correction module that lets us remove, for example, accounts that post advertisements after the fact, to keep the platform healthy. Finally, monitoring provides a further dimension of content security. The current content security structure is roughly a 541 system, meaning that real-time interception achieves about 50% of the prevention and offline analysis about 40%.
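A real-time check based on frequency and content similarity might look like the sketch below. The thresholds, the sliding window, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions for illustration; the text does not say how the actual filter is implemented.

```python
from collections import defaultdict, deque
from difflib import SequenceMatcher

MAX_POSTS_PER_WINDOW = 5    # assumed frequency limit
WINDOW_SECONDS = 60         # assumed sliding window length
SIMILARITY_THRESHOLD = 0.9  # assumed cutoff for "nearly identical" text

recent = defaultdict(deque)  # uid -> deque of (timestamp, text)


def looks_like_spam(uid, text, now):
    """Flag a post if the user posts too often or repeats near-identical content."""
    history = recent[uid]
    # Drop posts that have fallen out of the sliding window.
    while history and now - history[0][0] > WINDOW_SECONDS:
        history.popleft()
    too_frequent = len(history) >= MAX_POSTS_PER_WINDOW
    too_similar = any(
        SequenceMatcher(None, text, old).ratio() >= SIMILARITY_THRESHOLD
        for _, old in history
    )
    history.append((now, text))
    return too_frequent or too_similar
```

A check like this only sees recent behavior per user, which is consistent with the point above that some behaviors have to be caught by offline analysis instead.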

 

The Weibo platform needs to provide users with safe applications that offer a good experience, and to create a fair environment for developers, so our interface needs clear and safe rules. A call from an app to our interface passes through several layers: the first divides calls into different business modules, the second is the security layer, and the third is the permission layer. These are the two dimensions of our platform security: interface security and content security.

 

