A Brief Analysis of How CAT Works


This article briefly analyzes how CAT, a link tracing and monitoring tool, works. It draws mainly on the official documentation and other materials.


Link Tracing System Design Ideas

The design of a link tracing (distributed tracing) system needs to consider three aspects:

  • event logs
    • output
    • collection and buffering
    • processing and aggregation
    • storage and query
  • link tracing
    • trace and span
    • data collection method
  • aggregated metrics
    • metrics collection
    • storage and query
    • monitoring and alerting

This section mainly discusses link tracing. The difficulty of link tracing lies in:

  • How to quickly collect and filter the required logs from a large number of discrete logs, and connect them in the order of link execution for visual display, that is, visualized full-link log tracing.

Visualized full-link log tracing needs to solve two problems:

  • How to Organize Business Logs Efficiently
    • To trace business efficiently, the business logic must first be described accurately and completely to form a panorama of the business logic. Business tracing then restores the scene of a business execution within that panorama, using the log data produced during execution.
  • How to Dynamically Concatenate Business Logs
    • The log data produced while business logic executes is stored discretely. What is needed is to dynamically connect the logs of each logical node as the business logic executes, and so restore the complete execution scene.

How to Organize Business Logs Efficiently

By abstracting business logic, business logic links are defined:

  • Logical node: The logic of a business system can be split by business function into independent business logic units, that is, logical nodes. A logical node can be a local method or a remote call such as an RPC.
  • Logical link: The business system supports many business scenarios, and each scenario corresponds to a complete business process, which can be abstracted into a logical link composed of logical nodes.

A business trace is the restoration of a certain execution of a logical link. The logical link completely and accurately describes the business logic panorama, and as a carrier, it can realize the efficient organization of business logs.



How to Dynamically Concatenate Business Logs

Since logical nodes often interact with each other through MQ, RPC, and similar mechanisms, the distributed parameter pass-through capability provided by distributed session tracking can be used to concatenate business logs dynamically:

  • By continuously passing parameters through the executing thread and across network communication, the identifiers of the link and its nodes are carried along without interruption while the business logic executes, which colors the discrete logs.
  • Based on these identifiers, the colored discrete logs are dynamically attached to the node being executed, gradually converging into a complete logical link, and finally achieving efficient organization and visual display of the business execution scene.

Unlike the distributed session tracking scheme, when several distributed calls need to be concatenated together, a common id must be chosen as the identifier according to the business logic.

For example, the audit scenario above involves two RPC calls. To ensure that both executions are concatenated into the same logical link, the same "task id" is chosen, based on the audit business scenario, as the identifier for both the initial audit and the re-examination. This concatenates the whole logical link of the audit scenario and allows the execution scene to be restored.


General solution

With the two basic problems of efficient organization and dynamic concatenation of logs clarified, the general solution can be broken down into the following steps: link definition, link coloring, link reporting, and link storage.


Link definition

"Link definition" means using a specific language to statically describe a complete logical link. A link usually consists of multiple logical nodes combined according to certain business rules. The business rules are the execution relationships between the logical nodes, including serial, parallel, and conditional branching.

A DSL (Domain Specific Language) is a computer language designed specifically to solve a certain class of tasks. It can define the combination relationships (business rules) of a series of nodes (logical nodes) through JSON or XML. This solution therefore uses a DSL to describe logical links, taking a logical link from abstract definition to concrete implementation.


  • Logical Link 1 - DSL:
[
  {
    "nodeName": "A",
    "nodeType": "rpc"
  },
  {
    "nodeName": "Fork",
    "nodeType": "fork",
    "forkNodes": [
      [
        {
          "nodeName": "B",
          "nodeType": "rpc"
        }
      ],
      [
        {
          "nodeName": "C",
          "nodeType": "local"
        }
      ]
    ]
  },
  {
    "nodeName": "Join",
    "nodeType": "join",
    "joinOnList": [
      "B",
      "C"
    ]
  },
  {
    "nodeName": "D",
    "nodeType": "decision",
    "decisionCases": {
      "true": [
        {
          "nodeName": "E",
          "nodeType": "rpc"
        }
      ]
    },
    "defaultCase": [
      {
        "nodeName": "F",
        "nodeType": "rpc"
      }
    ]
  }
]

Link coloring

"Link coloring" means that, during link execution, a concatenation identifier is passed along transparently so that it is clear which link is being executed and which nodes have been executed.

Link coloring consists of two steps:

  • Step 1: Determine the concatenation identifier. When a logical link starts, determine a unique identifier that makes clear which link and which nodes will be executed next.
    • Link unique identifier = business identifier + scenario identifier + execution identifier (the three identifiers jointly determine "a certain execution under a certain business scenario")
    • Business identifier: gives the link its business meaning, such as "user id" or "activity id".
    • Scenario identifier: gives the link its scenario meaning; for example, the current scenario is "Logical Link 1".
    • Execution identifier: gives the link its execution meaning. If only a single call is involved, "traceId" can be used directly; if multiple calls are involved, choose one "common id" shared by those calls according to the business logic.
    • Node unique identifier = link unique identifier + node name (the two identifiers jointly determine "a logical node in a certain execution under a certain business scenario")
    • Node name: the unique name of the node preset in the DSL, such as "A".
  • Step 2: Pass the concatenation identifier. While the logical link executes, the concatenation identifier is passed transparently along the complete distributed link, and the nodes that have been executed are dynamically attached to the link, which colors the link. For example, in "Logical Link 1":
    • When node "A" is triggered, the concatenation identifier starts to be passed to the subsequent links and nodes, and the whole link is gradually colored as the business process executes.
    • When the identifier reaches node "E", it means that the conditional branch "D" evaluated to "true", and node "E" is dynamically attached to the executed link.
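The identifier composition and the dynamic concatenation described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name, the ":" separator, and the in-memory map are hypothetical and do not come from the original solution.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the names and the ":" separator are assumptions for illustration.
class LinkColoringSketch {
    // Executed node names, grouped by the link identifier they were colored with.
    private final Map<String, List<String>> executed = new LinkedHashMap<>();

    // Link unique identifier = business identifier + scenario identifier + execution identifier.
    static String linkId(String businessId, String scenarioId, String executionId) {
        return businessId + ":" + scenarioId + ":" + executionId;
    }

    // Node unique identifier = link unique identifier + node name.
    static String nodeId(String linkId, String nodeName) {
        return linkId + ":" + nodeName;
    }

    // Called by a node when it runs, using the identifier that was passed along to it.
    void report(String linkId, String nodeName) {
        executed.computeIfAbsent(linkId, k -> new ArrayList<String>()).add(nodeName);
    }

    // Returns the nodes executed so far for one link, in execution order.
    List<String> path(String linkId) {
        return executed.getOrDefault(linkId, new ArrayList<String>());
    }
}
```

Because every report carries the same link identifier, logs from different executions never mix, and the path of one execution can be read back in order.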

Link reporting

"Link reporting" means that, during link execution, logs are reported organized by link, so that the business scene is preserved accurately.

The reported log data includes node logs and business logs.

  • Node logs are used to draw the executed nodes of the link. They record the start, end, input, and output of each node.
  • Business logs show how the specific business logic of a link node executed. They record any data that explains the business logic, including the input and output parameters exchanged with upstream and downstream systems, intermediate variables of complex logic, and exceptions thrown during execution.

Link storage

"Link storage" means storing the logs reported during link execution so that they can be used for later "scene restoration". Reported logs fall into three categories: link logs, node logs, and business logs.

  • Link log: for a single execution of the link, the basic information of the link extracted from the logs of the start node and the end node, including link type, link meta information, and link start/end time.
  • Node log: for a single execution of the link, the basic information of each executed node, including node name, node status, and node start/end time.
  • Business log: for a single execution of the link, the business log information within each executed node, including log level, log time, and log data.

The following figure shows the storage model of link storage. It includes link logs, node logs, business logs, and link metadata (configuration data), and it is a tree structure in which the business identifier serves as the root node and is used for subsequent link queries.

insert image description here


Cat principle

The overall requirement of monitoring is to detect faults quickly, locate them quickly, and assist in optimizing program performance. To achieve this, the monitoring system needs to meet the following requirements:

  • Real-time processing: The value of information diminishes over time, especially during incident handling.
  • Full data: The initial design goal was to collect the full amount of data, which brings many benefits.
  • High availability: When all applications are down, the monitoring system must still be standing and tell engineers what happened, so that problems can be located and faults recovered.
  • Fault tolerance: A failure of CAT itself should not affect the normal operation of the business. When CAT is down, applications should not be affected; only the monitoring capability is temporarily weakened.
  • High throughput: Restoring the truth requires all-round monitoring and measurement, so very high processing throughput is necessary.
  • Scalability: The monitoring system supports distributed, cross-IDC deployment and scales horizontally.
  • No guarantee of reliability: Message loss is allowed, which is a very important trade-off. Currently, the CAT server achieves four nines of reliability. Designing a reliable system is very different from designing an unreliable one.

Throughout its development, CAT has adhered to the principle that a simple architecture is the best architecture. CAT is mainly divided into three modules: cat-client, cat-consumer, and cat-home.

  • cat-client provides the underlying SDK that business code and middleware use for instrumentation.
  • cat-consumer analyzes the data sent by clients in real time.
  • cat-home acts as the control end that presents the data to users.

In actual development and deployment, cat-consumer and cat-home are deployed in the same JVM, and each CAT server can act as both a consumer and a home. This reduces the number of layers in CAT and also improves the stability of the whole system.


Client principle

Client design is the core of CAT's system design. The client must offer a simple API and high reliability: monitoring is only a side path of the company's core business flow, and it must not affect business performance in any scenario.

When collecting data, the CAT client uses ThreadLocal, also known as thread-local storage. ThreadLocal provides a separate copy of a variable's value for each thread that uses the variable. It is a thread-binding mechanism in Java: each thread can change its own copy independently, without conflicting with the copies of other threads.

In the monitoring scenario, the services provided to users run in web containers such as Tomcat or Jetty, and back-end RPC servers such as Dubbo or Pigeon are also implemented on top of thread pools. When processing business logic, the business side basically calls back-end services, databases, caches, and so on within a single thread, takes the returned data, applies the business logic, and finally presents the result to the user. It is therefore very appropriate to store all monitoring data for a request as a monitoring context in a thread-local variable.
insert image description here
As shown in the figure above, while the business code executes, the monitoring data for the request is stored in the thread context, which is in fact a monitoring tree. When the business thread finishes executing, the monitoring object is placed in an asynchronous in-memory queue, and a CAT consumer thread asynchronously sends the data in the queue to the server.

The process can be summarized as follows:

  • The business thread generates a message and hands it to the message Producer, which stores the message in the message stack of the business thread.
  • When the business thread notifies the message Producer that the message has ended, the Producer builds a message tree from its message stack and places it in the in-memory message queue.
  • The message reporting thread watches the message queue, builds the final message from the message tree, and reports it to the CAT server.
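The three steps above can be sketched end to end with standard Java concurrency classes. This is a simplified model, not CAT's code: the class and method names are hypothetical, a message tree is reduced to a comma-joined string, and the "server" is just a list that the reporter thread fills.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical end-to-end model of the three steps; none of these names come from CAT.
class ClientPipelineSketch {
    private final ThreadLocal<List<String>> messages = ThreadLocal.withInitial(ArrayList::new);
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Step 1: the business thread records a message in its own thread-local store.
    void record(String message) {
        messages.get().add(message);
    }

    // Step 2: when the business thread finishes, build one "tree" and enqueue it.
    void finish() {
        String tree = String.join(",", messages.get());
        messages.remove();
        queue.offer(tree);
    }

    // Step 3: a reporter thread takes trees off the queue; here it only collects them.
    List<String> drain(int expected) {
        List<String> reported = new ArrayList<>();
        Thread reporter = new Thread(() -> {
            try {
                for (int i = 0; i < expected; i++) {
                    String tree = queue.poll(1, TimeUnit.SECONDS);
                    if (tree == null) {
                        return; // nothing arrived in time
                    }
                    reported.add(tree); // a real client would encode and send the tree here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reporter.start();
        try {
            reporter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return reported;
    }
}
```

The business thread only touches its own thread-local list and the queue, so the cost it pays for monitoring is an in-memory append and one enqueue.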

API design

The definition of a monitoring API often depends on how well the field of monitoring and performance analysis is understood. The typical scenarios for monitoring and performance analysis are:

  • The execution time of a piece of code, such as the time taken to serve a URL or to execute a SQL statement.
  • The number of times a piece of code is executed, such as the number of exceptions thrown by a Java program, or the number of times a piece of logic runs.
  • Executing a piece of code periodically, such as periodically reporting core indicators like JVM memory and GC.
  • Key business monitoring indicators, such as the number of orders, transaction volume, and payment success rate.

Based on this domain model, CAT defines several core monitoring objects: Transaction, Event, Heartbeat, and Metric.

A code example of the monitoring API is as follows:

insert image description here


Serialization and communication

Serialization and communication are critical to the performance of the whole client, and of the server as well.

  • CAT uses a custom serialization protocol, which is much more efficient than general-purpose serialization protocols. This is essential when processing large volumes of data in real time.
  • CAT communication uses Netty for NIO data transmission. Netty is a very good NIO development framework and is not described in detail here.

Client instrumentation

Log instrumentation is one of the most important parts of monitoring, and the quality of the logs determines the quality and efficiency of monitoring. CAT's current instrumentation is problem-centered, and exceptions thrown by programs are typical problems.

My personal definition of a problem is: anything that does not meet expectations can be considered a problem, such as unfinished requests, response times that are too fast or too slow, request TPS that is too high or too low, or an uneven time distribution.

In the Internet environment, the most prominent problem scenarios can be understood as behavior that crosses boundaries, including but not limited to:

  • HTTP/REST, RPC/SOA, MQ, Job, Cache, DAL;
  • Search/query engines, business applications, outsourced systems, legacy systems;
  • Third-party gateways/banks, partners/suppliers;
  • Various business indicators, such as user logins, number of orders, payment status, and sales.

Core class analysis

CAT uses a message tree (MessageTree) to organize logs. The following is the class definition of the message tree:
insert image description here
The entity we operate on each time is a message tree. It has a domain field, which is a very important concept in CAT: a domain corresponds to a project. Each message tree has a unique MessageId, and different message trees (for example, when service A calls service B in a microservice system, A and B each generate a message tree) are linked through parentMessageId and rootMessageId. All entities under a message tree are Messages. There are five types of Message: Transaction, Event, Trace, Metric, and Heartbeat.

  • Transaction: Can be understood as a transaction. Transactions can be nested in each other, and a Transaction can also contain any other message type; the children are stored in the m_children list member variable. Only a Transaction can contain nested messages. It is generally used to record access that crosses system boundaries, such as remote calls and database calls, and it is also suitable for monitoring business logic with a long execution time.

  • Event: Represents something that happens at a point in time, such as a new user registration, a login, or a system exception. In theory it can record anything. Compared with a Transaction, it has no duration statistics and a lower overhead. It can also be used to record the relationship between two transactions: a branch transaction maintains its relationship with the main transaction's message by setting the parentMessageId of its message tree.

  • Trace: Used to record trace and debug information, such as logs printed by log4j, for quick debugging and problem location.

  • Metric: Used to record business indicators. An indicator may include the count of records, the average of the recorded values, and the sum of the recorded values.

  • Heartbeat: Mainly used to record the heartbeat information of the system, such as CPU%, MEM%, connection pool status, and system load.



Process analysis

Startup process:

To create a transaction, the message producer object MessageProducer is first obtained through the getProducer function. Before returning the MessageProducer object, the function initializes the client: it sets the CatHome directory (the default is /data/appdatas/cat), reads the configuration file client.xml, and uses the Plexus container to load the corresponding modules:

insert image description here
insert image description here
insert image description here


Message production

Once we have the message producer object MessageProducer, we can call newTransaction(type, name) to create a Transaction message.

Note that MessageProducer hides all of CAT's internal details from the business side, so the business side needs only one MessageProducer object to perform every operation on messages.
insert image description here

The specific steps to create a message are as follows:

  1. First, the message manager MessageManager checks whether a message context (Context) exists; if it does not, one is created in setup.

insert image description here


Context thread local variables

The message context Context is stored in a thread-local variable, and Context data is accessed through ThreadLocal.

This approach is commonly used to write logs under high concurrency, or to write all the logs of one transaction together. A transaction is generally executed by a single thread (an HTTP request, for example), so the transaction's logs are kept in a thread-local variable and written out together when the transaction completes.

Why are thread-local variables needed? Under low concurrency, each log is processed quickly and ordinary variables are sufficient, because it is rare for multiple threads to read and write the same variable at the same time.

Under high concurrency, however, multiple threads reading and writing the same variable at the same time leads to unpredictable results, which is called being thread-unsafe. For example, thread A needs to write a large log; when it is halfway through, thread B gets a CPU time slice and starts writing its own log, so the logs of A and B become interleaved. Why not use a synchronization lock? That is one solution, but a lock is a relatively heavyweight way to ensure thread safety. It guarantees that only one thread can read or write the variable at a time while the other threads queue up, which inevitably increases latency.

A thread-local variable provides a separate copy of the variable's value for each thread that uses it. It is a thread-binding mechanism in Java: the JVM binds a private storage space to each running thread, and each thread can change its own copy independently without conflicting with the copies of other threads. This isolates threads from the concurrent-access problems that often occur in multi-threaded environments. The cost is data redundancy: it is a thread-safety solution that trades space for time.
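The isolation described above can be seen in a small, self-contained example that uses only the standard `ThreadLocal` class. The class and method names are hypothetical and have nothing to do with CAT's own Context class. Two threads log concurrently, yet each one only ever sees its own lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-thread log buffer built on java.lang.ThreadLocal.
class ThreadLocalLogBuffer {
    // Each thread lazily gets its own private list, so no locking is needed.
    private static final ThreadLocal<List<String>> BUFFER =
            ThreadLocal.withInitial(ArrayList::new);

    static void log(String line) {
        BUFFER.get().add(line);
    }

    // Joins the calling thread's lines and clears that thread's buffer.
    static String flush() {
        String out = String.join("|", BUFFER.get());
        BUFFER.remove();
        return out;
    }

    // Runs two threads that log concurrently and returns what each one flushed.
    static String[] demo() {
        String[] results = new String[2];
        Thread a = new Thread(() -> { log("A1"); log("A2"); results[0] = flush(); });
        Thread b = new Thread(() -> { log("B1"); log("B2"); results[1] = flush(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Calling `remove()` after flushing matters in thread pools, where a worker thread is reused for later requests and would otherwise keep the previous request's data.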

  1. Create context:

insert image description here
The constructor of Context:
insert image description here

In the Context constructor, we can see that the message tree (MessageTree) and the Transaction stack are created. Since Context is a thread-local variable, each thread has its own message tree and transaction stack. The threads mentioned here are all business threads, and Context is an inner class of MessageManager.

One of MessageManager's roles can be seen as a proxy for the context: the core of MessageManager's start, add, and end methods is to call the start, add, and end methods of the current thread's context.


The opening of Transaction

insert image description here
MessageProducer then creates a Transaction object and hands it to MessageManager to start.


1. Add the Transaction to the context (focus on the ctx.start method)

insert image description here

  1. Add Transaction to DefaultMessageTree in Context

insert image description here

  • If m_stack is not empty and the transaction is not a ForkedTransaction:
    • Check the duration or length condition; if the tree needs to be sent to the server, send it (truncateAndFlush).
    • Add the current transaction as a child message of the transaction at the top of m_stack.
    • Increment m_length.
  • If m_stack is empty, set the current Transaction as the message of the MessageTree.
  • Finally, check whether the transaction is a forked transaction; if it is not, push it onto m_stack.
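The branching above can be modeled in a few lines. This is a minimal sketch with hypothetical class names, not CAT's source: the truncateAndFlush check is left out, and the message tree is reduced to a single root reference.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a Transaction: a named node with child messages.
class StartSketchTransaction {
    final String name;
    final boolean forked;
    final List<StartSketchTransaction> children = new ArrayList<>();

    StartSketchTransaction(String name, boolean forked) {
        this.name = name;
        this.forked = forked;
    }
}

// Hypothetical stand-in for the per-thread context (tree root plus transaction stack).
class StartSketchContext {
    StartSketchTransaction root;                          // plays the role of the tree's message
    final Deque<StartSketchTransaction> stack = new ArrayDeque<>();
    int length;                                           // number of messages nested so far

    void start(StartSketchTransaction t) {
        if (!stack.isEmpty()) {
            if (!t.forked) {
                stack.peek().children.add(t);             // nest under the transaction on top
                length++;
            }
        } else {
            root = t;                                     // the first transaction becomes the root
        }
        if (!t.forked) {
            stack.push(t);                                // only non-forked transactions are pushed
        }
    }
}
```

Starting "URL" and then "SQL" on the same context leaves "URL" as the root with "SQL" as its only child, and "SQL" on top of the stack.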


Combination of other types of messages

@RunWith(JUnit4.class)
public class AppSimulator extends CatTestCase {
    @Test
    public void simulateHierarchyTransaction() throws Exception {
        MessageProducer cat = Cat.getProducer();
        Transaction t = cat.newTransaction("URL", "WebPage");
        String id1 = cat.createMessageId();
        String id2 = cat.createMessageId();

        try {
            // do your business here
            t.addData("k1", "v1");
            t.addData("k2", "v2");
            t.addData("k3", "v3");
            Thread.sleep(5);

            cat.logMetric("payCount", "C", "1");
            cat.logMetric("totalfee", "S", "30.5");
            cat.logMetric("avgfee", "T", "25.6");
            cat.logMetric("order", "S,C", "3,25.6");

            Metric event = Cat.getProducer().newMetric("kingsoft", "praise");
            event.setStatus("C");
            event.addData("3");
            event.complete();

            Cat.getManager().setTraceMode(true);
            cat.logTrace("Trace1", "debug", SUCCESS, "user_debug_data");

            cat.logEvent("RuntimeException", "Name1", "ERROR", "data1");
            cat.logEvent("Error", "Name2", SUCCESS, "data2");

            cat.logEvent("RemoteCall", "Service1", SUCCESS, id1);
            t.setStatus(SUCCESS);
        } catch (Exception e) {
            t.setStatus(e);
        } finally {
            t.complete();
        }
    }
}

Event messages can be recorded through MessageProducer's logEvent method. The method first calls newEvent to create an Event object. If there is message data, it adds the data with addData, then calls setStatus to set the message status and complete to finish the record.

public class DefaultMessageProducer implements MessageProducer {
    @Override
    public void logEvent(String type, String name, String status, String nameValuePairs) {
        Event event = newEvent(type, name);

        if (nameValuePairs != null && nameValuePairs.length() > 0) {
            event.addData(nameValuePairs);
        }

        event.setStatus(status);
        event.complete();
    }

    @Override
    public Event newEvent(String type, String name) {
        if (!m_manager.hasContext()) {
            m_manager.setup();
        }

        if (m_manager.isMessageEnabled()) {
            DefaultEvent event = new DefaultEvent(type, name, m_manager);

            return event;
        } else {
            return NullMessage.EVENT;
        }
    }
}

What does event.complete do? It first sets the message's completed state to true, then calls MessageManager's add method, passing in a reference to itself. As described in the section on Context thread-local variables, MessageManager is a proxy for the context, so the core of MessageManager's add method is a call to the context's add method.

The context's add method first checks whether the m_stack stack is empty. If it is, the message is a standalone, non-transactional message, so it is placed directly in a MessageTree and sent to the server.

If m_stack is not empty, the event message belongs to a transaction. The transaction is taken from the top of m_stack, the event message is nested under it, and both are pushed to the server together when the transaction ends. This is what happens in the example above.

class Context {
    public void add(Message message) {
        if (m_stack.isEmpty()) {
            MessageTree tree = m_tree.copy();

            tree.setMessage(message);
            flush(tree);
        } else {
            Transaction parent = m_stack.peek();

            addTransactionChild(message, parent);
        }
    }
}

Instead of using logEvent, we can also create an Event message instance through newEvent and then decide ourselves when to add data, call setStatus, and complete the message.

The handling of Heartbeat, Metric, and Trace messages is basically the same as that of Event messages. Trace messages can only be used when MessageManager has TraceMode enabled, which is similar to a debug mode during development. Call Cat.getManager().setTraceMode(true) to turn on trace mode.


Close Transaction:

insert image description here
insert image description here
The end method of Context pops a transaction from the top of the stack. If the popped transaction is not the transaction passed to end, it is not the one we need to end but a nested child transaction, so we keep popping the next top element, that is, the parent transaction, until the transaction we need to end is popped. During this process, validate is called to verify each transaction.

Then we check whether the stack is empty. If it is, the transaction passed to end is the root transaction, and m_manager.flush is called to report the message tree to the server.
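The pop-until-match behavior can be sketched as follows. This is a simplified model with hypothetical names: transactions are represented by plain strings, validation is omitted, and "flushing" only records the name of the root transaction.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of ending a transaction; names and types are illustrative only.
class EndSketchContext {
    final Deque<String> stack = new ArrayDeque<>();   // open transactions, innermost on top
    final List<String> flushed = new ArrayList<>();   // what would be reported to the server

    void start(String transaction) {
        stack.push(transaction);
    }

    // Returns true when the root transaction ended and the tree was flushed.
    boolean end(String transaction) {
        if (stack.isEmpty()) {
            return false;
        }
        String current = stack.pop();
        // Skip nested children that were left open until the requested transaction is popped.
        while (!current.equals(transaction) && !stack.isEmpty()) {
            current = stack.pop();
        }
        if (stack.isEmpty()) {
            flushed.add(transaction);                 // the stack is empty: this was the root
            return true;
        }
        return false;
    }
}
```

With "URL", "SQL", and "Cache" started in that order, ending "SQL" also removes the still-open "Cache" and returns false; ending "URL" afterwards empties the stack and triggers the flush.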
insert image description here

Note that once a message enters the context, it is stored on a stack:
insert image description here

Context is stored in a ThreadLocal, so each business thread has its own Context. As noted earlier, Context is an inner class of MessageManager.

insert image description here

Transactions hold references to one another, so in the end method only the root Transaction (wrapped in a MessageTree) needs to be flushed through MessageManager. All other Transactions can be reached through these references when the message is assembled. See the code:

insert image description here


Send data

MessageManager reports the message tree to the server through flush. Let's analyze the flush method through the following source code. The function first checks whether a MessageID has been allocated and allocates one if not, then calls the send function of TcpSocketSender to send the message.

The send function does not send the message immediately; it only inserts it into an in-memory queue. In the initialize() method of TcpSocketSender there is a line of code, Threads.forGroup("cat").start(this), which makes the client start a reporting thread during initialization. The reporting thread keeps reading the in-memory queue, takes the message trees to be sent, and calls sendInternal(MessageTree tree) to send each message tree to the server.

In this way, the client makes message handling multi-threaded, asynchronous, and queue-based, so that a failure in the CAT system does not affect the main business thread while it records logs.
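One way to keep the business thread safe is to make the enqueue step non-blocking and to drop messages when the queue is full, which fits the earlier trade-off that message loss is allowed. The sketch below illustrates that idea with a bounded `ArrayBlockingQueue`. It is an assumption for illustration, not a copy of TcpSocketSender, and the class name and drop counter are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sender: the business thread never waits on the queue.
class NonBlockingSenderSketch {
    private final BlockingQueue<String> queue;
    private final AtomicInteger dropped = new AtomicInteger();

    NonBlockingSenderSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called on the business thread: offer() returns immediately, even when the queue is full.
    void send(String messageTree) {
        if (!queue.offer(messageTree)) {
            dropped.incrementAndGet();   // losing a message is preferred over blocking the caller
        }
    }

    // Called by the reporting thread: returns the next tree, or null if the queue is empty.
    String next() {
        return queue.poll();
    }

    int droppedCount() {
        return dropped.get();
    }
}
```

With a capacity of 2, a third `send` before any `next` call is counted as dropped rather than making the caller wait.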
insert image description here


1. First, get the sender object and call its send method:

insert image description here
2. Sending follows the classic producer-consumer model. The producer only puts data into the queue, and the consumer watches the queue, takes the data, and sends it:
insert image description here

3. The consumer thread pulls the message:
insert image description here


Message serialization

The reporting thread sends messages to the server through sendInternal(MessageTree tree). In sendInternal, TcpSocketSender calls m_codec.encode(tree, buf) to serialize the message tree before sending it. Serialization encodes an object into a sequence of bytes so that the object can be sent to the server over TCP/IP; the server then decodes the bytes back into an object through deserialization.

In Java, any class that implements the java.io.Serializable interface can be serialized. However, bytes produced by this general-purpose mechanism carry a lot of redundant information, which is needed to encode and decode arbitrary objects correctly. In CAT, only the MessageTree object needs to be transmitted, so a custom serialization scheme can leave out a lot of unnecessary bytes and keep network transmission efficient.

insert image description here

public class PlainTextMessageCodec implements MessageCodec, LogEnabled {
    @Override
    public void encode(MessageTree tree, ByteBuf buf) throws UnsupportedEncodingException {
        int count = 0;
        int index = buf.writerIndex();

        buf.writeInt(0); // place-holder

        count += encodeHeader(tree, buf);

        if (tree.getMessage() != null) {
            count += encodeMessage(tree.getMessage(), buf);
        }

        buf.setInt(index, count);
    }
}

The serialized byte stream consists of 3 parts:

1. The first 4 bytes hold the length of the encoded content that follows (header plus body). buf.writeInt(0) first reserves the space; after encoding, buf.setInt(index, count) writes the byte count into the first 4 bytes of buf.

2. The header of the message tree. The tree's version, domain, hostName, ipAddress, threadGroupName, threadId, threadName, messageId, parentMessageId, rootMessageId, and sessionToken are written into the header in turn. Fields are separated by "\t" and the header ends with "\n". Empty values are written as null.

3. The message body. Each message begins with a character indicating the message type:

a. "A" indicates a transaction with no nested messages;
b. a transaction with nested messages starts with "t", then its child messages are traversed and encoded recursively, and it ends with "T";
c. "E"/"L"/"M"/"H" indicate Event/Trace/Metric/Heartbeat messages respectively.
  • Then time, type, and name are recorded in turn
  • Then status, duration + "us", and data are written, depending on the message type
  • Fields are still separated by "\t", each message ends with "\n", and empty values are written as null
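
The "reserve 4 bytes, encode, then back-fill the length" pattern from part 1 can be tried out with the JDK alone. The sketch below is only an illustration under stated assumptions: it uses java.nio.ByteBuffer instead of Netty's ByteBuf, the class LengthPrefixedEncoder and its methods are hypothetical, and the 64K buffer size is arbitrary.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: java.nio.ByteBuffer stands in for Netty's ByteBuf,
// and the class and method names are hypothetical.
class LengthPrefixedEncoder {

    // Encodes a tab-separated header line followed by an already formatted body.
    static ByteBuffer encode(String[] headerFields, String body) {
        ByteBuffer buf = ByteBuffer.allocate(64 * 1024); // arbitrary size for the sketch
        int index = buf.position();

        buf.putInt(0); // place-holder for the length

        int count = 0;
        count += writeHeader(buf, headerFields);
        count += writeText(buf, body);

        buf.putInt(index, count); // absolute write: back-fill the real length
        buf.flip();               // make the buffer readable from the start
        return buf;
    }

    // Fields are separated by "\t", the line ends with "\n", null is written as "null".
    static int writeHeader(ByteBuffer buf, String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(fields[i] == null ? "null" : fields[i]);
        }
        sb.append('\n');
        return writeText(buf, sb.toString());
    }

    // Writes the text as UTF-8 and returns the number of bytes written.
    static int writeText(ByteBuffer buf, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        buf.put(bytes);
        return bytes.length;
    }
}
```

Writing the placeholder first means the encoder does not need to know the total size in advance; the receiver reads the 4-byte int and then knows exactly how many bytes belong to this message.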

For example, the combined-message case from the earlier section looks like this after the MessageTree is encoded:

  口PT1	Cat	Win7-caoh.kingsoft.cn	192.168.37.41	main	1	main	Cat-c0a82529-423686-40028	null	null	null
t2018-05-02 22:59:05.347	URL	WebPage	
H2018-05-02 22:59:05.353	Heartbeat1	hearbeat	0	cpu=90&mem=70	
M2018-05-02 22:59:05.353		metric1	0	total_fee	
L2018-05-02 22:59:05.354	Trace1	debug	0	user_debug_data	
E2018-05-02 22:59:05.354	Event1	Name1	0	data1	
E2018-05-02 22:59:05.354	Event2	Name2	0	data2	
E2018-05-02 22:59:05.354	RemoteCall	Service1	0	Cat-c0a82529-423686-40026	
T2018-05-02 22:59:07.507	URL	WebPage	0	2160695us	k1=v1&k2=v2&k3=v3

The text above is the result of converting the encoded bytes into a string. The garbled character at the start is what the 4-byte int length looks like when rendered as text. Read back as an int, those bytes give 541, which is the length of the encoded bytes.

Finally, TcpSocketSender sends the encoded bytes to the server through ChannelManager, which uses a Netty client.
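
To make the line format of the encoded body more concrete, here is a small sketch that reads one line back into its parts. It is not CAT's decoder: the class EncodedLine and its methods are hypothetical, and it only covers the leading type character and the tab-separated fields described above.

```java
import java.util.Arrays;

// Hypothetical reader for one line of the plain-text body; not CAT's decoder.
class EncodedLine {
    final char kind;        // 'A', 't', 'T', 'E', 'L', 'M' or 'H'
    final String timestamp; // text right after the type character
    final String[] fields;  // type, name, then any optional fields

    EncodedLine(char kind, String timestamp, String[] fields) {
        this.kind = kind;
        this.timestamp = timestamp;
        this.fields = fields;
    }

    // Expects one line without its trailing "\n".
    static EncodedLine parse(String line) {
        if (line.isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        // Limit -1 keeps trailing empty fields.
        String[] parts = line.split("\t", -1);
        char kind = parts[0].charAt(0);
        String timestamp = parts[0].substring(1);
        String[] fields = Arrays.copyOfRange(parts, 1, parts.length);
        return new EncodedLine(kind, timestamp, fields);
    }

    // Maps the leading character to the message kind it stands for.
    String kindName() {
        switch (kind) {
            case 'A': return "Transaction (no children)";
            case 't': return "Transaction start";
            case 'T': return "Transaction end";
            case 'E': return "Event";
            case 'L': return "Trace";
            case 'M': return "Metric";
            case 'H': return "Heartbeat";
            default:  return "Unknown";
        }
    }
}
```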


MessageID

Each message in CAT has a unique ID. The ID is generated on the client side and is later used to look up the content of the message. This matters for the typical problem of chaining RPC messages: when A calls B, a Message-ID is generated on A's side and passed to B as part of the call context, and during B's execution, B uses the Message-ID from the context as the Message-ID of its current monitoring message.

A CAT Message-ID looks like ShopWeb-0a010680-375030-2 and is divided into four segments:

  • The first segment is the application name, ShopWeb.
  • The second segment is the IP of the current machine in hexadecimal: 0a010680 means 10.1.6.128.
  • The third segment, 375030, is the current system time expressed in whole hours (the timestamp divided by one hour).
  • The fourth segment, 2, is the sequentially incremented number of the current client within the current hour.
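
The four segments can be taken apart with a few lines of Java. The sketch below is an illustration only: the class MessageIdParts is hypothetical, and it assumes the hour segment is the epoch time in milliseconds divided by 3,600,000 and that the application name may itself contain "-", which is why it splits from the right.

```java
// Hypothetical helper for the four-segment Message-ID; not CAT's own class.
class MessageIdParts {
    final String domain; // application name
    final String ipHex;  // 8 hex digits
    final long hour;     // whole hours (assumed: hours since the epoch)
    final int index;     // sequence number within the hour

    MessageIdParts(String domain, String ipHex, long hour, int index) {
        this.domain = domain;
        this.ipHex = ipHex;
        this.hour = hour;
        this.index = index;
    }

    // Splits from the right so that a "-" inside the application name is kept.
    static MessageIdParts parse(String id) {
        int p3 = id.lastIndexOf('-');
        int p2 = id.lastIndexOf('-', p3 - 1);
        int p1 = id.lastIndexOf('-', p2 - 1);
        if (p1 <= 0) {
            throw new IllegalArgumentException("not a four-segment id: " + id);
        }
        return new MessageIdParts(
                id.substring(0, p1),
                id.substring(p1 + 1, p2),
                Long.parseLong(id.substring(p2 + 1, p3)),
                Integer.parseInt(id.substring(p3 + 1)));
    }

    // Converts the hex segment back to dotted-decimal form.
    String ip() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(Integer.parseInt(ipHex.substring(2 * i, 2 * i + 2), 16));
        }
        return sb.toString();
    }

    // Converts a dotted-decimal IPv4 address to the 8-digit hex form.
    static String ipToHex(String dotted) {
        StringBuilder sb = new StringBuilder();
        for (String part : dotted.split("\\.")) {
            sb.append(String.format("%02x", Integer.parseInt(part)));
        }
        return sb.toString();
    }

    // Whole hours contained in an epoch timestamp given in milliseconds.
    static long hourOf(long epochMillis) {
        return epochMillis / 3_600_000L;
    }
}
```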

Note that the fourth segment of the Message-ID generated by a given client machine, that is, the sequentially incremented number, must not repeat within the current hour. On the server side, CAT creates one index file per client IP per hour for raw message storage, and the position of each message's index record in that file is determined by the sequence number. If a sequence number is generated twice, the index record for that hour is overwritten, and the original message data can no longer be found through the index.


Server principle

The stand-alone consumer architecture is designed as follows:
insert image description here
As shown in the figure above, the CAT server's real-time processing is almost fully asynchronous.

  • Message reception is based on Netty's NIO implementation.
  • When a message arrives at the server, it is put into an in-memory queue, and a dedicated thread consumes that queue to distribute messages.
  • Each kind of message processing has its own batch of threads that concurrently consume data from their own queues, which isolates message processing.
  • Messages are first stored on the local disk and then uploaded asynchronously to HDFS, which avoids a hard dependency on HDFS.

When a report processor cannot keep up, for example when Transaction report processing is slow, configuration allows multiple Transaction processing threads to be started so that messages are consumed concurrently.
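
The queue-per-processor isolation described above might be sketched as follows. Everything here is an assumption made for illustration: the class AnalyzerDispatcher, the idea of choosing a worker queue by hashing the application name, and the bounded queues are not taken from CAT's source, and the worker threads that would drain each queue are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical dispatcher: one group of bounded queues per report processor.
class AnalyzerDispatcher {
    private final Map<String, List<BlockingQueue<String>>> queues = new LinkedHashMap<>();

    // Registers a processor with the given number of worker queues.
    void register(String analyzer, int workers, int capacity) {
        List<BlockingQueue<String>> list = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            list.add(new ArrayBlockingQueue<String>(capacity));
        }
        queues.put(analyzer, list);
    }

    // Offers the message to one queue of every processor.
    // A slow processor only fills its own queue; the others are unaffected.
    // Returns how many processors accepted the message.
    int dispatch(String domain, String message) {
        int accepted = 0;
        for (List<BlockingQueue<String>> workers : queues.values()) {
            // Assumption: messages of the same application go to the same worker.
            int slot = Math.floorMod(domain.hashCode(), workers.size());
            if (workers.get(slot).offer(message)) {
                accepted++;
            }
        }
        return accepted;
    }

    // Number of messages currently waiting in one worker queue.
    int queued(String analyzer, int worker) {
        return queues.get(analyzer).get(worker).size();
    }
}
```

Registering a processor with more than one worker queue corresponds to enabling several processing threads for it; each thread would then drain its own queue.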

insert image description here


storage data design

Message storage is the most challenging part of CAT. The key problem is that messages are both numerous and large. Currently, Meituan processes about 100 billion messages per day, about 100TB in size. A single physical machine needs to handle about 100MB of traffic per second at peak. The CAT server performs real-time calculations on this traffic and also needs to compress the data and write it to disk.

The overall storage structure is as follows:

insert image description here
CAT writes two kinds of files: an index file and a data file.

  • The data file is GZIP-compressed in segments, and each segment is smaller than 64K, so 16 bits are enough to address any position within a segment.
  • Each Message-ID needs 48 bits of index, and the position of the index entry is determined by the fourth segment of the Message-ID. For example, for the Message-ID ShopWeb-0a010680-375030-2, the index entry is located at 2*48 bits.
  • The first 32 bits of the 48 bits store the offset of the block within the data file, and the last 16 bits store the offset within the block after the block is decompressed.
  • When CAT reads a message, it first determines the unique index file from the first three segments of the Message-ID, then determines the index position from the fourth segment, and reads the 48-bit index entry to locate the block in the data file. It then decompresses the block with GZIP and reads the actual message content at the offset within the block.
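
The 48-bit index entry and its position in the index file come down to simple arithmetic, sketched below. The class IndexEntry is hypothetical, and the big-endian byte order (the ByteBuffer default) is an assumption; only the 32-bit plus 16-bit split and the "sequence number times 48 bits" position come from the description above.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for the 48-bit index entry; byte order is assumed big-endian.
class IndexEntry {
    static final int ENTRY_BYTES = 6; // 48 bits

    // Byte position of the entry for a given sequence number (fourth segment).
    static long entryOffset(int sequence) {
        return (long) sequence * ENTRY_BYTES;
    }

    // Packs a 32-bit block address and a 16-bit offset within the block into 6 bytes.
    static byte[] pack(long blockAddress, int offsetInBlock) {
        if (blockAddress < 0 || blockAddress > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("block address needs more than 32 bits");
        }
        if (offsetInBlock < 0 || offsetInBlock > 0xFFFF) {
            throw new IllegalArgumentException("offset in block needs more than 16 bits");
        }
        return ByteBuffer.allocate(ENTRY_BYTES)
                .putInt((int) blockAddress)
                .putShort((short) offsetInBlock)
                .array();
    }

    // First 32 bits, read as an unsigned value.
    static long blockAddress(byte[] entry) {
        return ByteBuffer.wrap(entry).getInt(0) & 0xFFFFFFFFL;
    }

    // Last 16 bits, read as an unsigned value.
    static int offsetInBlock(byte[] entry) {
        return ByteBuffer.wrap(entry).getShort(4) & 0xFFFF;
    }
}
```

For the Message-ID ShopWeb-0a010680-375030-2, entryOffset(2) gives byte 12 of the index file, which is the same position as 2*48 bits. Because the position is computed directly from the sequence number, a lookup needs no search, which is also why a repeated sequence number overwrites the earlier entry.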

summary

When studying how the Cat client works, you can compare it with the four-step general solution given at the beginning to see how theory and practice differ and connect.

For more on the Cat source code, see the Cat source code series, whose analysis is very thorough. The client source code analysis in this article draws heavily on the client articles in that series. The server-side principles and other parts are only briefly introduced here; for more details, refer to the source code series.

Origin blog.csdn.net/m0_53157173/article/details/130168290