Network traffic analysis: from traffic collection to traffic restoration

      Network traffic runs through every link in the business chain: personal PCs and mobile phones, IDC data centers, and web and app services all exchange data as network traffic. Following the simple principle that "wherever there is an attack, there is traffic", defenses at every layer, whether anti-DDoS at the network layer, intrusion prevention at the host layer, or vulnerability protection at the application layer, can build on network traffic analysis. Security products based on NTA (network traffic analysis) have therefore always held a pivotal position in the security field.

      Drawing on recent experience with traffic analysis for IDS intrusion detection and on the common application forms of traffic analysis, this article summarizes the common technical means of traffic analysis, including the methods and tools commonly used for traffic collection and the basic principles of traffic restoration.

Why do you need traffic analysis?

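      To understand how a network is really running and to spot operational problems in time, you need a comprehensive view of its traffic. Traffic analysis plays a different role at each level. Carriers analyze user traffic to calculate network usage charges and to understand how their users access other carriers' networks, which provides data to support decisions about interconnection links at the network egress. At the business application level, website operators use traffic analysis to learn about their visitors (IP addresses, browser information, and so on), count the users online, see which pages users visit, detect anomalies that tell administrators whether abuse or attacks are occurring, understand how the site is used, and prepare in advance for load problems on the web servers. In security monitoring, traffic analysis is used to detect abnormal network communication and to guard against common network intrusions, DDoS attacks, and the spread of botnet, trojan, and worm infections.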

How do you perform traffic analysis?

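      A complex, changeable, and very large network environment calls for a system that can adapt to different environments and analyze data efficiently. The first step is a basic understanding of the different collection technologies.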

Common technical means of network traffic analysis:

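  1. Traffic analysis technology based on hardware probes
    A probe is a hardware device dedicated to capturing traffic data from a network link. By implementation, probes can be divided into software architectures and hardware architectures. A probe is attached to a switch's traffic mirroring port or connected inline on the link being observed. It processes every data packet on the link and extracts the protocol fields needed for traffic monitoring, or even the full packet contents. Its main feature is that it provides rich, detailed information from the physical layer to the application layer, which makes it the approach most commonly used by current NTA-based products such as IDS and NDR.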
  2. SNMP-based traffic analysis technology
    SNMP (Simple Network Management Protocol) is usually used to collect basic traffic statistics, such as byte and packet counts, and enables batch management of network devices by "using the network to manage the network", which improves device management efficiency. This method can only analyze the overall traffic of a device port: it can obtain historical or real-time traffic statistics for the port, but it cannot analyze packet types or flow direction in depth. Its advantages are simple implementation, unified standards, and open interfaces.
  3. NetFlow-based traffic analysis technology
    NetFlow is a technology developed by Cisco. It is a switching technology as well as a traffic analysis technology, and it is also one of the mainstream billing technologies in the industry. NetFlow can sample packets at a specific network location and keep detailed statistics on the time, source and destination, protocol type, packet count, byte count, and direction of IP traffic. It can tell you who is consuming bandwidth and what they are using it for. It is more detailed than SNMP and is mainly used for backbone network traffic sampling, DDoS attack detection, and other high-volume traffic analysis.
  4. Traffic analysis technology based on real-time packet capture
    This approach uses software packet capture tools such as Wireshark and tcpdump to capture and analyze packets in real time. It is also the most common way for individual users to analyze network protocols. It provides detailed data from the physical layer to the application layer. However, it focuses on protocol analysis rather than on user traffic statistics and trend analysis. It can only analyze the packets that cross an interface over a short period, so it cannot meet the needs of high-volume, long-term packet capture and trend analysis.
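
      To make the difference between per-port counters and flow records concrete, here is a minimal Python sketch of flow-style aggregation: packets are grouped by their 5-tuple and per-flow packet and byte counters are kept, which is roughly the kind of statistic a flow-based collector reports. The Packet type, the function names, and the sample addresses are assumptions made for illustration; this is not the NetFlow export format or any product's API.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical, already-decoded packet summary (not a NetFlow record)."""
    ts: float      # capture time in seconds
    src: str       # source IP address
    dst: str       # destination IP address
    sport: int     # source port
    dport: int     # destination port
    proto: str     # "TCP" or "UDP"
    length: int    # packet size in bytes


def aggregate_flows(packets: Iterable[Packet]) -> dict:
    """Group packets by 5-tuple and keep per-flow counters and timestamps."""
    flows = {}
    for p in packets:
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        flow = flows.get(key)
        if flow is None:
            flows[key] = {"packets": 1, "bytes": p.length,
                          "first": p.ts, "last": p.ts}
        else:
            flow["packets"] += 1
            flow["bytes"] += p.length
            flow["first"] = min(flow["first"], p.ts)
            flow["last"] = max(flow["last"], p.ts)
    return flows


def top_talkers(flows: dict, n: int = 3) -> list:
    """Sum bytes per source address and return the n heaviest senders."""
    per_src = defaultdict(int)
    for (src, _dst, _sport, _dport, _proto), flow in flows.items():
        per_src[src] += flow["bytes"]
    return sorted(per_src.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up sample data: two packets of one HTTP flow and one DNS query.
sample = [
    Packet(0.0, "10.0.0.5", "10.0.0.9", 51000, 80, "TCP", 600),
    Packet(0.2, "10.0.0.5", "10.0.0.9", 51000, 80, "TCP", 1400),
    Packet(0.3, "10.0.0.7", "10.0.0.53", 40000, 53, "UDP", 80),
]
flows = aggregate_flows(sample)
print(top_talkers(flows, 1))
```

      An SNMP port counter would only show the total bytes on the interface. Keeping the 5-tuple as the key is what makes it possible to say which host used the bandwidth and for which service.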

How to restore traffic?

      Raw network traffic is binary data that cannot be read or used directly, so tools and techniques are needed to turn it into information that is easier to read. In this process the collected traffic must be decoded and analyzed, which includes identifying the protocols and services in the traffic and extracting the original files it carries.
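
      As a small illustration of what decoding the binary form means, the following Python sketch unpacks the fixed 20-byte part of an IPv4 header and the TCP port pair with the standard struct module. The function names and the sample bytes are made up for this example; the sample is truncated after the ports and its checksum is left at zero and not verified.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (network byte order)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,               # high nibble of the first byte
        "header_len": (ver_ihl & 0x0F) * 4,    # IHL counts 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                     # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def parse_tcp_ports(raw: bytes, ip_header_len: int) -> tuple:
    """Read the source and destination port that start the TCP header."""
    return struct.unpack("!HH", raw[ip_header_len:ip_header_len + 4])


# Hand-built sample: an IPv4 header followed by the first 4 bytes of TCP.
sample = bytes.fromhex(
    "45000028" "00004000" "40060000" "c0a80105" "c0a80109"  # IPv4 header
    "c738" "0050"                                            # TCP ports
)
header = parse_ipv4_header(sample)
print(header["src"], header["dst"], parse_tcp_ports(sample, header["header_len"]))
```

      Real restoration tools repeat this step layer by layer (link, network, transport, application), using each decoded header to find where the next one starts.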

      Analyzing and restoring traffic packets means extracting, parsing, and reassembling fields at various positions in the binary bit stream. Several techniques are used, including port matching, traffic feature detection, automatic connection association, and behavior feature analysis.

1) Port matching: As network protocols developed, a series of standard protocol specifications emerged that assign ports to protocols, such as port 80 for HTTP and port 53 for DNS. Many other widely used applications have no formal standard but have settled on de facto standard ports. Port matching identifies an application from these standard or de facto mappings between applications and TCP/UDP ports. Its advantage is high detection efficiency; its weakness is that ports are easy to forge. Port detection therefore needs to be supplemented with feature detection to analyze this traffic further.
2) Traffic feature detection: Beyond ports, the protocols used by different applications show regularities of their own, and these are the so-called traffic features. Identification by traffic features falls roughly into two types. The first is identification of standard protocols such as HTTP, DNS, and TCP/IP: when decoding the link, network, and transport layers of a packet, you only need to follow the standard format specifications. The second is identification of private protocols. If the private protocol follows a standard format on top of TCP or UDP, you again only need to decode it according to that format; otherwise you may need to reverse engineer the protocol mechanism so that the communication can be identified, directly or after decryption, from characteristic fields in the message stream.
3) Automatic connection association: As Internet applications have grown, more and more data is carried over the network, and the model of completing every task on a single connection has become a bottleneck. Many protocols therefore negotiate ports dynamically for data transfer. This model first appeared in the standard FTP protocol and was later adopted for voice, video, and file transfer. To identify such traffic, the data connection has to be associated and restored automatically from the messages on the control connection. This technique is called automatic connection association.
4) Behavior feature analysis: For traffic that is hard to restore, behavior features can be used instead. This method does not try to parse the data on the link; it distinguishes application types by statistical properties of the link, such as specific feature fields, the number of connections, the connection pattern of a single IP, the ratio of upstream to downstream traffic, and packet frequency. For example, with encrypted traffic such as RDP or SSH logins, the content usually cannot be fully restored, but specific fields in the messages can identify them as host login behavior. If such messages appear abnormally often, a login brute-force attack may be under way.
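
The four techniques above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the port table, the two payload signatures, the window length, and the threshold are hand-picked example values, and names such as identify, pasv_data_endpoint, and LoginRateMonitor are hypothetical rather than taken from any IDS product.

```python
import re
from collections import defaultdict, deque

# 1) Port matching: a tiny, hand-picked table of well-known ports.
WELL_KNOWN_PORTS = {21: "FTP", 22: "SSH", 53: "DNS", 80: "HTTP", 3389: "RDP"}

# 2) Traffic feature detection: payload prefixes defined by the public
#    protocol formats (HTTP request line, SSH identification string).
PAYLOAD_SIGNATURES = [
    (re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) \S+ HTTP/1\.[01]\r\n"), "HTTP"),
    (re.compile(rb"^SSH-\d\.\d+-"), "SSH"),
]


def identify(dport: int, payload: bytes) -> str:
    """Prefer a payload signature; fall back to the port table."""
    for pattern, name in PAYLOAD_SIGNATURES:
        if pattern.match(payload):
            return name
    return WELL_KNOWN_PORTS.get(dport, "unknown")


# 3) Automatic connection association: an FTP "227" reply on the control
#    channel announces the address and port of the data channel.
PASV_REPLY = re.compile(rb"^227 .*?\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def pasv_data_endpoint(control_line: bytes):
    """Return (ip, port) announced by a 227 reply, or None for other lines."""
    m = PASV_REPLY.match(control_line)
    if m is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(x) for x in m.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


# 4) Behavior feature analysis: count connection attempts to login ports per
#    source in a sliding window. Window and threshold are illustrative only.
class LoginRateMonitor:
    def __init__(self, window_s: float = 60.0, threshold: int = 5):
        self.window_s = window_s
        self.threshold = threshold
        self._seen = defaultdict(deque)

    def observe(self, ts: float, src: str, dport: int) -> bool:
        """Record one attempt; return True when src exceeds the threshold."""
        if dport not in (22, 3389):
            return False
        attempts = self._seen[src]
        attempts.append(ts)
        while attempts and ts - attempts[0] > self.window_s:
            attempts.popleft()
        return len(attempts) > self.threshold


print(identify(8080, b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n"))
print(pasv_data_endpoint(b"227 Entering Passive Mode (192,168,1,9,195,80)\r\n"))
```

Checking the payload before the port reflects the point made in 1): a port number is cheap to test but easy to forge, so it serves as the fallback here rather than the final answer. The monitor in 4) never looks at payload content, which is why the same idea still applies to encrypted SSH or RDP sessions.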


Origin blog.csdn.net/liushulin183/article/details/112669200