Understand the architectural essence of the Internet in one article

[Introduction] When people talk about the Internet, all sorts of terms and services come to mind, but how is the Internet actually designed and built? What is the architectural essence of the Internet as a network? Brother Shitou and I once translated a huge book, "Computer Network Problems and Solutions", together, but not many friends actually read it carefully and got much out of it. Recently, Brother Stone recommended another article, https://cacm.acm.org/magazines/2023/2/268956-extracting-the-essential-simplicity-of-the-internet/fulltext. Its content is concise and to the point, so rather than keep it to myself, I have compiled it here and share it with everyone.

Today, the Internet provides ubiquitous connectivity that people rely on. Many people also know that its basic design was invented in the 1970s to allow computers to exchange data. Since its adoption in 1983, it has remained essentially unchanged while elegantly adapting to fundamental changes in applications, technologies, scale, scope, and capacity, becoming an integral part of our lives.

As such an excellent IT infrastructure, what is its system architecture like?


1. Essential needs: Internet service model

All software on a host must rely on the infrastructure's service model when communicating. The design of the service model is therefore a compromise between what is best for the host software and what the infrastructure can support. When designing a data transfer service model, its transfer unit must be chosen and its performance guarantees defined. Given the bursty nature of computer communications, a small transmission unit is necessary to achieve efficient resource utilization. The Internet uses packets of bits, and current packet sizes typically do not exceed 1.5 KB.

Internet applications have a wide variety of network performance requirements, ranging from low latency (interactive video or closed-loop control) to high bandwidth (transmitting large data sets) and reliable transmission (file transfer). The engineering instinct is to meet these requirements by having the network infrastructure guarantee specific bounds on latency, bandwidth, and reliability. However, the Internet must interconnect a variety of local packet networks, such as wireless and wireline, shared-access and point-to-point, many of which cannot guarantee performance. The lowest common denominator these networks can support is "best effort" packet delivery, in which the network attempts to deliver all packets but cannot guarantee whether, when, or at what rate packets are delivered.

Therefore, the fundamental question about the service model is this: should the network support every application requirement, or should it accommodate every network technology and provide only a modest, lowest-common-denominator "best effort" service model? The design of the Internet chose the simple approach: a best-effort service model.

In hindsight, this was the better choice. First, the looser service model imposed minimal requirements on the infrastructure, allowing the Internet to grow extremely rapidly; in short, the Internet could be deployed with great speed and scope precisely because of its low performance requirements. Second, the assumption that applications need strict network guarantees was a holdover from the telephone network, where simple terminals connect to a smart network that takes full responsibility for meeting telephony's strict performance requirements, namely fixed-rate, reliable transmission. As computers became ever more powerful terminals, Internet applications could be designed to adapt to different levels of performance, relaxing their demands on the network. In most cases, network operators can also ensure reasonable performance simply by provisioning sufficient bandwidth, that is, by deploying more and faster links so that the network is rarely congested.


2. System abstraction: the architecture of the Internet

Architecture is about the organization of functionality, not its implementation. Modularity is a guiding principle of architecture that requires breaking down the goals of a system into smaller tasks with clean interfaces. The goal of the Internet is to allow applications on two different computers to communicate, so it can be broken down into two components: 

(i) The role of network infrastructure (best effort transmission of data packets between hosts)

(ii) The role of network support software on the host (making it easier for applications to use this best-effort transport).

2.1 Network infrastructure

The host-to-host delivery task implemented by the infrastructure can be decomposed into three different tasks, which are organized hierarchically, with the higher layers having a wider spatial scope and the lower layers handling more local tasks. The process of sending a packet from a sending host to a receiving host is therefore the aggregation of these local delivery tasks. The most local and lowest-level task in providing best-effort packet delivery is transmitting bits over a link or broadcast medium, which requires digital-to-analog conversion and error correction, neither of which is unique to the Internet. This local delivery task is handled by the so-called physical layer (L1).

Given that L1 is capable of sending a stream of bits over the link, the next task is to enable communication in a local packet network (such as Ethernet or wireless). This involves two steps: 

(i) Assemble a group of bits into a packet, prefixed by a packet header describing where those bits should be sent;

(ii) Deliver these packets to the appropriate destination within the local network. Because this second step is limited to local (rather than global) delivery, it can use non-scalable techniques such as broadcasting (in a broadcast medium such as wireless, the medium itself ensures that packets reach all hosts) and flooding (the network ensures that flooded packets reach all hosts). This task is handled by what is traditionally called the link layer, or L2. In non-broadcast networks it is carried out by switches, which forward packets to their destinations within the local network.

The final task is to deliver packets from the sender's network to the destination network, leveraging the capabilities of L2 to deliver packets within those networks to particular hosts. The interconnection of such networks is handled by the internetworking layer (L3) and is implemented by routers that connect to two or more L2 networks. Packets are forwarded through a series of routers (router-to-router delivery is supported by L2 in the network to which both routers are connected) until they reach the destination network. This internetworking layer also defines the Internet's service model, since it is L3 that delivers packets to the higher-layer software on hosts.

So the ubiquitous connectivity offered by the Internet began with a conceptually simple yet bold design.

2.2 Network support in host software

The host-to-host service model of the network infrastructure does not specify which application on a host a packet should be delivered to, and it is difficult for applications to achieve good performance over it without additional help. To address these problems, the host operating system provides a transport layer (L4) along with other network support. Based on metadata in the packet header called a "port", L4 delivers packets to individual applications on the host. To make best-effort service easier for applications to use, commonly used transport protocols provide three functions.

  1. A byte-stream API: applications can use a simple, file-like interface to write and read data without having to explicitly send and receive individual packets (see the sketch after this list).

  2. Congestion control: limiting a host's packet transmission rate to prevent network overload. This involves a control loop in which the transport protocol, on detecting network congestion (for example, through lost packets or increasing latency), decreases its sending rate. There are many congestion control algorithms, which differ in how they detect and respond to congestion.

  3. The most basic function is reliable packet delivery, so the application does not need to deal with packet loss. Such loss is a natural part of the best-effort service model but is unacceptable for many applications. Although reliability can be implemented in the application itself or in supporting libraries, for clarity we treat it as being implemented in L4.
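As a concrete illustration of the byte-stream API mentioned in item 1, here is a minimal sketch in Python using the standard socket library. The host example.com, port 80, and the hand-written HTTP request are placeholders chosen only for illustration; a real application would add error handling and, today, use TLS.

```python
import socket

# Open a TCP (L4) connection; the application sees a file-like byte stream,
# not individual packets. Segmentation, retransmission, and congestion
# control are all handled by the transport layer underneath this interface.
with socket.create_connection(("example.com", 80)) as conn:
    # Write bytes into the stream; TCP decides how to split them into packets.
    conn.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

    # Read bytes back from the stream until the peer closes the connection.
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            break
        chunks.append(data)

print(len(b"".join(chunks)), "bytes received over a reliable byte stream")
```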

The four-layer Internet architecture is a natural outcome of modularity. Each layer interacts only with the layers directly above and below it. Since packets always arrive over the physical media of L1, hosts must implement all four layers, routers implement the first three, and switches implement only the first two. With this structure in place, there are four questions to answer.


Question #1: What kind of diversity does this layered architecture support, and how is it managed?

Two implementations of a layer are architecturally interchangeable as long as they provide the same interfaces to the layers above and below. Furthermore, the modularity of the Internet architecture allows multiple transport protocols to coexist at L4, multiple local network designs at L2, and multiple physical technologies at L1, each providing its own interface. L1 and L2 technologies are selected by the network provider, who can ensure that their interfaces are compatible, that is, that the chosen local network design can run over the chosen set of link technologies. An application selects the L4 protocol it requires when calling the operating system's network API. The application and the network provider make their selections independently: the application does not know which network it is currently on, and the network provider does not know which applications will run on its network. This approach works seamlessly if and only if there is a single set of interfaces at L3 with which all L2 designs and all L4 protocols are compatible. Therefore, there must be a single protocol at L3: the Internet Protocol (IP), currently IPv4 and IPv6. As the "narrow waist" of the Internet architecture, the uniqueness of IP is what makes diverse innovation at all other layers possible.

Question #2: What identifiers does the Internet use?

The Internet must enable L2 and L3 packet headers to identify the destinations of packets and allow users and applications to identify the services they want to access. This results in three types of identifiers. Typically, each connectable hardware network interface on a host (such as a Wi-Fi, cellular, or Ethernet card) has a permanent, unique address called a MAC address, which L2 uses to identify destinations. L3 uses IP addresses, which designate a unique network in the Internet and the specific host interface currently using that address on that network (this assignment can change over time). Users and applications use application-level names to refer to host-based services. To give these names some degree of persistence, they are independent of the machines on which the services run and of where in the network those hosts are located. Of these three identifiers (application-level name, IP address, and MAC address), the first two must be resolved to the next-lower-level identifier. Therefore, when an application on one host attempts to send a packet to an application on another host, it must first resolve the application-level name into an IP address. When a packet traverses a network, it is sent over L2 either to the destination host or to the next-hop router; in both cases, the IP address must be resolved to a MAC address.
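To make the first of these resolution steps concrete, the minimal Python sketch below asks the operating system's resolver to turn an application-level name into IP addresses; www.example.com is just a placeholder name. The second step, resolving an IP address to a MAC address via ARP, happens inside the L2/L3 machinery and is not visible through this API.

```python
import socket

# Resolve an application-level name (a fully qualified domain name) into one
# or more IP addresses. Behind this call the host consults DNS.
hostname = "www.example.com"  # placeholder application-level name
for family, _, _, _, sockaddr in socket.getaddrinfo(hostname, None):
    ip_address = sockaddr[0]
    label = "IPv6" if family == socket.AF_INET6 else "IPv4"
    print(f"{hostname} resolves to {label} address {ip_address}")
```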

Question #3: How is the Internet infrastructure organized?

The Internet is not merely an unstructured collection of L2 networks connected by routers. Rather, it is a collection of autonomous systems (ASs), also called domains, each containing a set of L2 networks managed by a single entity. Examples of ASs include enterprises, universities, and commercial Internet Service Providers (ISPs). Each AS controls its own internal L3 routing and must run a routing protocol with other ASs in order to provide host-to-host delivery across ASs. Therefore, L3 involves two routing tasks: 

(i) Routing between networks within an AS (intra-domain routing) is handled by the routers in that AS; 

(ii) Routing between ASs (inter-domain routing) is handled by so-called border routers connecting two or more ASs.

These two routing tasks have different requirements and therefore require different routing paradigms and protocols.

Question #4: How do these parts fit together?

The process of delivering a packet over the Internet begins with the application resolving an application-level name to an IP address and then calling the host's network support to send data to that destination IP address. This results in a call to L4, which packages the data into packets, and in turn a call to L3 to deliver those packets. At L3, a packet is forwarded through a series of routers until it reaches its destination network (identified by the destination IP address in the packet header). Each router has a forwarding table that maps destination IP addresses to the IP address of a next-hop router. After receiving a packet, the router looks up the appropriate next hop in its forwarding table and then sends the packet to that next-hop node by calling L2. L2 must first resolve the next-hop router's IP address into a MAC address and then deliver the packet to the next-hop router (either via broadcast or by forwarding the packet through a series of switches, as described in the next section); the next-hop router then passes the packet back up to L3. The key technical challenge in this process is setting up the L3 forwarding tables so that the sequence of next hops always results in packets reaching the appropriate destination.
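A minimal sketch of the per-router lookup step described above, written in Python with the standard ipaddress module. The prefixes and next-hop addresses are invented for illustration; real routers perform the same longest-prefix match over far larger tables in specialized hardware.

```python
import ipaddress

# A toy L3 forwarding table: destination prefix -> next-hop router address.
# "direct" marks a destination network attached at L2.
forwarding_table = {
    ipaddress.ip_network("10.1.0.0/16"): "direct",
    ipaddress.ip_network("10.0.0.0/8"): "192.168.0.2",
    ipaddress.ip_network("0.0.0.0/0"): "192.168.0.1",   # default route
}

def next_hop(destination: str) -> str:
    """Return the next hop for a destination IP using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in forwarding_table if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)  # most specific prefix wins
    return forwarding_table[best]

print(next_hop("10.1.2.3"))   # "direct": delivered over L2
print(next_hop("10.9.9.9"))   # forwarded to 192.168.0.2
print(next_hop("8.8.8.8"))    # falls through to the default route, 192.168.0.1
```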


3. Three core mechanisms of the Internet

The Internet architecture relies on three important mechanisms: routing, reliability, and name resolution.

3.1 Routing

The term "routing" refers to the general problem of forwarding packets over the Internet to a destination host, which occurs at L3 and is implemented by routers, or at L2 by switches, and the implementation on L2 is called switching rather than routing. Starting by considering L3 intra-domain routing, assume: 

  1. Each L3 header contains a destination IP address;

  2. Each router has a set of neighboring routers to which it connects at L2;

  3. Each router has a forwarding table that correctly indicates whether the router is connected (at L2) to the packet's destination network and, if not, deterministically maps the arriving packet's destination to the neighboring next-hop router to which the packet should be forwarded.

To simplify the problem, focus first on the conditions under which a set of static forwarding tables successfully guides packets to their destinations. The set of interconnections between routers is called the network topology graph (assume it is connected); the set of forwarding tables is called the forwarding state, and the union of these tables is called the routing state. A given routing state is valid if it always directs packets to their destinations. A loop exists if, for some starting location and destination address, a packet can return to a router it has already visited. A routing state is valid if and only if it contains no loops.

Assume there are no loops; because the network is connected and finite, any packet must eventually reach a router connected (at L2) to its destination. Conversely, assume a loop exists for a given destination; any packet addressed to that destination that enters the loop will never reach it.
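This equivalence between "valid" and "loop-free" can be checked mechanically. Below is a toy Python sketch with an invented four-router forwarding state for a single destination: it follows the next hops and reports whether every starting router leads to delivery without revisiting any router.

```python
# Toy forwarding state for one destination: router -> next hop.
# "deliver" marks a router attached (at L2) to the destination network.
forwarding_state = {"A": "B", "B": "C", "C": "deliver", "E": "B"}

def is_valid(state: dict, start: str) -> bool:
    """Follow next hops from `start`; valid iff we deliver without revisiting a router."""
    visited = set()
    current = start
    while current != "deliver":
        if current in visited:
            return False                # a loop: the packet can never be delivered
        visited.add(current)
        current = state[current]
    return True                         # reached a router attached to the destination

print(all(is_valid(forwarding_state, r) for r in forwarding_state))   # True: no loops

looped_state = dict(forwarding_state, C="A")      # introduce a loop A -> B -> C -> A
print(all(is_valid(looped_state, r) for r in looped_state))           # False
```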

Why care about this seemingly obvious result? Because the Internet uses various routing algorithms to compute forwarding state, and the key conceptual difference between these algorithms is how they avoid loops in steady state. A routing protocol is a complex distributed system that must quickly and automatically recompute the forwarding state when the network topology changes due to link failures and recoveries. Here we set aside the complexity of these distributed protocols and focus only on the forwarding state produced once an algorithm has converged to a stable state on a static network. In these algorithms, the method for avoiding steady-state loops depends mainly on what kind of information is shared between routers.

For example, a router knows its neighboring routers and can share this local information with all other routers using a flooding algorithm. In steady state, each router can use this information to assemble the entire network topology graph. If all routers run the same loop-free path-finding algorithm over this shared topology graph to compute their forwarding tables, the resulting routing state is always valid. This approach is called "link state" routing.
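A minimal sketch of the link-state idea: given the flooded topology that every router now shares, each router runs the same shortest-path computation (Dijkstra's algorithm here) to fill in its forwarding table. The tiny topology and link costs are invented; real protocols such as OSPF do far more, but the core computation looks like this.

```python
import heapq

# Flooded topology known to every router: node -> {neighbor: link cost}.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

def first_hops(source: str) -> dict:
    """Dijkstra over the shared topology; returns destination -> first hop from `source`."""
    dist = {source: 0}
    result = {}
    heap = [(0, source, None)]          # (cost so far, node, first hop used to reach it)
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                     # stale entry
        if hop is not None:
            result.setdefault(node, hop)
        for neighbor, link_cost in topology[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # The first hop is the source's neighbor on this path.
                heapq.heappush(heap, (new_cost, neighbor, hop if hop else neighbor))
    return result

# Every router computes the same loop-free shortest paths over the same graph.
print(first_hops("A"))   # {'B': 'B', 'C': 'B', 'D': 'B'}
```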

Many in the computer science community believe that the Internet is simply a collection of complex protocols rather than a conceptually simple and bold design.

Another approach is for each router to tell its neighboring routers its distance to every other network, according to some metric (such as latency). Router α can then compute its distance to each destination network n as d_α(n) = min_β [ d_β(n) + d(α, β) ], where d_β(n) is the distance from neighboring router β to network n, and d(α, β) is the distance between routers α and β. The steady state of this distributed computation produces a shortest-path route to each destination, which cannot contain cycles. This is called "distance vector" routing.
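A minimal sketch of that distance-vector computation, iterated to a fixed point over an invented four-router topology. Each entry of `distances` plays the role of d_β(n) as advertised by a neighbor; the real protocol distributes these updates as messages rather than looping in one process.

```python
# Invented topology: costs of the direct links between routers.
links = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5, ("C", "D"): 1}
routers = {"A", "B", "C", "D"}

def cost(x, y):
    return links.get((x, y)) or links.get((y, x)) or float("inf")

# distances[alpha][n] approximates d_alpha(n); initially each router only
# knows that its distance to itself is 0.
distances = {r: {n: (0 if n == r else float("inf")) for n in routers} for r in routers}

# Repeatedly apply d_alpha(n) = min over neighbors beta of [ d_beta(n) + d(alpha, beta) ]
# until nothing changes, mimicking the protocol's convergence to a steady state.
changed = True
while changed:
    changed = False
    for alpha in routers:
        for n in routers:
            best = min(distances[beta][n] + cost(alpha, beta)
                       for beta in routers if beta != alpha)
            if best < distances[alpha][n]:
                distances[alpha][n] = best
                changed = True

print({n: distances["A"][n] for n in sorted(routers)})   # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```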

When the network topology changes, temporary loops may appear in the routing state while distance-vector or link-state routing is being recomputed (that is, before the protocol has converged to a stable state). If left unchecked, packets circulating endlessly in such a loop could cause severe congestion. To prevent this, the IP protocol wisely includes a field in the packet header that starts at an initial value set by the sending host and is decremented each time the packet reaches a new router. If this field reaches zero, the packet is dropped, which limits the number of times a packet can traverse a temporary loop and ensures that such loops are never catastrophic. Without this simple mechanism, every routing protocol would need to guard against temporary loops, a concern that would complicate routing protocols considerably.
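A tiny sketch of that header field at work (the TTL in IPv4, the hop limit in IPv6): each router decrements it and drops the packet when it reaches zero, so a packet caught in a transient loop circulates only a bounded number of times. The field values here are invented.

```python
def forward(packet: dict) -> bool:
    """Per-router handling of the hop-limit field: True if forwarded, False if dropped."""
    packet["ttl"] -= 1              # every router decrements the field
    if packet["ttl"] <= 0:
        return False                # drop: bounds the damage of a transient loop
    return True

packet = {"dst": "203.0.113.7", "ttl": 4}
hops = 0
while forward(packet):              # simulate a packet stuck in a loop
    hops += 1
print(f"packet dropped after {hops} hops")   # 3 hops with an initial value of 4
```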

Now consider L3 inter-domain routing. ASs must obviously carry all packets for which they are the source or the destination; all other packets are transit traffic, passing through other ASs on the way from the source AS to the destination AS. ASs want to be able to choose freely which traffic they carry and which routes they use to send traffic toward its destination. For ISPs, these two policy choices depend heavily on their economic relationships with neighboring ASs, so they want to keep the choices private to avoid disclosing information that could help competitors.

Although inter-domain routing is actually implemented by border routers, which requires some nontrivial intra-domain coordination among them, inter-domain routing can be modeled simply as a graph of interconnected ASs in which the ASs themselves make the routing decisions. Inter-domain routing must: 

  1. Allow each AS to make arbitrary policy choices, so distance-vector routing cannot be used;

  2. Keep these policy choices private, so link-state routing cannot be used.

Link-state routing would require ASs to make their policies explicit so that every other AS could compute routes using those policies. The alternative is to let ASs implement their policies by choosing to whom they advertise routes (by sending a message to a neighboring AS saying, "You can use my path to reach this destination") and, when several neighbors have advertised routes to a given destination, by choosing which of those routes to use. These are the same messages and choices used in distance-vector routing, but distance vector allows no policy flexibility: routes are advertised to all neighbors, and only the shortest path is chosen. This local freedom provides policy flexibility and privacy, but how are steady-state loops prevented in the computed routes? The Internet's solution is to exchange path information. When AS "A" advertises a path to neighboring AS "B" (for a specific destination), it specifies the entire AS-level path that traffic would take to the destination. If AS "B" sees that it is already on that path, it does not use that path. If all ASs obey this simple constraint, then the steady state of what is called "path vector" routing is loop-free regardless of the policy choices the ASs make. Path-vector routing is used in BGP, the current inter-domain routing protocol, and BGP is therefore the glue that holds the Internet's many autonomous systems together.
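A minimal sketch of the path-vector selection rule: an AS discards any advertised path that already contains its own AS number, and applies a local preference to the rest (shortest AS path here, although real policies can be arbitrary and are often economic). The AS numbers and advertisements are invented.

```python
# Advertisements received by AS 65001 for one destination prefix; each is the
# full AS-level path the advertising neighbor would use.
received_paths = [
    [65002, 65010, 65020],    # via neighbor AS 65002
    [65003, 65001, 65020],    # via AS 65003, but our own AS already appears on it
    [65004, 65020],           # via neighbor AS 65004
]

MY_AS = 65001

def choose_path(paths):
    """Path-vector selection: drop paths containing our AS, then apply local policy."""
    loop_free = [p for p in paths if MY_AS not in p]     # the loop-avoidance rule
    # Local policy stands in here as "prefer the shortest AS path".
    return min(loop_free, key=len) if loop_free else None

best = choose_path(received_paths)
print("chosen AS path:", best)                        # [65004, 65020]
# When re-advertising to neighbors, our own AS number is prepended to the path.
print("advertised onward as:", [MY_AS] + best)        # [65001, 65004, 65020]
```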

Because of its low performance requirements, the Internet could be deployed with great speed and at great scale.

This path-vector approach ensures that no steady state contains loops. However, it does not ensure that the routing protocol converges to a stable state; policy oscillation, for example, is a case in which the path-vector algorithm does not converge. Nor does it ensure that every resulting stable state provides connectivity between all endpoints, since any AS can deny transit connectivity toward a given AS. One might then wonder why these anomalies are not observed on the Internet. Theoretical analysis shows that typical operational practices (choosing routes to maximize revenue and minimize cost) produce routing policies that always converge to a stable state and provide end-to-end connectivity between all endpoints.

Loop avoidance also plays a role at L2 in non-broadcast networks, where flooding is often used to reach a destination. The spanning-tree protocol (STP) creates a tree by choosing not to use certain links in the network, that is, by eliminating all cycles from the network topology graph. Once the network has been reduced to a spanning tree, packets can be flooded to all hosts by having each switch forward a packet on every adjacent spanning-tree link except the one on which the packet arrived. This flooding allows hosts and routers to resolve IP addresses to MAC addresses via an Address Resolution Protocol (ARP) message that asks, "Which host or router has this IP address?"; the owner then responds with its MAC address. During this ARP exchange (and indeed every time a host sends a packet), switches can learn how to reach a specific host without flooding, by remembering the link on which they most recently received a packet from that host. Because there is only one path between any two nodes on a spanning tree, a host can be reached via the link over which packets sent by that host arrive. With such "learning switches", the very act of resolving an IP address into a MAC address establishes the forwarding state between the sending and receiving hosts. When sending a packet to a host whose MAC address has already been resolved, the network does not need to flood and can instead forward the packet directly.
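A toy sketch of the "learning switch" behavior just described: the switch remembers the port on which each source MAC address was last seen, floods on the remaining spanning-tree ports when the destination is unknown, and forwards directly once it has learned it. The MAC addresses and port numbers are invented.

```python
class LearningSwitch:
    def __init__(self, ports):
        self.ports = ports
        self.mac_table = {}                      # MAC address -> port it was learned on

    def handle(self, src_mac, dst_mac, in_port):
        # Learn: the source is reachable via the port this frame arrived on.
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]     # forward directly, no flooding needed
        # Unknown destination: flood on every spanning-tree port except the input.
        return [p for p in self.ports if p != in_port]

switch = LearningSwitch(ports=[1, 2, 3, 4])
print(switch.handle("aa:aa", "bb:bb", in_port=1))   # unknown dst -> flood on [2, 3, 4]
print(switch.handle("bb:bb", "aa:aa", in_port=3))   # reply: aa:aa already learned -> [1]
print(switch.handle("aa:aa", "bb:bb", in_port=1))   # now bb:bb is known -> [3]
```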

In summary, every routing algorithm must have a mechanism for avoiding loops in the stable state (that is, after the protocol has converged), and that mechanism depends on the information exchanged. To limit the exchange of information to adjacent routers within an AS, distance-vector routing avoids loops by always producing shortest paths. To gain more flexibility in intra-domain routing, a better option is link state, which requires flooding neighbor information but allows arbitrary loop-free path computation. To let ASs apply individual policy controls in inter-domain routing, they exchange explicit path information to avoid loops. To enable flooding and route learning at L2, the network topology graph is first reduced to a spanning tree, which is inherently loop-free. There are other issues one might consider in routing, such as how to recover from failures without recomputing routes, and how centralized control (as in SDN) can simplify routing protocols, but the point here is to illustrate the role of loop avoidance in the commonly used routing paradigms.


3.2 Reliable transmission

As noted in the discussion of routing, even with a valid routing state, packets may still be dropped because of overloaded links or failed routers. The Internet architecture does not attempt to guarantee reliability in the first three layers, but wisely leaves this task to the transport layer or to the application itself: a lost packet is recovered only when it is retransmitted by the sending host.

Reliable transmission is provided by the transport protocol, which establishes a connection that takes data from one application and delivers it to a remote application. Some important transport protocols, such as the widely used TCP, provide a reliable byte-stream abstraction, in which the data in the byte stream is divided into packets and transmitted in order, and all packet losses are recovered by the transport protocol itself. A reliable byte-stream abstraction can be implemented entirely in host software, which distinguishes the packets of one byte stream from those of another, places sequence numbers on packets so that data delivered out of order can be correctly reordered for the receiving application, and retransmits packets until they are successfully delivered.

Informally, a reliable transport protocol takes data from an application, transmits it to the destination in the form of packets, and eventually notifies the application either that the transfer completed successfully or that it terminated in failure, in both cases stopping further transmission. Assume that the underlying network will eventually deliver a packet if it is retransmitted enough times, so that a persistent protocol always succeeds. Under this assumption, what communication is required between the sender and receiver so that the protocol notifies the application of success if and only if all packets have been received? There are two common approaches: the receiver can send an acknowledgment (ACK) to the sender when a packet is received, or a negative acknowledgment (NACK) when a packet is suspected to have been lost.

ACKs are necessary and sufficient for reliable transmission, while NACKs are neither necessary nor sufficient. A reliable transport protocol can declare success only when it knows that all packets have been delivered, which can be inferred only by receiving an ACK for each packet. The absence of a NACK (which could itself have been lost) does not mean that a packet was received. However, NACKs can still be useful because they provide timely information about when the sender should retransmit. For example, TCP uses explicit ACKs to ensure reliability and initiates retransmissions based on timeouts and implicit NACKs (when an expected ACK does not arrive).
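A toy sketch of the ACK-based logic: a stop-and-wait sender that retransmits each packet until its acknowledgment arrives and declares success only once every packet has been ACKed. The lossy "network" is simulated with a random drop; the loss rate and data are invented.

```python
import random

random.seed(7)                           # deterministic run for the example

def send_and_wait_for_ack(seq: int) -> bool:
    """Simulated best-effort network: the packet (or its ACK) arrives, or is lost."""
    return random.random() > 0.3         # ~30% loss, chosen only for illustration

def reliable_send(packets) -> bool:
    """Declare success only after an ACK has been received for every packet."""
    for seq, _data in enumerate(packets):
        attempts = 0
        while True:
            attempts += 1
            if send_and_wait_for_ack(seq):   # ACK received
                break                        # safe to move on to the next packet
            # No ACK before the timeout: retransmit (the timeout acts as an implicit NACK).
        print(f"packet {seq} delivered after {attempts} attempt(s)")
    return True                              # all packets ACKed: the transfer succeeded

print("transfer complete:", reliable_send([b"chunk0", b"chunk1", b"chunk2"]))
```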


3.3 Name resolution

In addition to resolving IP addresses to MAC addresses via ARP, the Internet must also resolve application-level names to one or more IP addresses. These names are informally called hostnames and formally known as fully qualified domain names; the non-standard term "application-level names" is used here because these names neither refer to a specific physical machine (as a MAC address does) nor relate directly to the domain concept used in inter-domain routing.

Any application-level naming system must: 

  1. Assign administrative control of each name to a unique authority, which determines what IP address the name resolves to; 

  2. Handle a high rate of resolution requests; 

  3. Provide both of the above at the scale of billions of application-level names.

To address these challenges, the Internet adopts a hierarchical naming structure called the Domain Name System (DNS). The namespace is divided into regions called domains, which are recursively subdivided into smaller domains, and both resolution and administrative control are handled hierarchically. Each named domain has one or more name servers that can create new subdomains and resolve names within the domain. A name can either be fully resolved to one or more IP addresses, or its resolution can be delegated to the name servers of one or more subdomains, which can resolve such names further. The hierarchy starts with a set of top-level domains (TLDs), and commercial registrars allow customers to register subdomains under these TLDs. Resolution of a TLD to its name servers is handled by a set of DNS root servers (whose addresses are known to all hosts), and resolution proceeds down the naming hierarchy from there. For example, www.myschool.edu is first resolved by a root server, which points to the edu name server, which in turn points to a name server for myschool.edu, which finally resolves www.myschool.edu to an IP address. This hierarchy allows highly parallel name resolution and fully distributed administrative control, both of which are critical to handling the scale of Internet naming.
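A toy sketch of the iterative resolution walk described for www.myschool.edu, with nested dictionaries standing in for the root, TLD, and authoritative name servers. Real resolvers speak the DNS protocol to those servers and cache answers aggressively; the IP address below is invented.

```python
# Each level of the hierarchy either delegates to a subdomain's "name server"
# (another dict) or resolves a name to an IP address (a string).
root = {
    "edu": {                      # the edu top-level domain's name server
        "myschool": {             # the name server for myschool.edu
            "www": "192.0.2.10",  # final answer (an invented documentation address)
        },
    },
}

def resolve(name: str) -> str:
    """Walk the hierarchy right to left: root -> edu -> myschool.edu -> www.myschool.edu."""
    server = root
    for label in reversed(name.split(".")):
        answer = server[label]
        if isinstance(answer, str):
            return answer          # fully resolved to an IP address
        server = answer            # referred to the next name server down the hierarchy
    raise LookupError(name)

print(resolve("www.myschool.edu"))   # 192.0.2.10
```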


4. The Secret to Internet Success

This article has attempted to reduce the seemingly incomprehensible complexity of the Internet to a small set of design choices: 

  1. The service model

  2. The four-layer architecture

  3. Three key mechanisms (routing, reliability, and name resolution)

Understanding the reasons behind these decisions is not enough to grasp all the complexity of today's Internet, but it is enough to design a new Internet with roughly the same properties. However, this simplicity alone does not explain the Internet's longevity. Why has the design of the Internet been so successful in handling huge changes in speed, scale, scope, technology, and usage?

4.1 Simplicity

Instead of trying to meet every possible application requirement, the Internet adopted a very limited but very general service model that makes no guarantees. This model of smart hosts and a simple network: 

  1. Allows new applications to flourish, because the Internet is not customized to any specific application's needs;

  2. Leverages the hosts' ability to adapt, in various ways, to the vagaries of best-effort Internet service (for example, through rate adaptation, buffering, and retransmission); 

  3. Allows network speeds to increase rapidly, because the service model is so simple to support.

If the Internet had adopted a more complex service model, it would likely have limited itself to the application requirements that existed at the time of its creation and to what could be implemented with the technologies then available. The result would have been a technology intricately designed for a small set of applications and quickly rendered obsolete, a recipe for short-term success but long-term failure.

4.2 Modularity

The modularity of the four-layer Internet architecture leads to a clear division of responsibilities: the network infrastructure (L1/L2/L3) supports ever-better packet delivery (in terms of capacity, coverage, and resilience), while applications (assisted by L4) create new functionality for users on top of this packet delivery service model. This architecture therefore allows two distinct ecosystems, network infrastructure and Internet applications, to flourish independently.

However, the Internet's modularity goes beyond its formal architecture: it reflects a more general approach of maximizing autonomy within a standards-compliant infrastructure, in contrast to the more rigid uniformity of the telephone network. For example, the only requirements on an AS are that its routers support IP and that it participate in the inter-domain routing protocol BGP; otherwise, each AS can deploy any L1 and L2 technologies and any intra-domain routing protocol without coordinating with other ASs. Similarly, individual naming domains must support the DNS protocol, but otherwise can adopt whatever name-management strategy and name-resolution infrastructure they choose. This autonomy in the infrastructure allows different operational practices to emerge. For example, a university campus network, a hyperscale data-center network, and an ISP backbone have very different operational needs; the Internet's inherent autonomy allows each of them to meet its own needs in its own way and to evolve over time.

4.3 Failure is common

As the size of a system increases, so does the likelihood that some component of the system will have failed at any given time. Scalability is often thought of in terms of algorithmic complexity or state explosion, but handling failures efficiently is also a key scalability requirement. Unlike systems that are expected to operate normally and enter special modes to recover from failures, nearly all Internet mechanisms treat failure as a common event.

For example, the basic routing algorithms recompute routes in the same way whether the topology change is caused by a link failing or by a link recovering. Similarly, when a packet is lost it must be retransmitted, but such retransmissions are expected to occur frequently and are not a special case in the transport protocol. This design style, which treats failure as the common case, is the basis on which Google and others build hyperscale infrastructure, and it was pioneered on the Internet.

4.4 Rough consensus and running code

Rather than employing the formal design committees that were popular in the telecommunications world at the time, the inventors of the Internet deliberately chose another path: encourage small groups to build workable designs, and then let the community choose which design to adopt. David Clark, one of the leaders of the Internet architecture effort, put it this way in a talk: "We reject kings, presidents and voting. We believe in rough consensus and running code." This egalitarian spirit extended to a shared vision of the Internet as a unified communication platform connecting all users. The development of the Internet was thus shaped not only by purely technical decisions, but also by the early Internet community's belief in the value of a shared platform connecting the world and its shared ownership of realizing that vision.


5. There is no such thing as perfection

There are many places where the design of the Internet is suboptimal, but most of these are low-level details that do not change the high-level picture. In three areas, however, the Internet's design raises more fundamental issues.

5.1 Security

Many people blame the poor state of Internet security on the Internet's design, arguing that security was not a primary consideration (even though resilience in the face of failure certainly was). This criticism is misplaced for two reasons: 

(1) In an interconnected world, security is a far broader and more elusive goal than network security, and the Internet can be responsible only for the latter; 

(2) Although the Internet architecture itself does not provide network security, there are protocols and technologies, some in widespread use, that achieve it to a large extent.

More precisely, a network can be said to be secure if the following properties can be ensured when transmitting data between two hosts:

  1. The connection between hosts is reasonably reliable (availability); 

  2. The recipient can verify the source of the data (provenance);

  3. The data has not been tampered with in transit (integrity); 

  4. The data cannot be read by any intermediary, and no one snooping on a link can tell which hosts are exchanging packets (privacy).

The latter three properties can generally be guaranteed through cryptographic protocols. The Internet's availability is vulnerable to distributed denial-of-service (DDoS) attacks, in which many hosts are used as bots to overload a target with traffic. There are techniques to filter out this attack traffic, but they are only partially effective. Availability is also threatened because BGP is vulnerable to attacks on inter-domain routing; cryptographic defenses exist, but they are not widely deployed. Thus, between cryptographic protocols and these mitigations, the Internet can largely meet the definition of a secure network given above.
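As an illustration of how the latter three properties are commonly obtained in practice, the sketch below wraps an ordinary TCP connection in TLS using Python's standard ssl module; example.com is a placeholder peer. TLS authenticates the server and encrypts and integrity-protects the byte stream, but it does not hide which hosts are talking, and it does nothing for availability.

```python
import socket
import ssl

context = ssl.create_default_context()    # verifies the server's certificate by default

hostname = "example.com"                   # placeholder peer
with socket.create_connection((hostname, 443)) as raw:
    # The TLS handshake authenticates the peer and negotiates keys; afterwards the
    # application still sees a byte stream, now encrypted and integrity-protected.
    with context.wrap_socket(raw, server_hostname=hostname) as tls:
        print("negotiated", tls.version(), "using cipher", tls.cipher()[0])
        tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        print(tls.recv(64))
```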

However, the failure to make the Internet an inherently secure network is definitely not the primary cause of today's security problems. A secure network cannot prevent insecurity in host software. For example, if an application is designed to leak information about its users, or to impersonate its users in unwanted financial transactions, there is little the network can do to stop this malicious behavior. A host can become infected if a user inadvertently downloads a malicious application, or if some attack (such as a buffer overflow) turns a benign application into a malicious one. Criticisms of Internet security often point to the ease of placing malware on hosts, but this is not primarily a network security problem.

5.2 In-network packet processing

Early Internet designers believed that the network infrastructure should focus on packet delivery and avoid higher-level functions, but nearly every network in operation today violates this rule through so-called in-network packet processing. Modern networks contain many middleboxes that provide more than just packet delivery, such as firewalls (controlling which packets may enter the network), WAN optimizers (improving efficiency by caching previously sent data), and load balancers (directing requests to lightly loaded hosts).

A 2012 survey showed that roughly one-third of network components are middleboxes, one-third routers, and one-third switches. Some of these in-network processing functions are deployed within individual enterprises to improve their internal efficiency; since their effects are confined to the enterprise network, they are widely regarded as an acceptable reality. Additionally, as discussed below, some cloud and content providers have deployed large private networks with in-network capabilities (notably throttling, caching, and load balancing) to reduce latency and improve the reliability visible to customers.


5.3 Lack of evolvability

Because every router implements the IP protocol, it is difficult to change the Internet's service model; even the recent, accelerating transition to IPv6, driven by the shortage of IPv4 addresses, retains the same service model. For decades there have been no major architectural changes to the Internet, and no feasible alternative to it. However, large private networks built around centrally controlled data centers now provide superior service to customers through their in-network processing. Most client traffic today is either served from caches that these providers place in or near the client's local AS, or carried from the source AS across their large private networks, with in-network processing along the way. These centrally controlled private networks are vertically integrated with cloud and content services, which are now a more dominant economic force than ISPs.

Therefore, we have begun to see a transition to a new global infrastructure: these private networks now carry almost all traffic on today's Internet except for the last mile. This marks the end of an era in which the Internet had no real economic competitors. While the Internet faces many challenges, none is more important for its future than resolving the tension between the two visions that animated the early Internet community: a unified Internet connecting all users, essentially agnostic about the services provided by the endpoints; and an alternative vision, represented by the emerging large-scale private networks, in which in-network processing is used to provide better performance.

6. One sentence summary

The Internet is an engineering marvel that embodies very bold and prescient design decisions. Don’t let the complexity of the Internet overshadow the simplicity of its core design and achievements, and don’t forget the courage, community spirit, and vision that led to its creation. This is perhaps one of the more valuable aspects of the Internet.


PS: Of all the books about computer networks that this veteran coder has read, "Computer Network Problems and Solutions" is the one that explains computer networks most clearly. It can serve as a desk reference not only for IT engineers, and network engineers in particular, but is also very helpful for system architects and even software engineers who want to understand how network architecture and infrastructure affect software systems. It will not lose its value within 20 years, and it helps us clearly understand and apply the computer networks around us.
