Interview Questions (5)

1. The difference between TCP and UDP

|  | UDP | TCP |
| --- | --- | --- |
| Connection | Connectionless; transmits immediately | Connection-oriented; requires a three-way handshake |
| Reliability | Unreliable; network fluctuation and congestion do not slow transmission down | Reliable; uses flow control and congestion control |
| Connection endpoints | Supports one-to-one, one-to-many, many-to-one, and many-to-many communication | One-to-one communication only |
| Transfer mode | Message-oriented; packets may arrive out of order or be lost | Byte-stream-oriented; reliable, ordered delivery |
| Header overhead | Small: a fixed 8 bytes | 20 bytes minimum, up to 60 bytes |
| Typical scenarios | Real-time applications (IP telephony, video conferencing, live streaming, etc.) | Applications requiring reliable transfer, such as file transfer |

1.1 TCP/IP network model

        To communicate with each other, computers need to agree on a unified set of rules: protocols.

        TCP/IP is the general term for the Internet protocol family, which includes TCP, UDP, IP, ICMP, SMTP, HTTP, and others.

        The protocols can be divided into four layers: the link layer, network layer, transport layer, and application layer.

Link layer: encapsulates and decapsulates IP packets, and sends and receives ARP/RARP packets

Network layer: handles routing and sends packets toward the target network or host

Transport layer: groups and reassembles messages, and encapsulates them in the TCP or UDP protocol format

Application layer: provides applications to users

| OSI seven-layer model | TCP/IP model | Function | TCP/IP protocol family |
| --- | --- | --- | --- |
| Application layer | Application layer | File transfer, email, file services, virtual terminals | TFTP/HTTP/SNMP/FTP/SMTP/DNS/Telnet |
| Presentation layer |  | Data formatting, transcoding, data encryption | / |
| Session layer |  | Establishing and releasing sessions with other nodes | / |
| Transport layer | Transport layer | Provides end-to-end interfaces | TCP/UDP |
| Network layer | Network layer | Routing of packets | IP/ICMP/RIP/OSPF/BGP/IGMP |
| Data link layer | Link layer | Transmission of addressed frames and error detection | SLIP/CSLIP/PPP/ARP/RARP/MTU |
| Physical layer |  | Transmitting data over physical media as binary bits | ISO2110/IEEE802.3/IEEE802.2 |

         In this layered architecture, communication is carried out between the same layer on each side. As data passes down through each layer at the sending end, that layer's protocol header is attached (only the data link layer also appends a protocol trailer)

1.2 UDP (User Datagram Protocol)

       In the network, UDP, like TCP, processes data packets; it is a connectionless, unreliable transport-layer protocol. UDP does not group or reassemble packets, cannot reorder them, and cannot guarantee delivery (unreliable: the sender's UDP simply prepends a UDP header to the data and passes it to the network layer; the receiver's UDP removes the UDP header and passes the payload up to the application layer);

        UDP supports one-to-one, one-to-many, many-to-one, and many-to-many transmission (unicast, multicast, broadcast). It keeps sending data at a constant rate regardless of network fluctuations, so packets are easily lost; this makes it suitable for scenarios with strict real-time requirements (live streaming, conference calls)

        The UDP header is small, so datagrams are transmitted efficiently
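UDP's connectionless, fire-and-forget behavior can be seen with Java's `DatagramSocket`. The sketch below sends a single datagram over the loopback interface; the class and method names are my own illustration, not a standard API:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpDemo {
    // Sends one datagram to a local receiver and returns what was received.
    public static String echoOnce(String msg) throws Exception {
        try (DatagramSocket receiver = new DatagramSocket(0);   // bind an ephemeral port
             DatagramSocket sender = new DatagramSocket()) {
            byte[] data = msg.getBytes(StandardCharsets.UTF_8);
            DatagramPacket out = new DatagramPacket(
                    data, data.length, InetAddress.getLoopbackAddress(), receiver.getLocalPort());
            sender.send(out);                                   // no handshake, no ACK: fire-and-forget
            byte[] buf = new byte[1024];
            DatagramPacket in = new DatagramPacket(buf, buf.length);
            receiver.receive(in);                               // blocks until one datagram arrives
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("hello"));
    }
}
```

Note that `send` returns immediately with no delivery guarantee; over a real network the datagram could simply be dropped.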

1.3 TCP (Transmission Control Protocol)

        TCP is a connection-oriented, reliable, byte-stream-based transport-layer protocol. Before data is sent, the two parties must first establish a connection (the client and the server each save the other side's IP, port, and other information). The three-way handshake and four-way wave ensure complete, ordered data transmission

1.3.1 TCP three-way handshake and four-way handshake

        The essence of the three-way handshake is to confirm that both sides can send and receive. A sends a message to B; when B receives it, B knows that A can send and that B itself can receive. B then sends a confirmation back to A; A now knows that B's sending and receiving work, and that its own do too. Finally, A sends an acknowledgment to B, from which B knows that its own sending and A's receiving both work. The two sides can then officially begin communicating

First handshake:

        The client initiates a connection request to the server: it randomly generates an initial sequence number ISN (x), and the segment it sends to the server carries the SYN flag (SYN=1) and sequence number seq=x

Second handshake:

        The server receives the client's segment, sees SYN=1, and knows this is a connection request. It saves the client's sequence number, randomly generates its own initial sequence number (y), and replies with a segment carrying the SYN and ACK flags (SYN=1, ACK=1), sequence number seq=y, and acknowledgment number ack=x+1

Third handshake:

        After receiving the server's reply, the client sees ACK=1 and ack=x+1, so it knows the server received the segment with sequence number x; it also sees SYN=1, so it knows the server has agreed to the connection. The client saves the server's sequence number and replies with a segment carrying the ACK flag (ACK=1), ack=y+1, and seq=x+1 (the first handshake consumed one sequence number, hence the increment; an ACK segment carrying no data does not consume a sequence number, so the first data segment sent later still uses seq=x+1). When the server receives this segment and sees ACK=1 and ack=y+1, it knows the client received the segment with sequence number y, and the TCP connection between client and server is established

         The purpose of the four-way wave is to close a connection

First wave:

        When the client has finished transmitting data (it may also initiate this before finishing), it sends a connection-release segment carrying the FIN flag (FIN=1) and sequence number seq=u (u = x+1 + the number of data bytes already sent). After sending the FIN segment, the client can no longer send data, only receive it (data carried in a FIN segment also consumes sequence numbers)

Second wave:

        After receiving the client's FIN segment, the server replies with an acknowledgment segment carrying the ACK flag (ACK=1), acknowledgment number ack=u+1, and sequence number seq=v (v = y+1 + the number of data bytes the server sent before receiving the client's FIN). The server is now in the close-wait state: it does not immediately send its own FIN segment to the client, but first finishes sending its remaining data

Third wave:

        After sending its final data, the server sends a connection-release segment to the client carrying the FIN and ACK flags (FIN=1, ACK=1), acknowledgment number ack=u+1 (the same as in the second wave), and sequence number seq=w (w = v + the number of data bytes just sent)

Fourth wave:

       After receiving the server's FIN segment, the client sends an acknowledgment segment to the server carrying the ACK flag (ACK=1), acknowledgment number ack=w+1, and sequence number seq=u+1. The client does not release the TCP connection immediately after sending this acknowledgment; it waits 2MSL (twice the maximum segment lifetime) before releasing it. The server, by contrast, releases the TCP connection as soon as it receives the client's acknowledgment
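In application code, the handshake and the wave are performed by the operating system's TCP stack, not written by hand. In the hedged Java sketch below, `new Socket(...)` is where the three-way handshake happens and the try-with-resources `close()` triggers the four-way wave; the one-line echo logic is purely illustrative:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpDemo {
    public static String roundTrip(String msg) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {       // ephemeral port, LISTEN state
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept();                // completes when the handshake is done
                     BufferedReader r = new BufferedReader(new InputStreamReader(s.getInputStream()));
                     PrintWriter w = new PrintWriter(s.getOutputStream(), true)) {
                    w.println(r.readLine());                    // echo one line back
                } catch (IOException ignored) { }
            });
            echo.start();
            // Constructing the Socket performs SYN -> SYN/ACK -> ACK under the hood.
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
                 PrintWriter w = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader r = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                w.println(msg);
                return r.readLine();
            }                                                   // close(): FIN/ACK in both directions
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("ping"));
    }
}
```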

2. What are forward proxy and reverse proxy

Forward proxy: a server (the proxy server) that sits between the client and the target server. To obtain content from the target server, the client sends a request to the proxy and names the target; the proxy forwards the request to the target server and returns the obtained content to the client. The target server therefore does not know who the real client is

Forward proxy uses: bypassing access restrictions; improving access speed (proxy servers usually set up a large disk cache and save part of the responses in it); hiding the client's real IP (protection from attacks)

Reverse proxy: the proxy server accepts connection requests from clients, forwards them to servers on the internal network, and returns the results obtained from those servers to the requesting client. The client therefore does not know which real target server handled the request

Reverse proxy uses: hiding the servers' real IPs; load balancing (distributing client requests across real servers according to each server's load); improving access speed (the reverse proxy can cache static content, and dynamic content that receives many requests over a short period); and providing security (the reverse proxy can act as an application firewall that protects the website against web attacks, makes malware easier to trace, and provides unified encryption and HTTP access authentication for the back-end servers)

3. JVM runtime data area

Program counter:

        The program counter is the line-number indicator for the bytecode the current thread is executing, and its memory is thread-private. The bytecode interpreter works by changing this counter's value to select the next bytecode instruction to execute; branches, loops, jumps, exception handling, thread resumption, and similar features all rely on this counter;

        When a thread executes a Java method, the counter records the address of the virtual machine bytecode instruction being executed. When the thread executes a native method, the counter's value is undefined;

        This is the only memory region for which the JVM specification does not specify any OutOfMemoryError condition.

Virtual machine stack:

        Thread-private, with a lifecycle matching its thread's. It stores the local variable table, operand stack, dynamic links, method exit information, and so on (the local variable table holds primitive data types and object references known at compile time);

        If the requested stack depth exceeds the maximum available depth, a StackOverflowError is thrown. If the stack can be dynamically expanded but no memory is available to support the expansion, an OutOfMemoryError is thrown.
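A StackOverflowError is easy to provoke with unbounded recursion; each call adds a frame to the virtual machine stack. A small sketch (the exact frame count depends on the `-Xss` stack size, so no fixed number should be expected):

```java
public class StackDemo {
    // Recurses until the stack is exhausted and reports the depth reached.
    static int depth(int d) {
        try {
            return depth(d + 1);        // every call pushes a new stack frame
        } catch (StackOverflowError e) {
            return d;                   // deepest frame reached before overflow
        }
    }

    public static void main(String[] args) {
        System.out.println("overflowed after ~" + depth(0) + " frames");
    }
}
```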

Native method stack:

        Its function is identical to the virtual machine stack's, except that the virtual machine stack serves Java methods, while the native method stack serves the native methods the virtual machine calls;

Heap:

        The largest region of memory in the Java virtual machine, shared by all threads. It mainly stores object instances and arrays. Internally it carves out multiple thread-private allocation buffers (TLABs). The heap may occupy physically discontiguous but logically contiguous space; if there is not enough heap space to allocate an instance, an OutOfMemoryError is thrown

Method area:

        A shared memory region that stores the class information loaded by the virtual machine, constants, static variables, and JIT-compiled code;

4. How does the JVM determine whether an object is recyclable 

Reference counting method:

        A reference counter is attached to each object: the count is incremented when a reference is added and decremented when a reference is released; when the count reaches zero, the object can be reclaimed. This approach struggles with circular references (objects that reference each other can never be reclaimed, even when the program no longer uses them)
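The circular-reference case can be observed directly in Java, since HotSpot uses reachability analysis rather than reference counting. A sketch using a `WeakReference` to watch the cycle (class and method names are illustrative, and `System.gc()` is only a hint, so the printed result may vary):

```java
import java.lang.ref.WeakReference;

// Two objects that reference each other. Under pure reference counting their
// counts could never reach zero; reachability analysis reclaims the pair once
// no GC root can reach it.
public class CircularRef {
    CircularRef peer;

    static WeakReference<CircularRef> buildAndDropCycle() {
        CircularRef a = new CircularRef();
        CircularRef b = new CircularRef();
        a.peer = b;                      // a -> b
        b.peer = a;                      // b -> a: a reference cycle
        return new WeakReference<>(a);   // observe the cycle without keeping it alive
    }

    public static void main(String[] args) {
        WeakReference<CircularRef> ref = buildAndDropCycle();
        System.gc();                     // a hint only; collection is not guaranteed
        System.out.println(ref.get() == null ? "cycle collected" : "cycle still alive");
    }
}
```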

Accessibility analysis:

        Starting from the GC Roots (the garbage-collection root set: a group of objects considered live and non-collectable), the collector searches downward; the path it traverses is called a reference chain. When an object is not connected to the GC Roots by any reference chain, it is proven unusable

        Objects judged unreachable in the reachability analysis are marked a first time and screened once. The screening criterion is whether the object needs to execute its finalize method: when the object does not override finalize, or finalize has already been invoked by the virtual machine, execution is considered unnecessary

        When finalize does need to be executed, the object is placed in a queue called the F-Queue to await execution. Before reclaiming, the GC marks the objects in the F-Queue a second time on a small scale; an object escapes reclamation if, during finalize, it re-links itself to any object on a reference chain

5. What are the garbage collection algorithms of the JVM?

Mark-Sweep algorithm: first mark all objects that need to be reclaimed, then reclaim all marked objects in one pass after marking completes.

Copying algorithm: divide the available memory into two equal halves and use only one at a time. When that half is used up, copy the surviving objects to the other half, then clear the used-up space.

Mark-Compact algorithm: first mark all objects that need to be reclaimed, move all surviving objects toward one end, and then clear the space beyond that end.

Generational collection algorithm: divide the Java heap into a new generation and an old generation, and adopt the most appropriate algorithm for each. In the new generation, where a large share of objects die at every collection, the copying algorithm should be used; in the old generation, where survival rates are high, a mark-sweep (or mark-compact) algorithm should be used.
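The mark phase of mark-sweep is essentially a graph traversal from the roots. A toy sketch over an integer "heap" (object IDs mapping to the IDs they reference; the data structure and names are my own illustration, not how a real collector stores objects):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MarkSweep {
    // Toy heap: object id -> list of object ids it references.
    static Map<Integer, List<Integer>> heap = new HashMap<>();

    // Mark: traverse the object graph from the roots, collecting live ids.
    static Set<Integer> mark(List<Integer> roots) {
        Set<Integer> live = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            int id = stack.pop();
            if (live.add(id)) stack.addAll(heap.getOrDefault(id, List.of()));
        }
        return live;
    }

    // Sweep: reclaim every object that was not marked.
    static void sweep(Set<Integer> live) {
        heap.keySet().retainAll(live);
    }

    public static void main(String[] args) {
        heap.put(1, List.of(2));         // 1 -> 2, reachable from the root
        heap.put(2, List.of());
        heap.put(3, List.of(4));         // 3 <-> 4 form an unreachable cycle
        heap.put(4, List.of(3));
        sweep(mark(List.of(1)));
        System.out.println(new TreeSet<>(heap.keySet()));  // -> [1, 2]
    }
}
```

Note that the unreachable 3 <-> 4 cycle is reclaimed here, which is exactly the case reference counting cannot handle.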

6. JVM memory allocation strategy and recovery strategy

  1.  Objects are first allocated in the Eden area of ​​the heap
  2. Large objects go directly to the old generation
  3. Long-lived objects go directly to the old generation

For efficient garbage collection, the virtual machine divides heap memory into three areas: the new generation, the old generation, and the permanent generation

New generation:

        The new generation consists of Eden and Survivor spaces (default size ratio 8:1). Eden is the largest area in the new generation and holds newly created objects. When Eden fills up, a Minor GC (new-generation garbage collection) is triggered: objects no longer referenced are reclaimed, surviving objects are copied into a Survivor space, and Eden is then cleared

        The Survivor space is divided into two equal areas, usually called S0 and S1. During a Minor GC, Eden's survivors are copied into one of them, say S0, while the other, S1, is kept empty as the target of the next collection. At the next Minor GC, the survivors in Eden and S0 are moved into S1, and the two areas keep swapping roles. An object's GC age starts accumulating from the moment it is moved into Survivor; once the age exceeds the default threshold of 15, the object is promoted to the old generation

Old generation:

        When the old generation runs short of space, a Major GC/Full GC is triggered, which is more than ten times slower than a Minor GC. A Full GC collects the new generation, the old generation, and the permanent generation (or metaspace)

Permanent generation (replaced by the metaspace in Java 8 and later):

        Used to store class metadata (class structure, method information, field information, the constant pool, etc.). The permanent generation's size must be allocated according to the application's class-loading needs; because that size is usually limited, it overflows easily. Garbage collection of the permanent generation is infrequent and usually happens only during a Full GC pause

Metaspace:

        The metaspace is no longer part of the heap; it is an area of native memory whose size is no longer limited by the permanent generation's fixed capacity. It can grow dynamically as the program requires, up to the native memory limit. When a class loader is no longer referenced, the classes it loaded and their associated metadata are reclaimed

7. Class life cycle

        (To support runtime binding, also called dynamic or late binding, the resolution (parsing) phase may in some cases begin only after initialization)

Load: the JVM finds and loads the binary data of the class

  • Obtain the binary byte stream that defines the class via its fully qualified name (from a ZIP archive, the network, JSP generation, runtime computation, a database, etc.)
  • Convert the static storage structure represented by the byte stream into the runtime data structures of the method area
  • Generate a java.lang.Class object in memory to represent the class, serving as the access entry to the class's data in the method area

After the loading phase completes, the binary byte stream is stored in the method area (permanent generation/metaspace) in the format the JVM requires

Verification: ensure that the byte stream loaded from the class file meets the requirements of the current JVM and does not endanger the virtual machine's own security

File Format Validation:

  • Whether the file starts with the magic number 0xCAFEBABE
  • Whether the major and minor version numbers are within the range this virtual machine can handle
  • Whether any constant in the constant pool has an unsupported type (checked via the constant's tag)
  • Whether any index pointing into the constant pool refers to a nonexistent constant or one of the wrong type
  • Whether any CONSTANT_Utf8_info constant contains data that is not valid UTF-8
  • Whether any part of the class file, or the file as a whole, has information removed or extra information appended

        Only after passing this stage's verification is the byte stream stored in the method area. The next three verification stages all work on the method area's storage structures and no longer manipulate the byte stream directly.

Metadata validation:

  • Whether the class has a parent class (every class except java.lang.Object should)
  • Whether the class's parent inherits from a class that may not be inherited (a final class)
  • If the class is not abstract, whether it implements all the methods required by its parent class or interfaces
  • Whether fields and methods in the class conflict with the parent class (e.g. overriding a final field of the parent, or overloading that does not conform to the specification)

This stage performs semantic checks on the class's metadata to ensure none of it violates the Java language specification

Bytecode verification:

  • Ensure that the operand stack's data types and the instruction sequence always work together (e.g. an int on the stack is never read as a long)
  • Ensure that jump instructions never jump to bytecode instructions outside the method body
  • Ensure that type conversions in the method body are valid (assigning a subclass object to a parent-class type is safe; the reverse is not)

        This is the most complex stage of verification. Its main purpose is to establish, through data-flow and control-flow analysis, that the program's semantics are legal and logical. It verifies and analyzes the class's method bodies to ensure that the verified methods cannot take any action at runtime that endangers virtual machine security

Symbol reference verification:

  • Whether the fully qualified name described by a string in the symbolic reference can be resolved to the corresponding class
  • Whether the classes, fields, and methods in the symbolic reference are accessible to the current class

Prepare:

        This stage allocates memory for the class in the method area and sets class variables to their default initial values (only the class's static variables are allocated here, not instance variables)

Resolve (parse):

        The JVM replaces symbolic references in the constant pool (sets of literals that describe a target, i.e. static placeholders) with direct references (pointers straight to the target in memory, relative offsets, or handles that locate the target indirectly). Resolution mainly covers classes or interfaces, fields, class methods, method types, and so on

Initialization: at this stage the Java code in the class is actually executed, and the class's static variables are assigned their specified initial values
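The difference between the prepare phase (default values) and the initialization phase (assigned values) can be seen in a small sketch; the comments mark which phase does what:

```java
public class InitOrder {
    // Prepare phase: memory for `counter` is allocated in the method area and
    // set to the default value 0.
    // Initialization phase (<clinit>): the assignment below actually runs, setting 42.
    static int counter = 42;

    static {
        // Static initializer blocks also run during initialization, in source order.
        System.out.println("<clinit> runs on first active use of the class");
    }

    public static void main(String[] args) {
        System.out.println(counter);   // -> 42
    }
}
```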

Use: once initialized, the class can be used: creating instances, calling the class's methods, accessing its fields, and so on

Unload:

        When a class is judged useless (all its instances have been reclaimed, the class loader that loaded it has been reclaimed, and its java.lang.Class object is not referenced anywhere), it can be unloaded

8. Common solutions for distributed transactions

       Distributed transactions address the consistency of transactions that span multiple databases, or of multiple transactions spread across several application nodes in a distributed architecture. There are currently two mainstream families of solutions:

  • Strong-consistency solutions based on the XA protocol: the XA transaction model in Atomikos, and Seata's 2PC (from the CAP theorem, guaranteeing strong consistency for distributed transactions inevitably costs performance, so strongly consistent transactions are comparatively slow)
  • Weak-consistency solutions based on the BASE theory: the TCC transaction model, eventual-consistency schemes based on reliable messages, and Seata's Saga transaction model (eventually consistent transactions give up strong consistency of data and reach final consistency through asynchronous compensation, so they perform better and suit high-concurrency scenarios)

Two-phase commit (2PC): (based on XA protocol)

        2PC, i.e. two-phase commit, splits the commit of a distributed transaction into two phases, prepare and commit/rollback: a preparation phase and a commit-execution phase. In the prepare phase, the coordinator must wait for feedback from all participating sub-transactions, which can keep database resources locked for a long time, so 2PC is unsuitable for business scenarios with high concurrency or long-lived sub-transactions. And if the coordinator goes down, the participants never receive a commit or rollback instruction
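The prepare/commit split can be sketched with a toy in-memory coordinator. The `Participant` interface and the voting loop are illustrative only; real 2PC also involves durable logging, timeouts, and rolling back only the participants that actually prepared (here, for simplicity, everyone is told to roll back on failure):

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseCommit {
    public interface Participant {
        boolean prepare();   // phase 1: vote yes/no and lock resources
        void commit();       // phase 2, on unanimous yes
        void rollback();     // phase 2, on any no
    }

    public static boolean runTransaction(List<Participant> participants) {
        // Phase 1: collect votes; any "no" aborts the whole transaction.
        for (Participant p : participants) {
            if (!p.prepare()) {
                participants.forEach(Participant::rollback);  // simplified: roll back everyone
                return false;
            }
        }
        // Phase 2: all voted yes, so everyone commits.
        participants.forEach(Participant::commit);
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Participant ok = new Participant() {
            public boolean prepare() { return true; }
            public void commit()   { log.add("commit"); }
            public void rollback() { log.add("rollback"); }
        };
        System.out.println(runTransaction(List.of(ok, ok)));  // -> true
        System.out.println(log);                              // -> [commit, commit]
    }
}
```

The blocking problem described above is visible in the structure: nothing proceeds until every `prepare()` has returned.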

Three-phase commit (3pc):

        The three phases are CanCommit, PreCommit, and DoCommit. 3PC uses a timeout mechanism to solve 2PC's synchronous blocking problem, preventing resources from being locked forever and further improving the reliability of the whole transaction flow. However, 3PC still cannot fully handle coordinator crashes, though the probability of inconsistency across data sources is smaller.

Transaction Compensation (TCC):

        TCC adopts a compensation mechanism. Its core idea: for every operation, a corresponding confirmation and compensation (undo) operation must be registered. It is divided into three phases: Try, Confirm, Cancel

  • Try phase: attempt execution, complete all business consistency checks, and reserve the necessary business resources.
  • Confirm phase: confirm and commit the business without any further checks; the Try phase has already checked, and Confirm is assumed not to fail.
  • Cancel phase: entered when business execution fails; releases the business resources reserved in the Try phase and rolls back the operations already performed.

        The TCC scheme lets the application control the granularity of its database operations, reducing lock conflicts and improving performance. But it is highly intrusive: the Try, Confirm, and Cancel phases must all be implemented in business logic
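A minimal in-memory sketch of the Try/Confirm/Cancel split for a single account resource. The `Account` class and its method names are my own illustration, not any TCC framework's API; the `frozen` field plays the role of the business resource reserved in the Try phase:

```java
public class TccSketch {
    static class Account {
        int balance;   // funds freely available
        int frozen;    // funds reserved by a Try, awaiting Confirm or Cancel

        // Try: check consistency and reserve the resource.
        boolean tryDebit(int amount) {
            if (balance < amount) return false;   // consistency check fails: nothing reserved
            balance -= amount;
            frozen += amount;
            return true;
        }

        // Confirm: consume the reservation; no re-checking, Try already validated.
        void confirm(int amount) { frozen -= amount; }

        // Cancel: release the reservation made in Try (the compensation).
        void cancel(int amount)  { frozen -= amount; balance += amount; }
    }

    public static void main(String[] args) {
        Account a = new Account();
        a.balance = 100;
        if (a.tryDebit(30)) a.confirm(30);        // happy path: 100 -> 70
        if (!a.tryDebit(200)) { /* Try failed: nothing reserved, nothing to cancel */ }
        System.out.println(a.balance + " " + a.frozen);   // -> 70 0
    }
}
```

A coordinator would call `tryDebit` on all participants first, then `confirm` everywhere on success or `cancel` everywhere on failure.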

Local message table (asynchronously ensured):

        The message producer creates an additional message table and records each message's sending status. The message table and the business data must be committed in the same transaction, which means they must live in the same database. The message is then delivered to the consumer through MQ; if delivery fails, it is retried.
        The message consumer processes the message and completes its own business logic. If its local transaction succeeds, processing has succeeded; if it fails, execution is retried. If the failure is a business failure, the consumer can send a compensation message to the producer, notifying it to roll back or take similar action.
        The producer and consumer periodically scan their local message tables and resend any unprocessed or failed messages. With reliable automatic reconciliation and replenishment logic in place, this scheme is very practical.
        This scheme follows the BASE theory and adopts eventual consistency. Among these schemes it is the best fit for real business scenarios: there is no complex implementation as in 2PC (whose availability becomes very low when the call chain is long), and no confirm/cancel logic to implement as in TCC.
Advantages: a classic implementation that avoids distributed transactions and achieves eventual consistency. Ready-made solutions exist in .NET.
Disadvantage: the message table is coupled to the business system; without a packaged solution, there is a lot of plumbing to deal with.

MQ transaction message:

        Some third-party MQs support transactional messages, RocketMQ for example, using an approach similar to two-phase commit; but several mainstream MQs on the market, such as RabbitMQ and Kafka, do not support them.

        Taking Alibaba's RocketMQ middleware as an example, the idea is roughly as follows:
        In the first phase, a Prepared message is sent and its address is obtained. The second phase executes the local transaction. The third phase uses the address obtained in the first phase to access the message and change its state. In other words, the business method submits two requests to the message queue: one to send the message and one to confirm it. If the confirmation fails to be sent, RocketMQ periodically scans the transaction messages in the cluster; when it finds a Prepared message, it checks back with the message's sender, so the producer must implement a check-back interface. Based on the policy the sender has set, RocketMQ decides whether to roll the message back or continue sending the confirmation. This ensures that message sending succeeds or fails together with the local transaction.
         Unfortunately, RocketMQ does not have a .NET client.
Advantages: achieves eventual consistency without relying on local database transactions.
Disadvantages: difficult to implement; most mainstream MQs do not support it; there is no .NET client; and some of RocketMQ's transactional-message code is not open source.

 Saga mode:

  • Saga is a distributed transaction model based on compensating operations, which splits long-running transactions into a series of small transactions (called Saga steps).
  • Each saga step has its own compensation operation, which undoes the previous operation.
  • If a step fails, the system performs reverse actions to roll back previous steps, or performs compensating actions to fix the problem.

        Suitable for complex, long-running transactions such as order processing, flight booking, etc.

        The Saga model provides greater flexibility, allowing the system to take appropriate action when an error occurs, rather than simply rolling back.

9. Shiro certification process

  • The system calls the Subject's login method, submitting the user's token to the SecurityManager object for authentication
  • The SecurityManager delegates the authentication to the Authenticator object (the authentication manager), which holds an AuthenticationStrategy (the authentication policy)
  • The Authenticator passes the identity information to the Realm (which handles data loading and encapsulation)
  • The Realm queries the database for the user's information, encapsulates it, and returns it to the Authenticator
  • The Authenticator authenticates against the information returned by the Realm (comparing what the user entered with what was found in the database)
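The delegation chain above can be sketched in plain Java. The classes below are simplified stand-ins for the roles (Subject/SecurityManager -> Authenticator -> Realm), not Shiro's actual APIs; the lambda "realm" fakes the database lookup:

```java
import java.util.Map;

public class ShiroFlowSketch {
    // Realm: loads and encapsulates user data (here, from a fake "database").
    interface Realm {
        Map<String, String> loadUser(String username);
    }

    // Authenticator: compares submitted credentials with what the Realm returns.
    static class Authenticator {
        private final Realm realm;
        Authenticator(Realm realm) { this.realm = realm; }
        boolean authenticate(String user, String pass) {
            Map<String, String> info = realm.loadUser(user);
            return info != null && info.get("password").equals(pass);
        }
    }

    // SecurityManager: receives the login call and delegates to the Authenticator.
    static class SecurityManager {
        private final Authenticator authenticator;
        SecurityManager(Authenticator a) { authenticator = a; }
        boolean login(String user, String pass) {
            return authenticator.authenticate(user, pass);
        }
    }

    public static void main(String[] args) {
        Realm dbRealm = user -> "alice".equals(user)
                ? Map.of("password", "secret") : null;   // stand-in for a database query
        SecurityManager sm = new SecurityManager(new Authenticator(dbRealm));
        System.out.println(sm.login("alice", "secret"));  // -> true
        System.out.println(sm.login("alice", "wrong"));   // -> false
    }
}
```

In real Shiro the comparison is typically done by a CredentialsMatcher with hashed passwords rather than plain-text equality.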

10. SpringMVC processing request process

  1. When a user initiates a request, it is first intercepted by the servlet container and handed to the Spring MVC framework
  2. The DispatcherServlet, Spring MVC's front controller, receives the request and forwards it to the handler mapping, HandlerMapping
  3. HandlerMapping parses the request and, based on the request and configuration information, finds the matching Controller; if interceptors are configured, their preHandle methods are executed in order
  4. Once the matching Controller is found, the request parameters are passed to its handler method
  5. When the Controller method finishes, it returns a ModelAndView containing the view name and the model data to pass to the view
  6. The view resolver locates the view by name, fills in the model data, renders it as HTML, and returns it to the client
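The flow above can be mimicked with a minimal stand-in dispatcher. The classes below only model the roles (HandlerMapping as a URL-to-handler map, the view resolver as a name-to-template map); none of this is Spring's API, and the `/greet` handler is invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class MvcFlowSketch {
    // "HandlerMapping": URL -> controller method returning {viewName, modelValue}.
    static Map<String, Function<Map<String, String>, String[]>> mapping = new HashMap<>();

    // "ViewResolver": view name -> template.
    static Map<String, String> views = Map.of("greet", "Hello, %s!");

    // "DispatcherServlet": find the handler, invoke it, resolve and render the view.
    static String dispatch(String url, Map<String, String> params) {
        String[] modelAndView = mapping.get(url).apply(params);           // controller runs
        return String.format(views.get(modelAndView[0]), modelAndView[1]); // view rendered
    }

    public static void main(String[] args) {
        mapping.put("/greet", p -> new String[]{"greet", p.get("name")});  // a "controller"
        System.out.println(dispatch("/greet", Map.of("name", "World")));   // -> Hello, World!
    }
}
```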


Origin: blog.csdn.net/LB_bei/article/details/132608625