Interviews have crushed me a thousand times, yet I still treat every interview like a first love (interview experience share)

I was too lazy to separate the questions by company! Interviewers like to dig into one topic with several follow-up questions, and I usually die on the last one! I have also written up a few questions that I answered badly.

1. Do you understand Spring Boot starters?

In a Spring Boot application, a starter is a set of preconfigured dependencies that you include in your project via a single dependency declaration. The purpose of a Spring Boot starter is to simplify the setup of new projects by providing a common set of dependencies and auto-configuration.
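For example, pulling in web support takes one declaration; a sketch of the Maven coordinates (the version is managed by the Spring Boot parent POM):

```xml
<!-- One starter brings in Spring MVC, Jackson, an embedded Tomcat, etc. -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
```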

2. What is the difference between MySQL's B+ tree index and hash index?

  1. Data distribution: A hash index distributes entries according to the result of a hash function, while a B+ tree stores entries on its leaf nodes in sorted order.

  2. Query efficiency: A hash index offers O(1) lookups for equality queries, but is poor at range queries. A B+ tree is slightly slower than a hash index for a single-record lookup, but it is much friendlier to range queries and can retrieve the required interval of data quickly.

  3. Index maintenance: A hash index must compute the hash function, and hash values have to be recalculated as data is added or deleted, so for frequently updated data its maintenance cost is relatively high. A B+ tree only needs local adjustments to the tree on insert and delete, so its maintenance cost is low.

  4. Storage space: A hash table usually needs spare capacity to keep the number of collisions low, so part of the storage space may be wasted. A B+ tree can flexibly split and merge nodes and uses storage space more efficiently.

3. Is the MD5 algorithm symmetric or asymmetric encryption?

        Introduction: MD5 is a one-way hash function, so strictly speaking it is not an encryption algorithm at all, neither symmetric nor asymmetric. It converts a message of arbitrary length into a fixed-length (128-bit) output, often called a digest or fingerprint. MD5 is irreversible, that is, the original data cannot be recovered from the digest.

        Scenario: During file transfer, you can check whether a file was tampered with in transit by comparing its MD5 value. MD5 has also historically been used to hash stored passwords, although it is no longer considered safe for that purpose.
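A minimal sketch of computing a digest with the JDK's MessageDigest (the input string is arbitrary):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Demo {
    // Compute the MD5 digest of a string and return it as 32 lowercase hex chars.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available in the JDK
        }
    }

    public static void main(String[] args) {
        // Fixed 128-bit output (32 hex chars) regardless of input length.
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```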

4. What can be done when the server times out waiting for a response from the client?

        Introduction: In a TCP connection, after the server sends a FIN and enters the LAST_ACK state, it must wait for the client's ACK to confirm the disconnection. If the client never responds, the server can hold the connection and its resources for a long time, which hurts server performance and reliability.

1. Adjust the TCP timeout time : the server can control the client response timeout time by adjusting the parameters of the TCP protocol stack. For example, the interval time of the TCP keepalive mechanism can be shortened so as to detect connection state abnormalities faster.

2. Set the maximum idle time of the TCP connection : the server can set the maximum idle time of the TCP connection. If the connection is inactive within this time, it will automatically close the connection. This can avoid the situation that the server has been waiting because the client does not respond.

3. Realize the heartbeat mechanism : the server can implement a heartbeat mechanism to periodically send heartbeat messages to the client to detect whether the connection is normal. If the client does not respond for a long time, the server can actively close the connection.

4. Forcibly close the connection : If the client does not respond for a long time, the server can also choose to forcibly close the connection to release connection resources. Although this method will lose untransmitted data, it can ensure that connection resources are released, thereby avoiding affecting server performance and reliability
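Ideas 1 and 2 can be expressed at the application level with standard socket options; a sketch with illustrative values:

```java
import java.net.Socket;
import java.net.SocketException;

public class SocketTimeoutDemo {
    // Configure a socket so reads do not block forever on an unresponsive peer.
    static Socket configure(Socket socket) {
        try {
            socket.setSoTimeout(5_000); // read() throws SocketTimeoutException after 5 s of silence
            socket.setKeepAlive(true);  // let TCP keepalive probes detect a dead peer
            return socket;
        } catch (SocketException e) {
            throw new IllegalStateException(e);
        }
    }

    // Report the options that were actually applied.
    static String describe(Socket socket) {
        try {
            return socket.getSoTimeout() + " " + socket.getKeepAlive();
        } catch (SocketException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Options can be set on an unconnected socket, before connect().
        Socket s = configure(new Socket());
        System.out.println(describe(s));
    }
}
```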

5. Database index and multi-table query

A multi-table query generally includes the following steps:

  1. Determine the tables and fields that need to be queried, and the relationships between them.
  2. Use JOIN to connect different tables according to the relationship.
  3. Use the WHERE clause to filter qualified data.
  4. Optionally sort, group, or otherwise post-process the query results.

The following is a sample query. Suppose we have two tables: a user information table user and an order information table order, associated with each other through the user ID.
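The original sample appears to have been lost; a sketch matching the description below (the column names id, user_id, and name are assumptions):

```sql
-- `order` is a reserved word in MySQL, so it has to be quoted with backticks.
SELECT u.*, o.*
FROM user u
INNER JOIN `order` o ON u.id = o.user_id
WHERE u.name = 'Bob';
```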

 
 

This SQL query first joins the user information table user and the order information table order with an INNER JOIN, associating them on the user ID. It then uses the WHERE clause to filter out the records whose user name is 'Bob', and finally returns all fields in the result set.

6. How does MyBatis prevent SQL injection?

1. MyBatis prevents SQL injection mainly through precompilation. When a parameter is referenced with a #{} placeholder, MyBatis generates a JDBC PreparedStatement: the parameter values are sent to the database separately from the SQL text rather than concatenated into it, so a malicious user cannot change the structure of the original SQL statement by entering special characters. (The ${} syntax, by contrast, splices text directly into the SQL and should be avoided for user input.)

2. At the same time, MyBatis supports dynamic SQL, which allows SQL statements to be assembled differently according to the situation, making queries more flexible. For example, an <if> tag can check whether a parameter is empty and skip the corresponding clause, avoiding unnecessary errors; the values themselves are still bound through #{} placeholders.

3. In general, MyBatis provides several mechanisms relevant to SQL injection: precompiled placeholders, parameter binding, and dynamic SQL. As long as developers use MyBatis according to its conventions, SQL injection attacks can be effectively prevented.
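A sketch of a mapper fragment illustrating the two parameter styles (the statement ids and the User type are hypothetical):

```xml
<!-- #{} -> PreparedStatement placeholder: safe against injection. -->
<select id="findByName" resultType="User">
  SELECT * FROM user WHERE name = #{name}
</select>

<!-- ${} -> raw text splicing: only for trusted input such as a column name. -->
<select id="findSorted" resultType="User">
  SELECT * FROM user ORDER BY ${sortColumn}
</select>
```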

7. Are the leaf nodes of the B+ tree ordered?

Yes, the leaf nodes of a B+ tree are kept in sorted order.

  1. In a B+ tree, all keys are stored on the leaf nodes, and non-leaf nodes store only index (separator) information. The keys inside each internal node are also kept sorted, since they are what guides the search down the tree.

2. However, in order to support range queries, the leaf nodes of the B+ tree must be sorted in order of the size of the keywords. In this way, when it is necessary to search for data within a certain range, the required data can be quickly located by traversing the leaf nodes.

  3. Therefore, the leaf nodes of a B+ tree are ordered. Insertion and deletion may change which keys sit on which leaves, but the B+ tree maintains the sorted order through adjustment operations such as node splitting and merging.

8. How to optimize MySQL if a query is slow

1. Index optimization : Index is an important means to improve query efficiency. If the table in the query statement does not have a suitable index, it will cause MySQL to perform a full table scan, thereby affecting query performance. Therefore, query efficiency can be improved by creating appropriate indexes.

2. SQL statement optimization : The way SQL statements are written also affects query speed. Some common optimizations include avoiding functions in WHERE clauses, avoiding SELECT *, avoiding OR, etc.

3. Analyze and optimize the query plan : MySQL will generate an execution plan based on the SQL statement and decide how to execute the query operation. Therefore, you can analyze the query plan to find out where the query may be slow and optimize it.

4. Optimize the database structure : The table structure design of the database will also directly affect the query efficiency. For example, in the case of a large amount of data, you can divide the data table into multiple tables or use partition tables to reduce query time.

5. MySQL server parameter optimization : MySQL server itself also has many configuration parameters, which can be adjusted according to actual needs to improve query performance. For example, you can increase the cache size, adjust the number of concurrent connections, adjust the temporary table space, etc.
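Point 3 can be tried directly with EXPLAIN; a sketch against a hypothetical user table:

```sql
-- Ask MySQL for the execution plan instead of executing the query.
-- type = ALL / key = NULL in the output indicates a full table scan.
EXPLAIN SELECT * FROM user WHERE name = 'Bob';

-- An appropriate index is usually the fix (the index name is made up):
CREATE INDEX idx_user_name ON user (name);
```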

9. How does the SpringBoot project get beans when it starts?

  1. During the startup process, Spring Boot scans for classes annotated with @Component and related annotations (such as @Service, @Controller, etc.) and generates corresponding bean definitions.

  2. Spring Boot will register all generated bean definitions with the container.

  3. After registering all bean definitions, Spring Boot will perform dependency injection (DI), which is to bind the dependencies between beans. This process is carried out according to the dependency relationship between each Bean.

  4. When a bean needs to be obtained, Spring Boot will find the bean instance from the container and return it to the caller. The specific search method may be search by Bean name, search by type, and so on.

10. Do you know the principle of Spring Bean management?

  1. First , at startup, the Spring container will read all bean definitions based on configuration files or annotation scanning.

  2. Next, the Spring container creates instances of all beans and stores them in internal data structures.

  3. The Spring container will automatically inject each bean according to the dependencies between beans, forming a complete bean dependency graph.

  4. When the application needs a Bean, the Spring container will find and return the corresponding instance from the previously created Bean instance.

  5. When the application is closed, the Spring container will automatically destroy all created Bean instances and release resources.
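The register-then-look-up idea can be illustrated with a toy container (a simplification for intuition, not Spring's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Toy container: bean instances are registered under a name, cached, and
// repeated lookups return the same cached (singleton) instance.
public class ToyContainer {
    private final Map<String, Object> singletons = new HashMap<>();

    public void registerSingleton(String name, Object bean) {
        singletons.put(name, bean);
    }

    public Object getBean(String name) {
        Object bean = singletons.get(name);
        if (bean == null) {
            throw new IllegalStateException("No bean named " + name);
        }
        return bean;
    }

    public static void main(String[] args) {
        ToyContainer ctx = new ToyContainer();
        ctx.registerSingleton("greeting", "hello");
        // Repeated lookups return the same cached instance.
        System.out.println(ctx.getBean("greeting"));
    }
}
```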

11. Where are classes stored after loading?

In Java, after a class is loaded, it is stored in the method area (Method Area). The method area is a memory region used to store class information, constants, static variables, and similar data; the JVM specification describes it as logically part of the heap, though HotSpot has implemented it in native memory (the Metaspace) since JDK 8.

When a Java program is executed, the virtual machine loads the classes to be used into the method area and performs linking and initialization. Linking consists of the three steps of verification, preparation, and resolution, and initialization is the process of executing the class initializer method <clinit>().

In the method area, each class has a unique Class object corresponding to it, which stores information such as attributes and methods of the class. At the same time, the method area also includes information such as runtime constant pools, various types of symbol references, and class loaders.

12. The JVM class loading process

    1. Loading:

In the loading phase, the virtual machine finds and reads the binary data of the class through the class loader (Class Loader), and stores it in the method area. Class loaders load classes from different places in a specific order, such as the local file system, remote servers on the network, and so on. In general, class loaders can be divided into the following categories:

  • Bootstrap Class Loader: loads the Java core library; it is the class loader built into the virtual machine.
  • Extension Class Loader: loads the Java extension libraries.
  • Application Class Loader: loads classes on the application class path.
  • Custom class loaders: developers can implement their own class loaders as needed.

    2. Linking:

In the linking phase, the virtual machine verifies, prepares, and resolves the binary data of the class to ensure that the class can be loaded and executed correctly.

  • Verification: check whether the binary data of the class conforms to the JVM specification, and ensure safety and integrity.
  • Preparation: Allocate memory for the static variables of the class, and set default values.
  • Resolution: Replace the symbolic references in the class with direct references so that subsequent access operations can be performed normally.

   3. Initialization:  

In the initialization phase, the virtual machine executes the <clinit>() method of the class, which performs the assignment of the class's static variables and runs its static code blocks. Until a class is initialized, its static variables hold only the default values set during the preparation step, and its static code blocks have not yet been executed.

In short, the JVM's class loading process reads the binary data of a class and stores it in the method area, then performs verification, preparation, and resolution on it, and finally executes the <clinit>() method for initialization.
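The lazy timing of <clinit>() can be observed directly; a small sketch:

```java
public class ClinitDemo {
    static class Lazy {
        static final String MSG;
        static {
            // Runs exactly once, at the first active use of Lazy.
            System.out.println("<clinit> of Lazy ran");
            MSG = "initialized";
        }
    }

    public static void main(String[] args) {
        System.out.println("before first use");
        // Reading Lazy.MSG triggers loading, linking and initialization of Lazy.
        System.out.println(Lazy.MSG);
    }
}
```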

13. When does an index fail?

  1. Uneven data distribution: If the data distribution in a table is uneven, and some values ​​appear frequently, while others appear infrequently, using an index may actually reduce query efficiency.

  2. Perform function operations on index columns: When performing function operations on index columns, such as using functions and type conversions, the index will become invalid. Because the index is built based on the original column value, the function operation will change the column value, so that the index cannot match the original value.

  3. Fuzzy query of non-prefix index: If the LIKE statement is used for fuzzy query without prefix matching, the index will also fail. Because in this case the entire index tree needs to be scanned, not just part of it.

  4. Multi-table association query: If you perform a multi-table association query and the association conditions are not indexed, the query speed will slow down.

  5. The query matches too large a fraction of the table: If a condition selects a large share of the rows, the optimizer may judge a full table scan to be cheaper than using the index, so the index is effectively ignored.
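Cases 2 and 3 can be seen concretely; a sketch assuming a user table with an index on the name column:

```sql
SELECT * FROM user WHERE name = 'Bob';          -- can use the index
SELECT * FROM user WHERE UPPER(name) = 'BOB';   -- function on the column: index not used
SELECT * FROM user WHERE name LIKE '%ob';       -- leading wildcard: index not used
SELECT * FROM user WHERE name LIKE 'Bo%';       -- prefix match: index can be used
```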

14. Why use MD5 (I have only ever used MD5...)

  1. Compressibility: After the input data of any length is calculated by the hash algorithm, the output result is a fixed-length 128-bit binary number.

  2. Irreversibility: The original data is very difficult to recover from the MD5 output through inverse operations, so MD5 has been widely used in scenarios such as data verification and digital signatures. (Note, however, that practical collision attacks against MD5 now exist, so it should not be relied on where collision resistance matters.)

  3. Rapidity: The calculation speed of the MD5 algorithm is very fast and is suitable for processing large amounts of data.

15. Talk about the seven principles of design patterns (damn, I never paid attention to this)

  1. Single Responsibility Principle (SRP): A class is only responsible for completing one responsibility or function, avoiding the problems of functional coupling and increased complexity.

  2. Open Closed Principle (OCP): Software entities should be open for extension and closed for modification. That is, without modifying the original code, new functions are realized by extending the existing code.

  3. Liskov Substitution Principle (LSP): The subclass must be able to replace the parent class and exhibit the expected behavior, that is, the subclass cannot destroy the functionality of the parent class.

  4. Dependency Inversion Principle (DIP): High-level modules should not depend on low-level modules, both should depend on abstractions; abstractions should not depend on concrete implementations, and concrete implementations should depend on abstractions.

  5. Interface Segregation Principle (ISP): The client should not be forced to rely on interfaces that it does not need, that is, the interface should be small and specialized to avoid "fat interfaces".

  6. Composite Reuse Principle (CRP): Prefer object composition and aggregation over inheritance for code reuse, because inheritance couples classes tightly and is less flexible.

  7. Law of Demeter (LoD): Also known as the principle of least knowledge: an object should know as little as possible about other objects, that is, talk only to its direct friends, keeping the design low-coupling and high-cohesion.
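Principles 2 and 4 can be sketched in a few lines (all names here are made up for illustration):

```java
// Dependency Inversion: the high-level ReportService depends on the Printer
// abstraction, not on any concrete printer class.
interface Printer {
    String print(String text);
}

class ConsolePrinter implements Printer {
    public String print(String text) { return "console: " + text; }
}

class ReportService {
    private final Printer printer;                 // depends on the abstraction
    ReportService(Printer printer) { this.printer = printer; }
    String report() { return printer.print("report"); }
}

public class DipDemo {
    public static void main(String[] args) {
        // Swapping Printer implementations needs no change to ReportService
        // (which is also the Open Closed Principle at work).
        System.out.println(new ReportService(new ConsolePrinter()).report());
    }
}
```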

16. Can you talk about zombie processes? (Answering is equivalent to not answering, I really don’t know much)

1. When a process terminates, the kernel keeps a small record of it (its PID and exit status) so that the parent process can retrieve that status. Until the parent does so, the terminated child remains a "zombie process".

2. Zombie processes occupy limited resources in the system (such as process ID, program counter, etc.). If there are a large number of zombie processes, it will lead to waste of system resources and performance degradation.

3. The way to avoid zombie processes is to let the parent process handle the exit status of the child process in a timely manner. You can use functions such as wait() or waitpid() to wait for the exit of the child process and obtain its exit status. Register a handler for the SIGCHLD signal in the parent process. When the child process exits, the handler will be called, and the parent process can obtain the exit status of the child process in the signal handler.

4. In addition, you can also ignore the SIGCHLD signal of the child process by setting the signal processing method of the process to SIG_IGN, so that the kernel automatically reclaims the resources of the child process and avoids the generation of zombie processes.

17. How do processes communicate with each other? Can you talk about the differences in detail? (shouldn't IPC mechanisms be something you know cold?)

  1. Pipe: A pipe is a one-way communication mechanism for communication between parent and child processes. A half-duplex pipe is created with the pipe() system call, and the parent and child communicate through the pipe's write end and read end respectively.

  2. Named pipe (FIFO): A named pipe is also a one-way communication mechanism, but it can be used between unrelated processes. Unlike an anonymous pipe, a named pipe has its own file name and file descriptor and is created with the mkfifo() system call.

  3. Shared memory: Shared memory is an efficient mechanism that lets multiple processes directly access the same physical memory, avoiding the overhead of copying data. It relies on APIs provided by the operating system, such as shmget(), shmat(), etc.

  4. Semaphore: A semaphore is a mechanism for process synchronization and mutual exclusion, ensuring ordered and bounded access by multiple processes to shared resources. It relies on operating system APIs such as semget(), semop(), and so on.

  5. Message queue: A message queue is a message-passing mechanism for communication between different processes. It relies on operating system APIs such as msgget(), msgsnd(), msgrcv(), etc.

18. HTTP is a stateless protocol: the connection is released after each request completes, yet we want the server to remember the client's login state. What can be done? (I answered cookies and session)

  1. Cookie: The server sends a cookie information to the client through the Set-Cookie header. In subsequent requests, the client will automatically bring the information back to the server in the Cookie header. The server can use cookies to retain information such as the client's login status.

  2. Session: The server creates a session for each client, assigns it a unique SessionID, and puts the SessionID in a cookie and returns it to the client. In subsequent requests, the client will bring back the SessionID in the cookie to the server, and the server will find the corresponding session information accordingly.

  3. Token: The client submits authentication information such as user name and password, and the server generates a Token after passing the verification and returns it to the client. The client adds the Authorization field to the HTTP header in subsequent requests, and returns the Token to the server. Based on this, the server determines the identity of the client.

  4. URL rewriting: If the browser does not support cookies, session tracking can fall back to URL rewriting: the server embeds the session ID in the URL itself (Servlet containers append ;jsessionid=... to the path), so the session ID travels with each request.
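A sketch of the cookie round trip from points 1 and 2 (the JSESSIONID value is made up):

```
HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=abc123; Path=/; HttpOnly

GET /orders HTTP/1.1
Host: example.com
Cookie: JSESSIONID=abc123
```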

19. What is the difference between synchronized and lock

  1. Scope of use: synchronized can modify methods as well as wrap code blocks, while a Lock is used programmatically to guard a code block.

  2. Lock release: synchronized releases the lock automatically when the method or block exits, while with Lock you must explicitly call unlock() (conventionally in a finally block).

  3. Granularity: synchronized locks an entire method or code block, while Lock allows more flexible locking, for example locking only a part of a data structure.

  4. Reentrancy: synchronized is a reentrant lock, that is, the same thread can acquire the same lock multiple times without being blocked. Lock (ReentrantLock) is also reentrant, but care is needed to avoid deadlock problems.

  5. Performance: In early JDKs, synchronized often performed worse than Lock because it always fell back to operating-system-level mutexes. Since the lock optimizations in Java 6 (biased and lightweight locking), synchronized is equivalent to or even better than Lock in most cases.

  6. Interruptible waiting: Lock provides the tryLock() method, which attempts to acquire the lock, optionally waiting up to a specified time, as well as lockInterruptibly(), which allows the thread to be interrupted while waiting for the lock.
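Point 2, and the finally convention, in code (a minimal sketch):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class CounterDemo {
    private final Lock lock = new ReentrantLock();
    private int count = 0;

    // With Lock, unlock() must be called explicitly, conventionally in finally.
    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }

    // With synchronized, the lock is released automatically on exit.
    synchronized void incrementSync() {
        count++;
    }

    int get() { return count; }

    public static void main(String[] args) {
        CounterDemo c = new CounterDemo();
        c.increment();
        c.incrementSync();
        System.out.println(c.get()); // 2
    }
}
```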

20. How to use ArrayList in a multi-threaded environment? (if you don't know this one: ArrayList is not thread-safe, so reach for the related concurrent collection classes)

1. Use the Collections.synchronizedList() method to wrap the ArrayList in a thread-safe List. The method returns a synchronized wrapper that serializes access to the underlying list (iteration still requires manual synchronization on the wrapper).

2. Alternatively, use a concurrent collection class such as CopyOnWriteArrayList, which is well suited to read-heavy, write-light scenarios.
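Both options in code (a minimal sketch):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SafeListDemo {
    public static void main(String[] args) {
        // Wrap an ArrayList so every method call is synchronized on the wrapper.
        List<String> syncList = Collections.synchronizedList(new ArrayList<>());
        syncList.add("a");

        // Iteration still needs manual synchronization on the wrapper object.
        synchronized (syncList) {
            for (String s : syncList) {
                System.out.println(s);
            }
        }

        // Alternative: a concurrent collection, good for read-heavy workloads.
        List<String> cow = new CopyOnWriteArrayList<>();
        cow.add("b");
        System.out.println(cow.get(0)); // b
    }
}
```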

21. What life cycle does an object go through from new to being garbage-collected? (I didn't answer this well)

  1. Creation phase: When the new operator is used to create an object, the JVM allocates a contiguous space in the heap to store the object and the member variables it contains.

  2. Initialization phase: After the memory space is allocated, the JVM will initialize this space. During this process, each member variable contained in the object will be assigned an initial value, for example, the member variable of int type is 0 by default, and the member variable of reference type is null by default.

  3. Use phase: When the object is initialized, it can be used. Objects can be passed to other methods or assigned to other variables, and applications can manipulate objects by calling methods on them. In this phase, objects are used and modified.

  4. Referenced phase: If the object is still referenced by other variables, the object remains referenced until all variables that refer to it become invalid.

  5. Garbage collection phase: If the object is no longer referenced, the JVM's garbage collector reclaims the memory occupied by the object. The garbage collector will run automatically at a certain point in time, detect which objects are no longer referenced, and release the memory space they occupy

22. What kinds of binary trees are there? What is the time complexity of a binary search tree query? The worst case? (I forgot it degenerates into a linked list)

1. A binary tree is a tree-shaped data structure in which each node has at most two child nodes, called the left child node and the right child node. Common binary trees include: ordinary binary tree (Binary Tree), full binary tree (Full Binary Tree), complete binary tree (Complete Binary Tree), etc.

2. Binary Search Tree (BST) is a special binary tree, the value of all nodes in its left subtree is less than the value of the root node, and the value of all nodes in the right subtree is greater than the root node value. This rule guarantees the efficiency of the binary search tree in search, insertion and deletion operations.

3. The time complexity of the search operation in the binary search tree is O(log n) , where n represents the number of nodes in the binary search tree. This is because each search will reduce the search interval by half, so the time complexity of the search is related to the height of the tree. In the worst case, that is, when the binary search tree degenerates into a linked list, the time complexity of the search is O(n).

4. It should be noted that when inserting and deleting operations, the binary search tree may be unbalanced, which will increase the height of the tree and affect the time complexity of the search. Therefore, in practical applications, it is necessary to use some special binary search trees, such as red-black trees, AVL trees, etc., to ensure the balance of binary search trees, thereby improving the efficiency of search operations
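A minimal BST sketch that shows why the search cost tracks the height of the tree:

```java
public class BstDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard BST insert: smaller keys go left, larger keys go right.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Each step descends one level, so the cost is O(height):
    // O(log n) when balanced, O(n) when degenerated into a linked list.
    static boolean contains(Node root, int key) {
        while (root != null) {
            if (key == root.key) return true;
            root = key < root.key ? root.left : root.right;
        }
        return false;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[]{5, 3, 8, 1, 4}) root = insert(root, k);
        System.out.println(contains(root, 4) + " " + contains(root, 7)); // true false
    }
}
```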

23. Talk about the Spring container. How do you understand IOC?

The Spring container is the core of the Spring framework: it is responsible for creating, configuring, assembling, and managing the life cycle of beans.

IOC (Inversion of Control) means that control over object creation and wiring is inverted, handed from the application code to the container:

  1. Traditionally an object creates its own dependencies with new; under IOC, the container reads the bean definitions (from XML configuration or annotations) and instantiates the beans itself.

  2. DI (Dependency Injection) is the main way IOC is realized: the container injects each bean's dependencies through constructors, setters, or fields (for example via @Autowired).

  3. The benefit is loose coupling: classes depend on abstractions rather than concrete implementations, implementations can be swapped through configuration without touching the callers, and unit testing becomes much easier.

24. The size of the TCP header? What are the contents of the TCP header?

The TCP header is at least 20 bytes (and at most 60 bytes when options are present); the fields vary in size.

The TCP header includes the following information:

  • Source Port: The port number used by the sender.
  • Destination Port: The port number used by the receiving end.
  • Sequence Number: The sequence number of the first data byte carried in this segment.
  • Acknowledgment Number: The sequence number of the next byte expected from the peer.
  • Header Length: Indicates the length of the TCP header.
  • Reserved: The field is temporarily reserved and may be used in the future.
  • Control bits (Flags): Control bits such as SYN, ACK, and FIN are used to establish and terminate connections and perform error handling.
  • Window Size: The amount of data that the receiver can receive.
  • Checksum (Checksum): Used to detect whether there is an error in the TCP header and data during transmission.
  • Urgent Pointer: Used to indicate the location of urgent data.
  • Options: Occupies a variable number of bytes and contains some optional TCP features.

25. How to locate the memory leak?

  1. Use tools for detection: You can use some specialized tools to detect memory leaks, such as the Profiler tool in Java, Eclipse MAT (Memory Analysis Tool), etc. These tools help developers find memory leaks in their applications and provide detailed analysis and reporting.

  2. Code review: By carefully reviewing the code, look for places that may cause memory leaks, such as unclosed files, unreleased database connections, unreleased locks, etc. In addition, special attention needs to be paid to situations such as circular references and static variables, which are often one of the important reasons for memory leaks.

  3. Analyze the heap dump file: When the program has a memory leak, you can use the HeapDump tool (such as jmap) provided by the JVM to generate a heap dump file. The heap dump file can then be analyzed using MAT or other similar tools to find the cause of the memory leak.

  4. Code injection: Inject code that records stack and object information in the code to track down the source of memory leaks. This method can be used in non-real-time systems or test environments, but it needs to be used with caution to avoid affecting code performance

26. How do you ensure that Redis hotspot data does not expire?

  1. Set an appropriate expiration time: For frequently accessed hotspot data you can set a long TTL, refresh the TTL on each access, or remove the expiration entirely so the keys do not expire. When choosing, weigh the business requirements against memory capacity to avoid excessive memory usage or harming system performance.

  2. Use Redis persistence function: Redis supports multiple persistence methods, including RDB and AOF. When using these persistence methods, hotspot data can be written to disk periodically to ensure that even if the Redis service crashes or restarts, the data will not be lost. Through the persistence function, data reliability and persistence can be achieved, and the risk of data loss can be reduced.

In addition to the above two methods, techniques such as Redis Cluster and master-slave replication can be combined to achieve high availability and reliability for hotspot data. For example, Redis Cluster can distribute hotspot data across multiple nodes to improve load balancing and availability, and master-slave replication can provide backups of hotspot data and read-write separation, improving the system's concurrency and stability.
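Point 1 in commands (the key name is made up; SET ... EX, EXPIRE, and PERSIST are standard Redis commands):

```
SET hot:item:42 "value" EX 3600   # write with a 1-hour TTL
EXPIRE hot:item:42 3600           # refresh the TTL on each access
PERSIST hot:item:42               # drop the TTL so the key never expires
```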

27. Several ways of creating objects in Java

  1. Objects are created using the keyword new.

  2. Use the reflection method Class.newInstance(). Calling newInstance() on a class's Class instance dynamically creates an object of that class, provided the class has a no-argument constructor. (This method has been deprecated since Java 9 in favor of Constructor.newInstance().)

  3. Use the newInstance() method of the Constructor object. By obtaining the Constructor object of the class, and then calling the newInstance() method of the object, an object of the class can also be dynamically created.

  4. Use the clone() method. In Java, every object has a clone() method that can be used to create a copy of the object.

  5. Deserialize. By using the serialization and deserialization mechanisms in Java, an object can be written to a file or transmitted over the network, and then read from the file or network to obtain a new object
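Several of the ways above in one sketch (deserialization is only mentioned in a comment, since it needs a serialized byte stream):

```java
import java.lang.reflect.Constructor;

public class CreateDemo implements Cloneable {
    public CreateDemo() {}                     // public no-arg constructor for reflection

    @Override
    public CreateDemo clone() {                // 4. clone(): field-by-field copy
        try {
            return (CreateDemo) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);       // cannot happen: we implement Cloneable
        }
    }

    public static void main(String[] args) throws Exception {
        CreateDemo a = new CreateDemo();                        // 1. new
        Constructor<CreateDemo> ctor = CreateDemo.class.getConstructor();
        CreateDemo b = ctor.newInstance();                      // 3. Constructor.newInstance()
        CreateDemo c = a.clone();                               // 4. clone()
        // 2. Class.newInstance() works the same way but is deprecated since Java 9.
        // 5. Deserialization would read an object back via ObjectInputStream.readObject().
        System.out.println(a != b && b != c);                   // three distinct objects
    }
}
```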

28. How to implement dynamic proxy

Interface-based dynamic proxy: implemented through Java's own Proxy class and InvocationHandler interface. Proceed as follows:

  1. Define an interface and a class that implements the interface;
  2. Implement the InvocationHandler interface, and rewrite the invoke() method, and process the method call of the target object in this method;
  3. Create a proxy object through the Proxy.newProxyInstance() method.
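The three steps above can be sketched as follows. `Greeter` and the "append `!`" enhancement are made-up examples; any interface and any before/after logic work the same way:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

interface Greeter {
    String greet(String name);
}

class GreeterImpl implements Greeter {
    public String greet(String name) { return "Hello, " + name; }
}

public class ProxyDemo {
    // Wrap any Greeter in a JDK dynamic proxy that enhances the return value.
    static Greeter wrap(Greeter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            // logic before the call could go here
            Object result = method.invoke(target, args); // call the target method
            // logic after the call: enhance the return value
            return result + "!";
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        System.out.println(wrap(new GreeterImpl()).greet("world")); // Hello, world!
    }
}
```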

Class-based dynamic proxy: implemented with CGLIB (Code Generation Library). The steps are as follows:

  1. Add dependency of cglib library;

  2. Define a target object (that is, the class to be proxied);

  3. Create an implementation class of the MethodInterceptor interface, and rewrite the intercept() method, in which the method call of the target object is processed;

  4. Create a proxy object through the Enhancer.create() method.

29. How to parse a POST request message? (incomplete answer)

In Java, frameworks such as Servlet API and Spring Framework can be used to parse HTTP POST request messages.

For the Servlet API, the parameters and content in the HTTP POST request message can be obtained through the HttpServletRequest object. Specific steps are as follows:

  1. Get the HttpServletRequest object.
  2. Call the request.getParameter() method to get the request parameter value.
  3. Call the request.getReader() method to obtain the character input stream of the request message content.
  4. Use a BufferedReader to read the contents of a character input stream.
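Steps 3 and 4 above can be sketched without a servlet container: the loop below is the standard way to drain the reader, but since a real `HttpServletRequest` only exists inside a container, the request body here is simulated with a StringReader:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class BodyReader {
    // Reads the whole body from a reader, as one would from request.getReader().
    static String readBody(BufferedReader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a servlet this reader would come from request.getReader().
        BufferedReader simulated = new BufferedReader(new StringReader("{\"id\": 1}"));
        System.out.println(readBody(simulated)); // {"id": 1}
    }
}
```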

For Spring Framework, Spring MVC provides a series of built-in parameter resolvers (Parameter Resolver), including annotations such as RequestBody and ModelAttribute, which can easily map parameters or content in HTTP POST request messages into Java objects. The specific method is as follows:

  1. Annotate the parameter of the controller method with @RequestBody or @ModelAttribute.

  2. Spring will automatically parse the parameters or content in the POST request and convert them into the type corresponding to the annotation.

30. How to check if the sql query is very slow

  1. Confirm that the query statement is correct: check the SQL statement itself, including table names, column names, and SQL syntax.

  2. Check whether the index is reasonable: index is an important means to improve query efficiency. You can use the EXPLAIN command to analyze query statements to understand how MySQL handles query statements, so as to determine whether to increase or optimize indexes.

  3. Optimize database parameters: Adjust some key parameters of MySQL, such as cache size, connection pool size, query cache, etc., to improve query efficiency. You can use the SHOW VARIABLES command to view the current parameter settings.

  4. Analyze slow query logs: By enabling the slow query log function and analyzing log files, you can find out query statements and related information whose execution time exceeds the threshold (10 seconds by default), and analyze and optimize them.

  5. Use the profiling function: enable it with SET profiling = 1, then use SHOW PROFILES and SHOW PROFILE to see how much time each stage of a query's execution took (in newer MySQL versions the Performance Schema is the recommended replacement).

31. MySQL's master-slave replication process

MySQL master-slave replication is a common high-availability deployment scheme that synchronizes the data of one MySQL server (the master node) to one or more backup servers (the slave nodes). Master-slave replication is usually achieved through the following steps:

  • First, enable the binary log (binlog) on the master node; it records all changes made to the databases on the master.
  • Then, configure a replication account and replication settings on each slave node so it can connect to the master. This is usually done with the CHANGE MASTER TO statement.
  • Once a slave node connects to the master and starts replicating, it first obtains the current binlog file name and position (the "replication point") from the master.
  • The slave node then continuously reads the master's binlog and applies the operations in it to its own database. Internally, an I/O thread on the slave writes the received events to a local relay log, and a SQL thread replays them; this step is often called "replaying events".

32. The difference between clustered index and non-clustered index

  • Clustered index: the table's rows are stored physically in the order of the index key, so rows with adjacent key values are stored together. A table can therefore have only one clustered index, and it is usually the primary key index. A clustered index speeds up queries, but inserting, deleting, and updating data can cause data movement and affect performance.
  • Non-clustered index: the table data is not reordered; instead, the index stores the key values together with pointers to the corresponding rows in a separate structure. A table can therefore have multiple non-clustered indexes, typically implemented as ordinary or unique indexes. Non-clustered indexes also speed up queries, but they require extra space to store the index data.

33. How much do you know about asymmetric encryption?

Asymmetric Encryption is an encryption method that uses a key pair: a public key and a private key. The public key is used to encrypt data and the private key to decrypt it. Asymmetric encryption is widely used in digital signatures, the TLS/SSL secure transport protocol, SSH remote login, and similar scenarios. Its advantage is that it provides confidentiality and authenticity of communication; its disadvantage is that encryption and decryption are slow, so it is usually used only to encrypt small amounts of data (for example, a symmetric session key).

34. MVCC, specific process example

  • When a transaction begins, MySQL creates a unique transaction ID for the transaction.
  • When performing INSERT, UPDATE, DELETE and other operations, MySQL does not overwrite the original data directly; instead it creates a new version of the row and records the transaction ID that created it (the version number), keeping old versions reachable through the undo log.
  • When other transactions read data, MySQL selects the appropriate version to read according to the transaction's isolation level and the version numbers. Readers are not blocked by writers; they simply see an older committed version.
  • When a transaction commits, MySQL makes all of its changes permanent and releases its locks. Old row versions that are no longer visible to any transaction are removed later by the purge process.

36. How does Nginx view the backend Tomcat status

In Nginx, the upstream module is used to manage the back-end Tomcat servers. By configuring health-check parameters on the upstream, the status of each backend server can be checked. Specifically, a block like the following can be added to the configuration file:
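A sketch of such an upstream block. Note that the `check` directives described here are not part of stock open-source Nginx; they come from the third-party nginx_upstream_check_module (bundled with Tengine), which this answer appears to assume. The server addresses are placeholders:

```nginx
upstream tomcat_backend {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;

    # Health-check directives from the third-party nginx_upstream_check_module:
    # probe every 3s; 2 consecutive successes -> healthy, 3 failures -> down
    check interval=3000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "GET / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}
```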

Here the check directive sets the health-check parameters: interval is how often (in milliseconds) to probe, rise is the number of consecutive successful checks before a server is considered "healthy", fall is the number of consecutive failed checks before it is considered "down", timeout is the probe timeout, and type is the probe protocol (http here).

We can view the status of the Tomcat server by visiting the /status page of the Nginx server. If the status is up, it means that the server is "healthy"; if the status is down, it means that the server has been marked as "down".

37. How are requests allocated? (round robin, least connections, priority, etc.)

  1. In a distributed system, the distribution of requests is usually implemented using load balancing techniques. Common load balancing strategies include Round Robin, Least Connections, IP Hash, etc.
  • Polling strategy: forward the request to each server in turn according to the order of the server list;
  • Minimum number of requests strategy: forward the request to the server with the least number of current connections;
  • IP hash strategy: calculate a hash value based on the source IP address of the request, and forward the request to the corresponding server
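The round-robin strategy above can be sketched in a few lines; the server addresses are made up, and a production balancer would also handle weights and health checks:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin picker: forwards requests to servers in list order, wrapping around.
public class RoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobin(List<String> servers) { this.servers = servers; }

    public String pick() {
        // floorMod keeps the index valid even after the counter overflows
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        RoundRobin rr = new RoundRobin(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        for (int k = 0; k < 4; k++) System.out.println(rr.pick());
        // 10.0.0.1, 10.0.0.2, 10.0.0.3, then wraps back to 10.0.0.1
    }
}
```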

38. What do 32-bit and 64-bit specifically refer to;

32-bit and 64-bit mainly refer to the CPU's addressing capability, that is, the maximum amount of memory it can address. A 32-bit CPU can address 2^32 bytes, which is 4 GB. A 64-bit CPU can in principle address 2^64 bytes (16 EB), far more than any current machine actually installs; in practice, current processors implement fewer physical address bits.
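The 2^32 = 4 GB arithmetic can be checked directly (2^64 does not fit in a signed long, so it is only shown approximately):

```java
public class Addressing {
    public static void main(String[] args) {
        long bytes32 = 1L << 32; // 2^32 addressable bytes
        System.out.println(bytes32 / (1024L * 1024 * 1024) + " GB"); // 4 GB
        // 2^64 = 2^32 * 2^32 bytes, roughly 1.8e19 bytes (about 16 EB)
        System.out.println(Math.pow(2, 64) + " bytes");
    }
}
```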

39. Talk about the use of Redisson in combination with distributed lock projects

Distributed lock is a mechanism adopted in order to coordinate the use of resources in a distributed environment in a multi-node system. In a distributed system, due to the possible delay and loss of communication between nodes, the traditional lock mechanism is difficult to meet the concurrency requirements in a distributed environment. Therefore, we need distributed locks to ensure correct synchronization when multiple nodes access shared resources at the same time.

Redisson is a Java client for Redis that provides the ability to implement distributed locks. Specifically, Redisson supports the following types of distributed locks:

  • Reentrant lock (RLock)
  • Fair lock (FairLock)
  • Interlock (MultiLock)
  • Red lock (RedLock)
  • Read-write lock (ReadWriteLock)

Among them, reentrant locks and fair locks are similar to ReentrantLock and FairLock in Java, and are used to implement reentrant and fair mutual exclusion locks in a distributed environment. Interlocking can bind multiple locks together, so as to realize the requirement of acquiring multiple locks at one time. Red lock is an advanced solution for implementing distributed locks between different nodes, which can ensure that locks can still be obtained when most nodes are working normally. Finally, the read-write lock is similar to ReadWriteLock in Java, which is used to realize the concurrency control of read-write separation.

In Redisson, we acquire a lock by calling the lock() method of an RLock object and release it by calling unlock(). Redisson also provides richer variants: for example, a lock can be given a lease (expiration) time via lock(leaseTime, unit) or tryLock(waitTime, leaseTime, unit), which prevents a crashed holder from blocking others forever. In addition, Redisson provides advanced features such as asynchronous execution, delayed queues, and rate limiters (RRateLimiter).

40.redis master-slave replication,

1. Redis master-slave replication refers to the mechanism by which the master node synchronizes its own data to the slave nodes, to achieve data backup and read load balancing.

2. When the master node receives a write request, it will first update the data into its own memory, then send it to the slave node through the network, and record the offset of the sent information locally.

3. After receiving the data, the slave node updates it into its own memory and records locally the offset of the data it has received.

4. When the connection between a slave and the master is interrupted (master failure or unstable network), the slave can reconnect and send its locally recorded offset (via the PSYNC command), so that only the missing portion of the replication stream is re-sent (partial resynchronization) instead of a full copy, avoiding data loss and a costly full sync.

41. The implementation mechanism of Redis atomicity (single-threaded; didn't answer this well)

The atomicity of Redis is realized through single-threaded execution and transaction mechanism. The main steps are as follows:

  1. The client sends the MULTI command to start a new transaction and pack multiple operations into a queue.

  2. Redis buffers these operations and returns a "QUEUED" response.

  3. The client continues to send other commands, which Redis puts in the queue.

  4. The client sends the EXEC command, which means submitting the entire transaction.

  5. Redis executes all the commands in the queue in turn. Note that Redis transactions do not support rollback: if a command fails at runtime, the remaining commands are still executed; only errors detected while queuing (such as syntax errors) cause EXEC to abort the whole transaction.

  6. Redis packs all the results into an array and returns it to the client.

During the transaction, the client can also use the WATCH command to monitor one or more keys. When these keys are modified during the transaction, the transaction will be terminated and an error response will be returned. At the same time, you can also use the DISCARD command to cancel a transaction.

42. The leftmost prefix, clustered index, non-clustered index, query back to the table, etc.

Leftmost prefix:

        The leftmost prefix rule means that a query can use a composite index only if its conditions match the index columns contiguously starting from the leftmost one; matching stops at the first index column missing from the conditions (or at the first range condition). Records satisfying the conditions can then be located efficiently through the index.

Application scenario: when query conditions involve multiple columns, put the most frequently filtered columns leftmost in a composite index so that queries can use the prefix and improve efficiency.


Clustered index:

        A clustered index is a data storage method in which the table's rows are stored on disk in the order of the index key. A table can have only one clustered index, because the rows can be physically sorted in only one order.

Application scenario: It is suitable for the situation that often needs to be sorted or grouped by a certain column, which can greatly improve the query speed.


Non-clustered index:

        A non-clustered index stores the index keys and the corresponding row pointers in a structure separate from the table data. A table can have multiple non-clustered indexes, because each index is sorted and searched independently.

Application scenario: It is suitable for the situation that often needs to use specific columns for query, which can improve the query speed


Return table query:

        A back-to-table query means that after the index columns are retrieved through a non-clustered index, the complete data row must be fetched again through the row pointer (or primary key). This adds extra IO operations and processing time.

Application scenario: when a queried column is not among the index columns, a back-to-table lookup is required, which adds extra IO and processing time. Therefore, when designing an index, try to make it cover all the columns a query needs (a covering index avoids the extra lookup).


43. What is saved on the registration center? (nacos as the registration center)

  1. The registration center is usually a component in the microservice architecture, which is used to manage and coordinate various microservice instances. It usually saves some of the following information:
  • The network location information of the microservice instance (such as IP address, port number, etc.)
  • Metadata information of microservice instances (such as service name, version number, environment information, etc.)
  • Health status information of microservice instances (such as survival status, load status, etc.)
  • Routing information of microservice instances (such as routing rules, load balancing policies, etc.)

44. Are L1, L2, and L3 caches shared by all CPU cores? Since L1 and L2 caches are private to different cores, how do different cores L1 and L2 caches communicate?

L1, L2, and L3 are all CPU caches, but they differ in how they are shared across cores.

1. The L1 cache is private to each CPU core, which means that if a CPU has multiple cores, there will be multiple L1 caches. The L2 cache is also usually private to each core, but in some architectures, it may also be designed so that multiple cores share an L2 cache. Finally, the L3 cache is shared across the entire CPU chip (some architectures may have multiple L3 caches), so all CPU cores can access the same L3 cache.

2. When different cores need to access each other's caches, a cache coherence protocol is required. These protocols ensure the correctness of access to data in main memory between different cores. In a cache coherence protocol, when a core modifies data in shared memory, it sends a signal to the other cores telling them that their caches need to be updated.

3. Specifically, coherence traffic between the cores' private L1/L2 caches travels over the chip's interconnect: a shared bus with snooping in simpler designs, or a ring/mesh network-on-chip with a directory in larger ones. Coherence protocols such as MESI keep each cached copy of a line consistent across cores, avoiding data conflicts and race conditions.

45. What is CAS? How does the Unsafe class implement CAS at the operating system level? Will the CAS keep spinning?

1. CAS (Compare and Swap) is an optimistic locking technology used to achieve lock-free synchronization in a multi-threaded environment. A CAS operation has three operands: memory location V, expected value A, and new value B. If the value of memory location V is equal to the expected value A, update the value of memory location V to the new value B; otherwise, do nothing.

2. In Java, CAS can be operated through the Atomic class or Unsafe class provided by the java.util.concurrent.atomic package. The Unsafe class is a tool class similar to C language pointers, which provides a set of low-level, unsafe operation methods, including CAS operations. At the operating system level, CAS is an atomic operation instruction supported by the CPU, such as the CMPXCHG instruction on the x86 architecture.

3. When the Unsafe.compareAndSwapInt() method is called, the current value will be read from the memory address first, and then compared with the expected old value. If they are equal, write the new value into this memory address, and return true to indicate that the modification is successful. Otherwise, return false, indicating that the modification failed. Since CAS is a busy waiting method, if many threads try to modify the same memory location at the same time, it will cause a large number of spin operations to waste CPU resources.

4. To mitigate the performance problems caused by spinning, the JVM uses adaptive spinning, dynamically adjusting the number of spins based on recent history and the state of the lock holder. In addition, JDK 8 added constructs such as LongAdder, which stripes contended updates across multiple cells so that heavily contended counters no longer all CAS on a single memory location, significantly improving throughput under contention.
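The compare-and-swap semantics described above can be seen directly through AtomicInteger (which delegates to Unsafe/VarHandle CAS internally):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(1);
        // Succeeds: the current value matches the expected value 1, so 2 is written
        boolean first = v.compareAndSet(1, 2);
        // Fails: the value is now 2, not the expected 1, so nothing is written
        boolean second = v.compareAndSet(1, 3);
        System.out.println(first + " " + second + " " + v.get()); // true false 2
    }
}
```

In real code the failing branch is usually wrapped in a retry loop (the "spin" the answer refers to): read the current value, compute the new one, and CAS until it succeeds.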

46. ​​What should I do if the data packet is lost in the second TCP handshake? What is the state of the client and server at this time?

TCP's three-way handshake is the process of establishing a connection. The second handshake is the server replying to the client's SYN with a SYN+ACK (synchronize-acknowledge) packet. If this packet is lost, the client never receives a reply to its SYN, and the server never receives the final ACK, so the connection cannot be completed.

In this case, the client and server are in different states:

  • The client is in the SYN_SENT state: it has sent a SYN packet to the server but has not received a response;
  • The server is in the SYN_RCVD state: it has received the client's SYN and replied with SYN+ACK, but has not received the final ACK.

To solve this problem, the TCP protocol uses a timeout retransmission mechanism on both sides. If the client receives no SYN+ACK within a certain period, it retransmits its SYN; likewise, the server retransmits the SYN+ACK if the final ACK does not arrive, until the connection is successfully established or the maximum number of retries is reached.

47. A Mysql insert statement, the process of saving data on the server side? What is the difference between binlog and redolog?

When the MySQL server receives an insert statement, it will process it according to the following process:

  1. The server first parses the statement into a corresponding syntax tree, and performs a semantic check to ensure the correctness and legality of the statement.

  2. Then, the server will convert the statement into the corresponding internal format and store it in the cache for later execution.

  3. When the statement needs to be executed, the server will send it to the storage engine, and the storage engine will write the data into the memory buffer.

  4. If the table uses a transactional storage engine (e.g. InnoDB), the server records the operation in a memory buffer rather than persisting it to disk immediately. The transaction logs (redo/undo) record the operation so that it can be rolled back or recovered.

  5. When the transaction is committed, the MySQL server will flush all the data in the memory and the transaction log to the disk to ensure the persistence and consistency of the data.

Binlog and redolog are two different log files in MySQL. Their main function is to record the operation and status change information in MySQL to realize functions such as data recovery, backup and replication. Their main differences are as follows:

  1. The binlog (binary log) is a server-level log that records all modification operations on the database; it is used for replication, point-in-time recovery, and backup. The redo log is specific to the InnoDB storage engine and records the changes made by transactions; it is used for crash recovery.

  2. The binlog is a logical log: it records SQL statements or row changes. The redo log is a physical log: it records modifications to data pages. Logical binlog events can therefore be replayed on another server, while the redo log is tied to the local data files.

  3. Because the binlog records logical changes, recovering from it means effectively re-executing those changes, which is relatively slow. The redo log records page-level changes, so crash recovery is fast: InnoDB only needs to replay the page modifications made since the last checkpoint.

In short, both binlog and redolog are important log files in MySQL. Their main function is to record the operation and status change information in MySQL to realize functions such as data recovery, backup and replication. The difference between them is mainly in what is recorded, performance overhead, and how it is used.

48. Spring MVC process? Why do all requests go through DispatcherServlet?

The process of SpringMVC is as follows:

  1. Client sends request to DispatcherServlet.

  2. After receiving the request, DispatcherServlet selects an appropriate Controller for processing according to the URL path of the request.

  3. The Controller handles the request and returns a ModelAndView object.

  4. DispatcherServlet converts the ModelAndView object into a specific view through the view resolver, and sends the response to the client.

In SpringMVC, all requests go through DispatcherServlet because it acts as the front controller (Front Controller): it receives every request and coordinates the components of the whole processing pipeline.

Specifically, DispatcherServlet will use HandlerMapping to map the request to the corresponding Controller for processing according to the URL path of the request. Then, the Controller will generate a ModelAndView object according to the request parameters and business logic, which contains the view name and model data to be rendered. Finally, DispatcherServlet passes the object to ViewResolver, converts it into a concrete view, and sends the response to the client.

In short, in SpringMVC, DispatcherServlet is the core of the whole process, which is responsible for receiving and dispatching all requests, and coordinating the various components of the whole process, so as to realize flexible and efficient web applications.

49. What is the underlying implementation of Spring AOP? A dynamic proxy needs to implement the InvocationHandler interface. How is the bottom layer of InvocationHandler implemented?

The underlying implementation of Spring AOP is based on JDK dynamic proxy and CGLIB code generation technology.

1. When the target bean implements at least one interface, Spring AOP uses the JDK dynamic proxy. A JDK dynamic proxy uses the reflection mechanism to generate, at runtime, a proxy class that implements the target's interfaces; the proxy runs the enhancement logic before and after delegating to the original method. Because the generated proxy implements interfaces, only methods defined in an interface can be proxied.

2. For classes that do not implement interfaces, Spring AOP uses CGLIB code generation to implement the dynamic proxy. CGLIB (Code Generation Library) is a powerful code generation library that dynamically generates a subclass of the target class at runtime and overrides its methods to weave in the proxy logic. Because it works by subclassing, CGLIB cannot proxy final classes or final methods.

3. With a JDK dynamic proxy, the enhancement logic lives in the InvocationHandler interface, which has a single invoke() method that is called whenever a method is invoked on the proxy object. invoke() receives the proxy object, the Method object of the invoked method, and the argument list; it can run extra logic before and after calling the original method via reflection (method.invoke(target, args)) and then return the result. Under the hood, Proxy.newProxyInstance() generates a proxy class (e.g. $Proxy0) at runtime whose methods all dispatch to the handler's invoke(). CGLIB does not use InvocationHandler; it plays the analogous role with the MethodInterceptor interface and its intercept() method, dispatching from a generated subclass.

4. In short, the underlying implementation of Spring AOP is based on JDK dynamic proxies and CGLIB code generation, with the enhancement logic expressed through InvocationHandler (JDK proxies) or MethodInterceptor (CGLIB).

50. MVCC? How to obtain the active transaction ID list of readview?

MVCC (Multi-Version Concurrency Control) is a multi-version concurrency control mechanism used to ensure data consistency and repeatable reading in a concurrent environment.

In MVCC, each transaction will get a unique transaction ID (Transaction ID). When the transaction modifies the data in the database, a new version will be generated and associated with the transaction ID. Therefore, when a transaction executes a query operation, it can only see the version generated by the committed transaction, but not the version generated by the transaction that has not yet been committed. This mechanism can effectively avoid concurrency problems such as dirty reads and non-repeatable reads.

A ReadView is one of the key components of MySQL's MVCC implementation; it determines which row versions are visible to the current transaction. A ReadView mainly contains:

  1. The list of transaction IDs that were active (started but not yet committed) at the moment the ReadView was created, together with the minimum and maximum of that list.

  2. The ID of the transaction that created the ReadView.

The row versions themselves are kept as a version chain in the undo log, not in the ReadView. Under REPEATABLE READ, MySQL generates the ReadView when the transaction first reads and keeps it unchanged for the rest of the transaction; under READ COMMITTED, a fresh ReadView is created for every statement. When a query executes, MySQL walks the row's version chain and, using the ReadView, returns the newest version whose creating transaction is visible.

The active transaction ID list is obtained as follows:

  1. When the ReadView is created, InnoDB traverses its internal list of open transactions and copies the IDs of all transactions that have started but not yet committed.

  2. A version whose creating transaction ID is below the minimum of this list was committed before the ReadView and is visible; an ID at or above the maximum belongs to a transaction that started later and is invisible; an ID in between is visible only if it is not in the active list (i.e. it had already committed).

  3. When executing a query, these rules are applied to each version in the row's version chain to find the visible one.

51. Have you ever used atomic types like AtomicInteger? What are the flaws of CAS?

The CAS algorithm is an optimistic locking technology, which judges whether it needs to be updated by comparing whether the value in memory is equal to the expected value. If equal, update the value in memory with the new value; otherwise, do nothing. Since the CAS operation is completed in an atomic operation, the atomicity and thread safety of the operation can be guaranteed.

However, CAS also has some drawbacks:

  1. ABA problem: CAS only checks whether the value equals the expected value, not whether it changed in the meantime. For example, thread A changes the value from 1 to 2 and then back to 1; when thread B then performs a CAS expecting 1, it succeeds as if nothing had happened, even though the value was modified twice. This can be solved by attaching a version stamp to the value (e.g. AtomicStampedReference).

  2. Spin time is too long: CAS operations with a large amount of concurrency may cause too long spin time, seriously affecting system performance.

  3. Only the atomicity of a single variable can be guaranteed: CAS can only implement atomic operations on a single variable, and cannot implement compound operations on multiple variables.
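The ABA drawback above can be demonstrated with AtomicStampedReference, which pairs the value with a version stamp so the 1 → 2 → 1 round trip no longer goes unnoticed:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        // value 1 with stamp 0
        AtomicStampedReference<Integer> ref = new AtomicStampedReference<>(1, 0);
        int stamp = ref.getStamp();

        // Simulate another thread doing 1 -> 2 -> 1; the stamp advances each time
        ref.compareAndSet(1, 2, stamp, stamp + 1);
        ref.compareAndSet(2, 1, stamp + 1, stamp + 2);

        // A plain value comparison would succeed here, but the stale stamp
        // (0, while the current stamp is 2) exposes the intermediate changes
        boolean ok = ref.compareAndSet(1, 3, stamp, stamp + 1);
        System.out.println(ok); // false
    }
}
```

(Small Integer values work with `==` here because of the JDK's Integer cache; the stamped reference compares references, not `equals()`.)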

52. What is the underlying implementation of HashMap type put and get?

In HashMap, the put() method will store the key-value pair into the hash table. The specific implementation steps are as follows:

  1. First, compute the hash from the key's hashCode() method (HashMap additionally mixes in the high bits), and derive the bucket index from it (index = (capacity - 1) & hash).

  2. If the location is not occupied, the key-value pair is directly stored in the location and null is returned.

  3. If the position is already occupied, check whether the key stored there is equal to the key being inserted (hash first, then equals()). If equal, replace the value at that position; otherwise, resolve the collision by chaining: the entry is appended to the bucket's linked list, which since JDK 8 is converted to a red-black tree once it grows long enough (8 nodes by default).

  4. If the number of entries exceeds the resize threshold (capacity × load factor, with a default load factor of 0.75), the hash table is expanded to double its size, and the existing entries are redistributed into their new buckets.

The get() method will return the corresponding value according to the specified key. The specific implementation steps are as follows:

  1. The hash code is calculated according to the key's hashCode() method.

  2. Find the location corresponding to the hash code in the hash table. Returns null if the location is empty.

  3. If the position is not empty, compare the key stored there with the key being looked up, using the key's equals() method (after an initial hash comparison). If equal, return the value at that position; otherwise, walk the bucket's chain (linked list or red-black tree) until a matching key is found or the chain ends.

In short, in HashMap the put() method inserts by mapping the key to a bucket index in the hash table, and the get() method looks up the corresponding value the same way. Hash conflicts are resolved by chaining, with long chains converted to red-black trees since JDK 8.
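A quick way to see the collision handling: "Aa" and "BB" are a classic pair of Java strings with identical hashCode(), so they land in the same bucket, and equals() is what keeps them apart:

```java
import java.util.HashMap;

public class CollisionDemo {
    public static void main(String[] args) {
        // Classic collision: both hash to 2112
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true

        HashMap<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same bucket; chained, then distinguished via equals()
        System.out.println(map.get("Aa") + " " + map.get("BB")); // 1 2
    }
}
```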

53. Do you know about sensitive word filtering? Simply put. (I answer: prefix tree), ask: What are the optimization schemes for prefix tree? Simply put.

Sensitive word filtering refers to finding and filtering out sensitive words in the text entered by users to protect the legitimate rights and interests of users and other stakeholders. Prefix tree is a commonly used sensitive word filtering algorithm, which can find whether a string of length m is a sensitive word in O(m) time complexity.

Prefix Tree, also known as Trie Tree, is an automaton data structure for efficiently storing and finding collections of strings. In a prefix tree, each node represents a prefix of a string, and the path from the root node to a leaf node represents a complete string. In order to filter sensitive words, we can build a sensitive word library into a prefix tree, and check whether there is a certain string in the prefix tree in the text entered by the user.

In addition to the basic prefix tree algorithm, there are some optimization schemes to improve the efficiency of the prefix tree, such as:

  1. Compressed prefix tree: Merge multiple nodes with only one child node into one node, which can reduce space overhead and query times.

  2. Double-array prefix tree: Using two arrays to represent the state transition of the prefix tree can greatly improve performance.

  3. Aho-Corasick algorithm: add failure links to the prefix tree, turning it into an AC automaton that can find all occurrences of multiple sensitive words in a single pass over the text.

In short, prefix tree is an efficient algorithm when filtering sensitive words. In practical applications, appropriate optimization schemes can be used according to specific scenarios to improve efficiency and accuracy.
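
As a sketch of the basic (unoptimized) approach, a minimal trie-based matcher might look like this; the class name is illustrative, and a production filter would add Aho-Corasick failure links:

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal trie-based sensitive-word matcher (a sketch; real filters add failure links). */
public class WordFilter {
    private static class Node {
        Map<Character, Node> next = new HashMap<>();
        boolean isEnd;
    }

    private final Node root = new Node();

    public void add(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        }
        cur.isEnd = true;
    }

    /** True if any added word occurs anywhere in text; O(n·m) worst case without failure links. */
    public boolean containsSensitive(String text) {
        for (int i = 0; i < text.length(); i++) {
            Node cur = root;
            for (int j = i; j < text.length(); j++) {
                cur = cur.next.get(text.charAt(j));
                if (cur == null) break;        // no prefix continues from here
                if (cur.isEnd) return true;    // matched a complete sensitive word
            }
        }
        return false;
    }

    public static void main(String[] args) {
        WordFilter f = new WordFilter();
        f.add("spam");
        f.add("scam");
        System.out.println(f.containsSensitive("this is a scam link"));   // true
        System.out.println(f.containsSensitive("perfectly clean text"));  // false
    }
}
```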

54. Tell me about the red-black tree. What is the difference between it and a balanced binary tree?

A red-black tree is a self-balancing binary search tree in which each node is marked as red or black. A red-black tree satisfies the following properties:

  1. Each node is either red or black.
  2. The root node is black.
  3. Each leaf node (NULL node) is black.
  4. If a node is red, both of its children must be black (and not necessarily vice versa).
  5. All paths from any node to each of its leaf nodes contain the same number of black nodes.

The main difference between a red-black tree and a strictly balanced binary tree such as an AVL tree lies in the balancing strategy. An AVL tree keeps the heights of every node's two subtrees within one of each other, while a red-black tree only enforces the color properties above, which still bound its height at roughly twice the minimum, so all operations remain O(log n).

Specifically, an AVL tree restores balance purely through rotations, while a red-black tree combines recoloring with rotations. Because its balance condition is looser, a red-black tree performs fewer rotations on insertion and deletion, making it faster for write-heavy workloads, while the AVL tree's stricter balance gives slightly faster lookups. This is why red-black trees are the usual choice for general-purpose ordered containers (e.g. Java's TreeMap, C++'s std::map).
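
In the JDK, java.util.TreeMap is implemented as a red-black tree, so its O(log n) ordered operations can be exercised directly:

```java
import java.util.TreeMap;

public class TreeMapDemo {
    public static void main(String[] args) {
        // TreeMap is backed by a red-black tree: O(log n) put/get/remove, keys kept sorted
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(30, "c");
        map.put(10, "a");
        map.put(20, "b");

        System.out.println(map.firstKey());       // 10 (smallest key)
        System.out.println(map.ceilingKey(15));   // 20 (ordered query a HashMap cannot do)
        System.out.println(map.keySet());         // [10, 20, 30]
    }
}
```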

55. Please design an algorithm for elevator scheduling, and briefly talk about the principle

Common elevator scheduling algorithms are as follows:

  1. FCFS (First Come First Serve): assign requests to idle or running elevators strictly in the order they arrive. Simple, but it adapts poorly to peak-period demand.

  2. SSTF (Shortest Seek Time First): dispatch the idle elevator closest to the requesting floor. This reduces average waiting time, but distant requests can starve and some elevators may sit idle for long stretches.

  3. SCAN: the elevator sweeps in one direction, serving requests floor by floor until it reaches the end, then reverses and sweeps back. This keeps service reasonably fair for every passenger and avoids leaving elevators idle for long periods.

  4. C-SCAN (Circular SCAN): like SCAN, but the elevator serves requests in one direction only; on reaching the end it returns to the starting floor without serving, then sweeps again. This gives more uniform waiting times and avoids starving passengers at the far end.
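
The SCAN idea from point 3 can be sketched as a small function that orders pending requests (the class name and API are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch of SCAN ("elevator") scheduling: serve requests in the travel direction, then reverse. */
public class ScanScheduler {
    public static List<Integer> order(int current, boolean goingUp, List<Integer> requests) {
        List<Integer> up = new ArrayList<>();
        List<Integer> down = new ArrayList<>();
        for (int floor : requests) {
            if (floor >= current) up.add(floor); else down.add(floor);
        }
        Collections.sort(up);                    // ascending for the upward sweep
        down.sort(Collections.reverseOrder());   // descending for the downward sweep

        List<Integer> result = new ArrayList<>();
        if (goingUp) { result.addAll(up); result.addAll(down); }
        else         { result.addAll(down); result.addAll(up); }
        return result;
    }

    public static void main(String[] args) {
        // Elevator at floor 5 moving up, pending requests on floors 2, 8, 3, 10
        System.out.println(order(5, true, List.of(2, 8, 3, 10)));  // [8, 10, 3, 2]
    }
}
```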

56. After establishing a TCP connection, what happens when the client goes offline

  1. The server cannot immediately perceive that the client is offline: TCP is a connection-oriented protocol, and a peer's state change is only noticed through the normal close handshake (FIN/ACK). If the client is forcibly disconnected, loses power, or crashes, the server will not notice until it next tries to send data on the connection (or until TCP keepalive fires, which by default can take hours).

  2. The connection resources of the server may be wasted: Since the server cannot detect that the client is offline in time, the connection resources on the server may not be released, resulting in a waste of resources. This waste of resources can bring down the server if a large number of clients are connected to the server at the same time.

  3. The server may keep retransmitting data: if the server is sending data to the client when it goes offline, TCP will keep retransmitting the unacknowledged segments until the retransmission limit or timeout is reached, after which the connection is aborted. These retransmissions waste bandwidth and server resources and can add to network congestion.

Therefore, in practical applications, it is necessary to use technologies such as heartbeat mechanism and disconnection reconnection to deal with the situation of client offline, so as to ensure the reasonable utilization of connection resources and the stability of the system.
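
The heartbeat idea can be sketched without sockets: record the last heartbeat time per client and treat silence past a timeout as offline (the class name and timeout value are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of an application-level heartbeat: a client silent past the timeout is considered offline. */
public class HeartbeatMonitor {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    public HeartbeatMonitor(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    public void onHeartbeat(String clientId) { lastSeen.put(clientId, System.currentTimeMillis()); }

    public boolean isAlive(String clientId) {
        Long t = lastSeen.get(clientId);
        return t != null && System.currentTimeMillis() - t < timeoutMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatMonitor monitor = new HeartbeatMonitor(100);
        monitor.onHeartbeat("client-1");
        System.out.println(monitor.isAlive("client-1"));  // true: just pinged
        Thread.sleep(150);                                // exceed the timeout
        System.out.println(monitor.isAlive("client-1"));  // false: considered offline
    }
}
```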

58. Spring Boot thread pool creation

In Spring Boot, you can create thread pools by configuring ThreadPoolTaskExecutor or TaskScheduler. Here are the basic steps:

  1. Define a bean in your Spring Boot application configuration class and specify its type as ThreadPoolTaskExecutor or TaskScheduler.

  2. Configure thread pool parameters, such as the number of core threads, maximum number of threads, queue capacity, etc.

  3. Use a thread pool to perform tasks that require asynchronous processing.

The following is an example of creating a thread pool using ThreadPoolTaskExecutor:
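
A minimal configuration along the lines described might look like this (the class name and pool sizes are illustrative):

```java
import java.util.concurrent.Executor;

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.AsyncConfigurer;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(8);         // threads kept alive permanently
        executor.setMaxPoolSize(16);         // upper bound under load
        executor.setQueueCapacity(100);      // tasks queued before extra threads are added
        executor.setThreadNamePrefix("async-");
        executor.initialize();
        return executor;
    }
}
```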

In this example, we implement the AsyncConfigurer interface and override the getAsyncExecutor method to create a thread pool. Then we set the number of core threads, maximum number of threads and queue capacity of the thread pool, and initialize it.

Then, add the @Async annotation to the method that needs to be executed asynchronously so that it runs on the thread pool:
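
For example (the service and method names are illustrative):

```java
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class ReportService {

    // Runs on the configured @Async executor rather than the caller's thread
    @Async
    public void generateReport(long userId) {
        // long-running work here
    }
}
```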

59. Single sign-on

Single sign-on (Single Sign-On, referred to as SSO) means that users only need to log in once to access multiple mutually trusted application systems. Here are the basic steps to implement single sign-on:

  1. Create an authentication center (Identity Provider, referred to as IdP), which is responsible for handling user authentication and issuing tokens (Token).

  2. Create a service provider (Service Provider, SP for short) on each application system that requires single sign-on, and the service provider receives the token from the authentication center and verifies its legitimacy.

  3. When a user tries to access an application, the application redirects the user to an authentication authority and includes an identifier in the URL.

  4. The authentication center shows a login page prompting the user to enter a user name and password. If verification succeeds, the authentication center generates a token and sends it back to the service provider.

  5. After receiving the token, the service provider verifies its legitimacy. If the token is valid, the user is allowed to access the application system.

  6. If the user tries to access other application systems, repeat the above steps.

In general, implementing single sign-on requires two main tasks: authentication and token passing. Authentication authorities handle authentication and issue tokens, while service providers receive tokens and verify their legitimacy

 In this way, OAuth2-based single sign-on can be realized. When the user visits a service provider, the system automatically redirects to the authentication center for verification. If verification succeeds, the authentication center generates a token and returns it to the service provider; the service provider validates the token and then lets the user access the application.

60. Mybatis cache

MyBatis is a popular persistence layer framework that provides a caching mechanism to improve system performance. MyBatis cache is mainly divided into two types: first-level cache and second-level cache.

  1. Level 1 cache: Also known as the local cache, it is enabled by MyBatis by default and is scoped to a single SqlSession. When the same query is executed repeatedly within one SqlSession, MyBatis serves the result from this cache, avoiding the performance cost of hitting the database again.

  2. Second-level cache: The second-level cache is based on the namespace and can share the cache across SqlSessions. When a SqlSession is executed, the query result will be written into the secondary cache, and the next time other SqlSession executes the same query operation, it can directly obtain data from the cache, avoiding the overhead of repeatedly executing SQL.

When using MyBatis cache, you need to pay attention to the following points:

  1. Cache effectiveness: MyBatis's cache mechanism is enabled by default, but sometimes the cache is not up to date. Therefore, it is necessary to consider how to control the effectiveness of the cache, such as performing regular cleaning, manually refreshing the cache, or setting the cache expiration time.

  2. Cache granularity: the first-level cache is scoped to a SqlSession and the second-level cache to a mapper namespace. The effect of caching differs with the granularity, so choose the one that fits the project's requirements.

  3. Cache clearing: MyBatis's cache mechanism provides a variety of ways to clear the cache, including time-based, object-based, or manual clearing. It is necessary to select the appropriate removal method according to the specific situation.

61. How to solve the problem that your mysql stores a large amount of data?

If the amount of data stored in MySQL is large, it may cause problems such as database performance degradation and slow query speed. The following are several ideas to solve the problem of large data volume:

  1. Database partitioning: Splitting a large table into multiple small tables and storing these small tables in different partitions can improve the query performance and availability of the database. At the same time, it is also necessary to consider how to divide partitions according to business needs, and how to dynamically adjust the partition scheme when the amount of data expands.

  2. Index optimization: Adding appropriate indexes can improve the efficiency of database queries. However, too many or unnecessary indexes will affect write operations, so it is necessary to balance the number of indexes and query requirements.

  3. SQL optimization: Optimizing SQL query statements can reduce the load on the database and improve query efficiency. For example, avoid full table scan, try to use simple JOIN, use UNION ALL instead of UNION, etc.

  4. Sub-database and sub-table: storing data in multiple independent databases can reduce the pressure on a single database and improve concurrent processing capabilities. However, sub-database and sub-table will also bring some problems, such as transaction processing, cross-database query and other issues.

  5. Cache optimization: Cache technology can store hot data in memory, avoid frequent database read operations, and improve system response speed and concurrency performance.

  6. Database server upgrade: increasing the processing power, storage capacity, bandwidth and other hardware resources of the database server can improve the load capacity and performance of the system.

  7. Data backup and recovery: Backing up data in time can avoid data loss and the impact of system failure. At the same time, it is also necessary to consider how to quickly restore large-scale data to ensure the normal operation of the system.

62. How does redis deal with high concurrency, several ideas

As an in-memory database, Redis has good support and optimization ideas for high-concurrency scenarios. The following introduces several common ideas for Redis to handle high concurrency:

  1. Message queue: Redis can be used as a message queue to handle high concurrency scenarios. By storing tasks in Redis and using multiple consumers to process tasks simultaneously, the concurrent performance and processing power of the system can be improved.

  2. Distributed lock: Redis' distributed lock mechanism can be used to ensure the atomicity and consistency of data. By adding locks, only one client can access shared resources to avoid problems caused by concurrent operations.

  3. Caching: Redis's caching mechanism can be used to speed up read operations and reduce access to back-end databases. By storing hot data in Redis, the pressure on the database can be greatly reduced and the system performance can be improved.

  4. Cluster: Redis supports cluster mode, which can improve the load capacity and availability of the system by expanding the number of nodes horizontally. At the same time, it is also necessary to pay attention to issues such as data synchronization in the cluster and handling of node failures.

63. Redis implementation principle

Redis is a high-performance open source memory database that uses key-value storage and supports multiple data types and complex data structures. The following is the implementation principle of Redis:

  1. In-memory database: Redis stores all data in memory, so it has extremely fast read and write speeds. At the same time, in order to avoid problems such as memory overflow, Redis also provides a persistence mechanism to write data to disk regularly.

  2. Single-threaded model: Redis uses a single-threaded model to process client requests, and can improve system concurrency performance and response speed by using non-blocking I/O, event-driven and other technologies.

  3. Client/server model: Redis adopts the client/server model, which can serve multiple clients at the same time and communicate through network protocols (such as TCP/IP).

  4. Multiple data types: Redis supports multiple data types, including strings (String), hash tables (Hash), lists (List), sets (Set) and ordered sets (Sorted Set). Each data type has specific operation commands, such as GET, SET, HGETALL, LPUSH, SADD, and ZRANGE, etc.

  5. Atomicity of operations: each Redis command executes atomically, since the single-threaded command loop never interleaves two commands, which keeps data consistent. Note, however, that a MULTI/EXEC transaction does not roll back commands that already ran if a later command fails.

  6. High availability: Redis provides a variety of high availability solutions, including master-slave replication, sentinels, and clusters. Through these mechanisms, the availability and fault tolerance of the system can be improved.

64. How to do database optimization

Database optimization is one of the key steps to improve system performance and availability. Several common database optimization methods are listed below:

  1. Index optimization: Indexes are an important factor in database query performance, and query efficiency can be improved by adding indexes to frequently accessed columns. However, too many or unnecessary indexes will affect the write operation, and the number of indexes and query requirements need to be weighed.

  2. SQL optimization: Optimizing SQL query statements can reduce the load on the database and improve query efficiency. For example, avoid full table scan, try to use simple JOIN, use UNION ALL instead of UNION, etc.

  3. Database connection pool optimization: The connection pool is a key component of database connection management, and the performance of the connection pool can be optimized by adjusting parameters such as the maximum number of connections and the recovery time of idle connections.

  4. Sub-database and sub-table: storing data in multiple independent databases can reduce the pressure on a single database and improve concurrent processing capabilities.

  5. Cache optimization: Cache technology can store hot data in memory, avoid frequent database read operations, and improve system response speed and concurrency performance.

  6. Regularly clean up useless data: Timely cleanup of useless data can free up database space, improve query and update efficiency, and reduce backup and recovery time and costs.

  7. Database security optimization: Ensuring data security and integrity is an important issue that any database system must consider. The security of the database can be improved by restricting access rights and using encryption technology.

65. Transaction principle

The implementation principle of the transaction can be described by the following four keywords:

  1. Start transaction (BEGIN): Before the transaction starts, you need to use the BEGIN keyword to start a transaction and initialize related parameters.

  2. Execute operations (EXECUTE): Execute a series of database operations in a transaction, such as SELECT, INSERT, UPDATE, DELETE and other operations. These operations will be recorded in a log buffer and correspondingly modified in the database.

  3. Commit transaction (COMMIT): If there is no error during the execution of the operation, you can use the COMMIT keyword to submit the transaction, and all modification operations will be permanently written to the database.

  4. Rollback transaction (ROLLBACK): If an error or other abnormal situation occurs during the execution of the operation, you can use the ROLLBACK keyword to roll back the transaction, undo all the operations that have been performed, and restore to the state before the transaction started.

In practical applications, in order to improve the concurrency performance and reliability of transactions, the following two technical means are usually used:

  1. Optimistic Locking: At the beginning of the transaction, record the version number or timestamp of the data; when committing the transaction, check whether the version number or timestamp of the data is consistent. If it is not consistent, it means that other transactions have modified the data during the execution of the transaction and have submitted it. At this point, the current transaction needs to be rolled back and re-executed.

  2. Pessimistic Locking: At the beginning of a transaction, use SQL statements such as SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE to lock the data to be modified to prevent other transactions from modifying the same row of data at the same time. After the transaction is committed or rolled back, the corresponding lock is released.

In short, transactions are an important mechanism to ensure the consistency and reliability of database operations. In practical applications, it is necessary to combine specific business requirements and performance characteristics, and adopt technical means such as optimistic locking and pessimistic locking to improve the concurrency performance and reliability of transactions

66. Redis has a key with an expiration time. At this time, there are always commands from the client to set this key. What is the problem?

  1. The expiration time is silently removed: in Redis, a plain SET on a key discards any TTL the key had (unless the KEEPTTL option, available since Redis 6.0, is used). So if the client keeps SETting this key, it may never expire.

  2. Performance loss: Every time the client updates the value of this key, it will cause Redis to execute a SET command and write the value into memory. If the client frequently updates the value of this key, it will lead to frequent writing and recovery operations of Redis memory, which will reduce the performance of the system.

  3. Memory growth: if the TTL keeps being reset (or removed) faster than keys can expire, keys that were expected to be short-lived accumulate in memory, which behaves like a memory leak. In this case an eviction policy (maxmemory-policy) or explicit deletion of stale keys is needed to avoid running out of memory.

67. How to solve million QPS login?

  1. High-performance hardware devices: First of all, high-performance hardware devices, such as high-speed network interface cards and high-speed disks, need to be used to meet the processing requirements of massive requests.

  2. Load balancing: to share the pressure of concurrent requests, load-balancing technology can distribute incoming requests across multiple servers. Common load-balancing algorithms include round-robin, random, and least-connections.

  3. Distributed cache: Use distributed cache technology (such as Redis, Memcached, etc.) to cache user login status information, reduce the number of database accesses, and improve system response speed and concurrent processing capabilities.

  4. Cluster deployment: deploy the system on multiple servers to form a cluster to realize distributed processing and improve system availability and scalability. Care needs to be taken to avoid issues such as data inconsistencies.

  5. Asynchronous processing: Use asynchronous processing technologies (such as message queues, thread pools, etc.) to process login requests asynchronously, reduce system response delay and resource consumption, and improve system stability and pressure resistance.

  6. Optimize SQL query: try to avoid complex SQL query operations and a large number of database read and write operations, and improve database query efficiency and throughput through index optimization, sub-database sub-table and other technologies.

  7. Current limiting control: When the system has instantaneously high concurrent requests, it is necessary to adopt current limiting measures (such as token bucket, leaky bucket algorithm) to control the number of concurrent requests, so as to ensure that the system will not crash or respond slowly due to too many requests
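
The token-bucket idea from point 7 can be sketched as follows (the capacity and refill rate are illustrative):

```java
/** Sketch of a token-bucket rate limiter: a request passes only if a token is available. */
public class TokenBucket {
    private final long capacity;
    private final double refillPerMilli;
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity;                       // start with a full bucket
        this.lastRefill = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Refill proportionally to elapsed time, never beyond capacity
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMilli);
        lastRefill = now;
        if (tokens >= 1) { tokens -= 1; return true; }
        return false;                                  // bucket empty: reject the request
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(3, 1);    // burst of 3, refills 1 token/second
        int allowed = 0;
        for (int i = 0; i < 10; i++) {
            if (bucket.tryAcquire()) allowed++;
        }
        System.out.println(allowed);  // 3: the burst is served, the rest are rejected
    }
}
```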

68. The difference between final and finally and finalize

1. final, finally and finalize are three completely different concepts. final declares that a variable cannot be reassigned, a method cannot be overridden, or a class cannot be extended.

2.finally is used to specify the code block that will be executed regardless of whether an exception occurs in the try-catch structure,

3. finalize() is a method defined by the Object class, called by the garbage collector before an object is reclaimed so it can perform last-minute cleanup. (It has been deprecated since Java 9 and should generally not be used.)
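
A small demonstration of final and finally (finalize is deprecated and omitted):

```java
public class FinalDemo {
    public static void main(String[] args) {
        final int x = 42;          // final: cannot be reassigned after initialization
        // x = 43;                 // would not compile

        System.out.println(divide(10, 2));  // prints the finally message, then 5
        System.out.println(divide(10, 0));  // prints the finally message, then -1
    }

    static int divide(int a, int b) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            return -1;             // division by zero
        } finally {
            // Runs on BOTH paths: normal return and caught exception
            System.out.println("finally runs whether or not an exception occurred");
        }
    }
}
```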

69. After calling System.gc, will Java memory be reclaimed immediately (I just vaguely remember that this command can directly trigger FullGC)

1. Calling the System.gc() method just sends a garbage-collection request to the JVM; there is no guarantee collection happens immediately. The JVM decides when to collect according to its own garbage-collection algorithm and strategy (on HotSpot, System.gc() does typically trigger a Full GC, unless -XX:+DisableExplicitGC is set).

2. Under normal circumstances, the JVM triggers garbage collection when memory usage reaches certain thresholds or free memory runs low. Objects that become unreachable (no references point to them) merely become eligible for collection; they are reclaimed at some later GC cycle, not instantly.

3. Note that calling System.gc() frequently is a bad habit: it can make the system spend too much time in garbage collection and hurt program performance. It is recommended to call this method only when forcing a collection is genuinely necessary.

70. The difference between optimistic locking and pessimistic locking (answered, but the answer is not clear enough)

1. Optimistic lock

Optimistic locking assumes that no other thread will modify the resource when accessing the shared resource, so it will not actively lock it, but check whether the data has been modified by other threads during the update operation. If it has not been modified, the update was successful; otherwise, the operation needs to be rolled back or the update retried.

In general, optimistic lock implementations record information such as a version number or timestamp to determine whether the data has been modified by another thread. In Java, the common optimistic technique is the CAS (Compare and Swap) algorithm, which relies on hardware-level atomic instructions to ensure data consistency.

Code example:
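
A minimal CAS-based optimistic sketch using java.util.concurrent.atomic (the retry loop is the key part; the class name is illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    /** Optimistic update: read, compute, compareAndSet; retry if another thread won the race. */
    public void increment() {
        while (true) {
            int current = value.get();     // read the current value (our "version")
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return;                    // nobody modified it in between: success
            }
            // CAS failed: another thread updated the value first, so retry
        }
    }

    public int get() { return value.get(); }

    public static void main(String[] args) throws InterruptedException {
        OptimisticCounter counter = new OptimisticCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> { for (int j = 0; j < 1000; j++) counter.increment(); });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get());  // 4000: no lost updates despite no lock
    }
}
```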

2. Pessimistic lock

Pessimistic locking believes that other threads will modify the resource when accessing the shared resource, so it actively locks to ensure data consistency. In Java, the commonly used pessimistic lock implementation methods are the synchronized keyword and the ReentrantLock class.

Code:
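
A minimal pessimistic sketch using ReentrantLock (the synchronized keyword works the same way; the class name is illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

public class PessimisticCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value = 0;

    /** Pessimistic update: take the lock before touching the shared state. */
    public void increment() {
        lock.lock();               // blocks until no other thread holds the lock
        try {
            value++;
        } finally {
            lock.unlock();         // always release, even on exception
        }
    }

    // The synchronized keyword is the other common form of pessimistic locking:
    public synchronized int get() { return value; }

    public static void main(String[] args) throws InterruptedException {
        PessimisticCounter counter = new PessimisticCounter();
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) counter.increment(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 1000; i++) counter.increment(); });
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counter.get());  // 2000
    }
}
```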

the difference:

The difference between optimistic locking and pessimistic locking lies in the way the locking mechanism is used. Optimistic locking assumes that there will be no competition when accessing shared resources, so it does not actively lock; while pessimistic locking believes that there will be competition when accessing shared resources, so active locking is required to ensure data consistency.

Scenario application:

Optimistic locking is usually suitable for scenarios with frequent read operations and few write operations, such as caching, etc.; while pessimistic locking is usually suitable for scenarios with frequent write operations, such as database transactions, etc.

71. How to implement optimistic lock and pessimistic lock in sql language

1. Implementation of optimistic locking:

(1) Add a version number field: add a version number field version to the table to record the number of data updates.

(2) Obtain the current version number: When querying data, return the version number to the application.

(3) Compare the version number when updating data: when updating data, first compare whether the current version number is consistent with the version number in the database. If they are consistent, perform an update operation and increase the version number by 1; otherwise, it means that other threads have modified the data and need to be processed accordingly.

2. Implementation of pessimistic locking:

(1) Use SELECT ... FOR UPDATE or SELECT ... FOR SHARE: when querying the data, lock the rows returned by the query so that other transactions cannot modify them in the meantime (FOR UPDATE takes an exclusive lock; FOR SHARE allows concurrent readers but blocks writers).

(2) Execute the transaction: perform read and update operations in the transaction, and then commit the transaction. Before the transaction is committed, the transaction will always hold the lock of the row data, and other transactions cannot modify this data.

In short, implementing optimistic and pessimistic locking in SQL means picking the mechanism that fits the business: a version-number column for optimistic locking, or SELECT ... FOR UPDATE / SELECT ... FOR SHARE for pessimistic locking. Note that overusing either kind of locking can hurt system performance, so reasonable trade-offs and tuning are required.
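
Putting both approaches into concrete SQL (table and column names are illustrative; the syntax shown is MySQL-flavoured):

```sql
-- Optimistic locking with a version column
SELECT id, stock, version FROM product WHERE id = 1;   -- suppose this returns version = 5

-- The update succeeds only if nobody changed the row since we read it
UPDATE product
SET stock = stock - 1, version = version + 1
WHERE id = 1 AND version = 5;
-- 0 affected rows means another transaction got there first: retry or abort

-- Pessimistic locking: lock the row inside a transaction
BEGIN;
SELECT stock FROM product WHERE id = 1 FOR UPDATE;     -- other writers now block
UPDATE product SET stock = stock - 1 WHERE id = 1;
COMMIT;                                                -- releases the row lock
```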

72. Why use JWT, what are the common login methods (cookie+session, redis+token, JWT)

JWT (JSON Web Token) is a standard for authentication that is widely used in modern applications. It encodes the user's identity and authorization information into a signed (and optionally encrypted) token that is easy to transmit and verify between client and server, and has the following advantages:

  1. Stateless: JWT encapsulates the user's identity and authorization information into a token, and does not need to save state information such as session on the server side.

  2. Cross-platform: since JWT is a JSON-based standard, it can be parsed and generated in many programming languages and on many platforms.

  3. Security: JWT signs (or encrypts) tokens with strong algorithms to guarantee their integrity, so tampering is detectable.

There are three common ways to implement login:

  1. Cookie + session authentication: after the user logs in, the server stores the session data on the server side and records the session id in a cookie. Each subsequent request carries the session id, and the server looks up the corresponding session data by it.

  2. Authentication method based on redis+token: After the user logs in, the server generates a token and stores the token in redis, and after each request brings the token, the server can find the corresponding user information based on the token.

  3. JWT-based authentication method: After the user logs in, the server generates a JWT token, encrypts the user's identity and authorization information into the token, and sends it to the client. In the future, each request will bring the token, and the server can decrypt the user information for authentication.

Compared with other methods, JWT has the advantages of being stateless, cross-platform, and easy to expand, and can support multiple algorithms for signing or encryption. It is a very good authentication mechanism
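
To make the token structure concrete, here is a hedged sketch of HS256 signing and verification using only the JDK; a real system should use a vetted JWT library instead of hand-rolled code like this:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Sketch of how a JWT is assembled and verified: header.payload.signature, base64url-encoded. */
public class JwtSketch {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    static String sign(String payloadJson, String secret) throws Exception {
        String header = B64.encodeToString("{\"alg\":\"HS256\",\"typ\":\"JWT\"}".getBytes(StandardCharsets.UTF_8));
        String payload = B64.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        String signature = hmac(header + "." + payload, secret);
        return header + "." + payload + "." + signature;   // stateless: everything is in the token
    }

    static boolean verify(String token, String secret) throws Exception {
        String[] parts = token.split("\\.");
        return parts.length == 3 && hmac(parts[0] + "." + parts[1], secret).equals(parts[2]);
    }

    private static String hmac(String data, String secret) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        return B64.encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String token = sign("{\"sub\":\"user-1\"}", "server-secret");
        System.out.println(verify(token, "server-secret"));  // true
        System.out.println(verify(token, "wrong-secret"));   // false: signature mismatch
    }
}
```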

74. The difference between cookie and session (according to four aspects, location, security, life cycle, save data type)

Both Cookie and Session are common authentication mechanisms in web applications, but they are implemented slightly differently.

  • Location: Cookies are stored in the client browser, while Sessions are stored on the server side.
  • Security: Since cookies are stored on the client side, they are vulnerable to malicious attacks. Session is relatively more secure because it is stored on the server side and can be protected by methods such as encryption.
  • Lifecycle: A cookie can be set with a fixed expiration time, or it can be automatically deleted after the browser is closed. Session can set the expiration time on the server side, and can also be deleted after the user exits or closes the browser.
  • Save data type: Cookie can only store string type data, while Session can store any type of data.

75. What kind of information is stored in the cookie to determine whether the user is logged in? (Session ID)

1. Cookies are usually used to store a small amount of information, such as an identifier and expiration time. If the system is session-based, a Session ID is usually stored in the cookie so that the server can identify and verify the user's identity.

2. Determine whether the user is logged in by checking whether there is a valid Session ID in the cookie. If the Session ID has expired or been tampered with, you need to log in again.

76. After the cookie is obtained, can I access it from another machine?

1. The cookie store itself cannot be read from another machine: cookies live in the local storage of the client's browser, and each browser on each machine keeps its own store.

2. However, what the server actually checks is the cookie's value. If an attacker obtains that value (for example via XSS or network sniffing), they can send it from any machine and the server will generally accept it, allowing the attacker to impersonate the user (session hijacking) or mount related attacks such as CSRF (cross-site request forgery).

77. If the cookie is forged, what problems will it cause and how to solve it

A forged or stolen cookie lets an attacker impersonate the user and act on their account (session hijacking). Common countermeasures:

  1. Use the HttpOnly attribute: with HttpOnly set, the browser forbids JavaScript from reading the cookie, reducing the risk of the cookie being stolen through XSS injection attacks.

  2. Set the Secure attribute: with Secure set, the cookie is only transmitted over HTTPS, preventing man-in-the-middle eavesdropping and tampering.

  3. Store sensitive information on the server side: Sensitive information should be stored on the server side, rather than stored in cookies or transmitted through the client. In this way, even if the cookie is forged, the attacker cannot obtain sensitive information.

  4. Reasonably set the expiration time of the cookie: setting the expiration time of the cookie reasonably can reduce the risk of being stolen to a certain extent.

  5. Use professional encryption technology: Use professional encryption technology to encrypt cookies to prevent data leakage and tampering.

In short, a variety of measures should be taken to protect cookie security, including setting HttpOnly, Secure attributes, storing sensitive information on the server side, reasonably setting cookie expiration time, and using professional encryption technology.

79. How to do traffic peak clipping?

Traffic peak clipping is a common system performance optimization technique, the purpose is to avoid system crash due to sudden high concurrent traffic as much as possible. Specific measures include the following aspects:

  1. Caching: Using caching can reduce the traffic pressure on the database or other backend services. Hot data can be placed in caches such as Redis to reduce the frequency of access to the database.

  2. Queue: Use the message queue to realize traffic peak shaving, persist the request in the message queue, and then consume according to certain rules. In this way, the traffic in a short period of time can be smoothly processed to avoid system downtime caused by instantaneous high concurrency.

  3. Current limiting: Current limiting can control the maximum number of concurrency of the system to prevent exceeding the system's carrying capacity. Algorithms such as token bucket and leaky bucket can be used to limit the flow of requests.

  4. Asynchronous processing: Asynchronously process some businesses with a large processing volume, such as image uploading, email sending and other operations. This can move a large number of calculations and I/O operations to the asynchronous thread, freeing the resources of the main thread.

  5. Layered architecture: For high-concurrency systems, layered architecture and micro-service architecture can be used to distribute tasks to different nodes to achieve load balancing.
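
The queue idea from point 2 can be sketched in-process, with a BlockingQueue standing in for a real message broker (the sizes are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of peak shaving: a burst of requests is buffered in a queue and drained at the consumer's pace. */
public class PeakShavingDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1000);
        AtomicInteger processed = new AtomicInteger();

        // One consumer drains the queue at its own steady pace
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int request = queue.take();
                    if (request < 0) break;          // poison pill: stop consuming
                    processed.incrementAndGet();     // "handle" the request
                }
            } catch (InterruptedException ignored) { }
        });
        consumer.start();

        // A sudden burst of 500 requests is absorbed by the queue, not by the backend
        for (int i = 0; i < 500; i++) queue.put(i);
        queue.put(-1);                               // signal shutdown
        consumer.join();

        System.out.println(processed.get());         // 500
    }
}
```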


Origin blog.csdn.net/weixin_64625868/article/details/131038746