miHoYo spring recruitment internship interview: the questions were very basic

miHoYo's spring recruitment internship interview mainly covered four areas: Java, operating systems, MySQL, and networking.

The interview lasted 1 hour in total: 1 minute for self-introduction, 20 minutes for a coding question, and the remaining time for basic-knowledge questions.

Java

What is the difference between String, StringBuilder, and StringBuffer? Which one should be used for heavy string manipulation in a single thread?

Answer: use StringBuilder

Supplement:

String, StringBuilder, and StringBuffer are all classes in Java for manipulating strings.

String is an immutable sequence of characters. Any operation that appears to modify a String actually creates a new String object. Building a large string through repeated modification therefore creates many short-lived objects and performs poorly.

Both StringBuilder and StringBuffer are mutable character sequences that can be modified many times without creating new objects. The difference between the two is thread safety: StringBuffer is thread safe and StringBuilder is not. All public methods of StringBuffer are synchronized, so it is safe to use from multiple threads, but the locking reduces performance. StringBuilder's methods are not synchronized, so it is faster in a single-threaded environment.

To sum up, use StringBuilder for heavy string manipulation in a single thread. Use StringBuffer when the same buffer is modified by multiple threads, accepting some loss of performance in exchange for thread safety.
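To make the difference concrete, here is a minimal sketch that builds the same string both ways in one thread. The class and method names (StringConcatDemo, joinWithString, joinWithBuilder) are made up for illustration.

```java
class StringConcatDemo {
    // Repeated String concatenation: every += produces a brand-new String object.
    static String joinWithString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // StringBuilder appends into one growable buffer and takes no locks.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(5)); // prints 01234
        System.out.println(joinWithString(5).equals(joinWithBuilder(5))); // prints true
    }
}
```

Both methods return the same result; the difference is only in how many intermediate objects are created along the way.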

Can a synchronized biased lock be upgraded directly to a heavyweight lock? How is the heavyweight lock implemented?

Answer: I walked through the four lock states of synchronized.

Supplement:

In the usual escalation path, a biased lock is not upgraded directly to a heavyweight lock. It is first upgraded to a lightweight lock, and only if competition for the lightweight lock fails does it become a heavyweight lock. HotSpot has exceptions: for example, calling wait() on the lock object requires a monitor, so the lock inflates straight to a heavyweight lock. Note also that biased locking has been deprecated and disabled by default since JDK 15.

Heavyweight locks are generally implemented on top of the operating system's mutex. A thread that fails to acquire a heavyweight lock is suspended until the lock is released. This kind of lock is relatively slow under contention, because blocking and waking threads involves operating system calls and context switches, which carry a large overhead. In practice, therefore, try to keep contended synchronized sections short so that locks rarely need to inflate to heavyweight.

Exceptions in Java

Answer: I couldn't remember the classification, so I talked about how exceptions are caught: the from/to/target pointers in the exception table.

Supplement:

When something goes wrong while a program runs, Java throws an exception object. All such objects derive from Throwable, and they are commonly divided into three categories:

1. Checked Exception: The compiler checks that this kind of exception is either caught or declared with throws; otherwise compilation fails. These exceptions are mainly caused by the program's external environment, such as a missing file or a failed network connection. Common checked exceptions include IOException and SQLException.

2. Unchecked Exception: RuntimeException and its subclasses, usually caused by errors inside the program, such as NullPointerException, ArrayIndexOutOfBoundsException, and IllegalArgumentException. They do not have to be declared or caught, but if nothing catches one, the thread that threw it terminates.

3. Error: Usually signals a serious problem in the JVM or its resources, such as OutOfMemoryError or StackOverflowError. Errors also do not have to be declared or caught, and applications generally should not try to recover from them.

In Java, exception handling mainly uses the try-catch statement and the throw statement. The try-catch statement catches exceptions and handles them, while the throw statement throws an exception manually.
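A small sketch of the checked/unchecked distinction together with try-catch and throw. The class ExceptionDemo and its methods readConfig, parsePort, and describe are hypothetical helpers invented for this example, not part of any library.

```java
import java.io.IOException;

class ExceptionDemo {
    // Checked: IOException must be declared with throws (or caught) or the code will not compile.
    static String readConfig(String path) throws IOException {
        if (!path.endsWith(".conf")) {
            throw new IOException("not a config file: " + path);
        }
        return "config from " + path;
    }

    // Unchecked: IllegalArgumentException extends RuntimeException, so no throws clause is needed.
    static int parsePort(String text) {
        int port = Integer.parseInt(text); // may throw NumberFormatException, also unchecked
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    // try-catch handles both kinds; each catch block deals with one category.
    static String describe(String path, String portText) {
        try {
            return readConfig(path) + ", port " + parsePort(portText);
        } catch (IOException e) {
            return "checked: " + e.getMessage();
        } catch (IllegalArgumentException e) {
            return "unchecked: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("app.conf", "8080"));  // config from app.conf, port 8080
        System.out.println(describe("notes.txt", "8080")); // checked: not a config file: notes.txt
        System.out.println(describe("app.conf", "70000")); // unchecked: port out of range: 70000
    }
}
```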

Operating system

How do processes communicate with each other? What types of pipes are there?

The simplest way is pipes, which are divided into "anonymous pipes" and "named pipes".

As the name implies, an anonymous pipe has no name. It is a special file that exists only in memory, not in the file system. The vertical bar "|" in a shell command is an anonymous pipe. The data it carries is an unformatted stream of limited size, and communication is one-way: data can only flow in one direction, so two-way communication requires two pipes. Anonymous pipes can only be used between related processes, such as a parent and its child. Their life cycle follows the process: the pipe is created with the process and disappears when the process terminates.

Named pipes remove the limitation that anonymous pipes only work between related processes. Using a named pipe requires creating a device file of type p in the file system, and unrelated processes can then communicate through that file. In addition, for both anonymous and named pipes, the data a process writes is buffered in the kernel, and the reading process naturally gets it from the kernel. The data follows the first-in-first-out principle.

Message queues overcome the problem that pipe data is an unformatted byte stream. A message queue is actually a "message list" stored in the kernel. The message body is a user-defined data type: when data is sent, it is divided into independent message bodies, and the receiver must use the same data type as the sender so that the data is read correctly. Message queue communication is not the fastest, since every write and read has to copy data between user mode and kernel mode.

Shared memory removes the overhead of copying data between user mode and kernel mode that message queue communication incurs. A shared region is allocated that each process can access directly, as quickly and conveniently as its own memory, without trapping into kernel mode or making a system call for each access. This greatly improves communication speed, and shared memory is known as the fastest inter-process communication method. However, this convenience and efficiency bring a new problem: multiple processes competing for the same shared resource can corrupt the data.

Semaphores are therefore needed to protect shared resources, ensuring that only one process accesses the shared resource at any time, that is, mutually exclusive access. Semaphores can provide not only mutual exclusion but also synchronization between processes. A semaphore is essentially a counter that represents the number of available resources, and its value is controlled by two atomic operations: the P operation and the V operation.
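The P/V idea can be sketched in Java with java.util.concurrent.Semaphore. Note the assumption: this class coordinates threads inside one JVM rather than separate processes, so it only illustrates the counter semantics (acquire plays the role of P, release the role of V). The class name SemaphoreDemo and the method run are made up for this example.

```java
import java.util.concurrent.Semaphore;

class SemaphoreDemo {
    private static int shared = 0; // the "shared resource" being protected

    // Runs several threads that all increment the shared counter under a binary semaphore.
    static int run(int threads, int incrementsPerThread) {
        shared = 0;
        Semaphore mutex = new Semaphore(1); // counter starts at 1: one holder at a time
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    mutex.acquireUninterruptibly(); // P: decrement, block while the counter is 0
                    try {
                        shared++;
                    } finally {
                        mutex.release(); // V: increment, wake one waiter if any
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // prints 4000
    }
}
```

Without the semaphore, the unsynchronized shared++ could lose updates; with it, every increment happens under mutual exclusion.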

The signal has a name very similar to the semaphore, but their functions are completely different. Signals let the application process and the kernel interact directly, and the kernel can also use signals to notify user-space processes of system events. Signal events mainly come from hardware sources (such as Ctrl+C on the keyboard) and software sources (such as the kill command). Once a signal arrives, the process can respond in one of three ways: 1. perform the default action, 2. catch the signal, 3. ignore the signal. Two signals cannot be caught or ignored by the application process, namely SIGKILL and SIGSTOP, so that a process can always be ended or stopped.

The communication mechanisms mentioned above all work on the same host. To communicate with processes on other hosts, you need Socket communication. Sockets are not only used for communication between processes on different hosts but also between processes on the local host. Depending on the type of Socket created, there are three common communication methods: one based on the TCP protocol, one based on the UDP protocol, and one for local inter-process communication.
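As a rough illustration of the TCP flavor, the sketch below starts a one-shot server on the loopback interface and has a client send it a single line. Both ends run as threads of one JVM here purely to keep the example self-contained; real use would have them in separate processes or on separate hosts. The names LoopbackSocketDemo, echoOnce, and serveOne are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LoopbackSocketDemo {
    // Sends one line to a throwaway local TCP server and returns the server's reply.
    static String echoOnce(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; the socket is bound as soon as it is constructed.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            new Thread(() -> serveOne(server)).start();
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server side: accept one connection, read one line, answer with a prefixed copy.
    private static void serveOne(ServerSocket server) {
        try (Socket peer = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(peer.getInputStream()));
             PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello")); // prints echo: hello
    }
}
```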

What is the difference between kernel mode and user mode? What low-level operations run in kernel mode? Why are there two different modes?

Kernel mode and user mode are two operating modes in the operating system. They mainly differ in permissions and actions that can be performed:

1. Kernel Mode: In kernel mode, the CPU can execute all instructions and access all hardware resources. Operations in this mode have higher privileges and are mainly used to run the operating system kernel.

2. User Mode: In user mode, the CPU can only execute part of the instruction set and cannot directly access hardware resources. This mode has lower operating authority and is mainly used to run user programs.

The low-level operations performed in kernel mode mainly include memory management, process management, device driver control, and the handling of system calls. These operations involve the core functions of the operating system and require higher privileges to execute.

The main reasons for dividing into kernel mode and user mode are as follows:

1. Security: Through the division of permissions, user programs cannot directly access hardware resources, thereby avoiding damage to system resources by malicious programs.

2. Stability: When a user-mode program fails, the failure is confined to that process and normally does not affect the entire system, which avoids system crashes caused by program faults.

3. Isolation: The division of kernel mode and user mode makes a clear boundary between the operating system kernel and user programs, which is conducive to system modularization and maintenance.

The division of kernel mode and user mode helps to ensure the security, stability and maintainability of the operating system.

MySQL

Given an SQL statement, what is the best way to add an index for it?

In MySQL, creating indexes can improve query performance. To add an index to a column, we can use the following statement:

CREATE INDEX index_name ON table_name(column_name);

Here, index_name is the name you give the index, table_name is the name of the table you want to add the index to, and column_name is the name of the column you want to index.

Consider the following when choosing which columns to index:

1. For columns that are often used in query conditions, adding indexes can improve query speed.

2. For columns with many repeated values (low selectivity), an index may bring little performance improvement.

3. Avoid creating too many indexes on very large tables, because every index must be maintained on each insert and update, which slows those operations down.

What is a joint index?

An index built by combining multiple fields is called a joint index (also known as a composite index).

For example, to combine the product_no and name fields in the product table into a joint index (product_no, name), the way to create a joint index is as follows:

CREATE INDEX index_product_no_name ON product(product_no, name);

The B+Tree of the joint index (product_no, name) is shown in the figure below. (The figure draws a singly linked list between the leaf nodes, but it is actually a doubly linked list. I could not find the original image to fix it, so please read it as a doubly linked list.)

[Figure: joint index]

As the figure shows, the non-leaf nodes of the joint index use the values of both fields as the B+Tree key. When data is looked up in the joint index, entries are compared by the product_no field first, and by the name field only when product_no is equal.

In other words, the B+Tree of the joint index is sorted by product_no first, and by name among entries with the same product_no.

Therefore, joint indexes follow the leftmost matching principle: index matching proceeds from the leftmost column. If a query does not follow this principle, for example by filtering only on name without product_no, the joint index cannot be used for the lookup and the query loses the fast search the index provides.
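Why the leftmost column matters can be sketched with an ordinary sorted set standing in for the B+Tree leaves. This is only a toy model under my own assumptions, not how MySQL is implemented; the class JointIndexSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class JointIndexSketch {
    // One index entry: the composite key (productNo, name).
    static final class Entry {
        final int productNo;
        final String name;
        Entry(int productNo, String name) {
            this.productNo = productNo;
            this.name = name;
        }
    }

    // Same ordering as the joint index: productNo first, then name as a tie-breaker.
    private static final Comparator<Entry> ORDER =
            Comparator.comparingInt((Entry e) -> e.productNo).thenComparing(e -> e.name);

    private final TreeSet<Entry> index = new TreeSet<>(ORDER);

    void add(int productNo, String name) {
        index.add(new Entry(productNo, name));
    }

    // WHERE product_no = ?: matching keys are contiguous, so a range scan is enough.
    // "" sorts before every other String, so [(p, ""), (p + 1, "")) covers all keys with this productNo.
    List<String> namesForProduct(int productNo) {
        List<String> out = new ArrayList<>();
        for (Entry e : index.subSet(new Entry(productNo, ""), new Entry(productNo + 1, ""))) {
            out.add(e.name);
        }
        return out;
    }

    // WHERE name = ?: matching keys are scattered across the order, so every entry must be visited.
    List<Integer> productsForName(String name) {
        List<Integer> out = new ArrayList<>();
        for (Entry e : index) {
            if (e.name.equals(name)) {
                out.add(e.productNo);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        JointIndexSketch idx = new JointIndexSketch();
        idx.add(1, "b");
        idx.add(1, "a");
        idx.add(2, "a");
        idx.add(3, "c");
        System.out.println(idx.namesForProduct(1));   // prints [a, b]
        System.out.println(idx.productsForName("a")); // prints [1, 2]
    }
}
```

The first lookup touches only the slice of keys that share the leftmost column, while the second has to walk the whole structure, which is the situation the leftmost matching principle describes.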

What are the four isolation levels in MySQL?

  • Read uncommitted: changes made by a transaction can be seen by other transactions before it has committed;

  • Read committed: changes made by a transaction can be seen by other transactions only after it has committed;

  • Repeatable read: the data a transaction sees during its execution is always consistent with the data it saw when it started. This is the default isolation level of the MySQL InnoDB engine;

  • Serializable: read-write locks are placed on records. When multiple transactions read and write the same record and a read-write conflict occurs, the transaction that accessed it later must wait for the earlier transaction to complete before it can continue.

For different isolation levels, the phenomena that may occur during concurrent transactions will also be different.

That is to say:

  • Under the "read uncommitted" isolation level, dirty reads, non-repeatable reads, and phantom reads may occur;

  • Under the "read committed" isolation level, non-repeatable reads and phantom reads may occur, but dirty reads are impossible;

  • Under the "repeatable read" isolation level, phantom reads may occur, but dirty reads and non-repeatable reads are impossible;

  • Under the "serializable" isolation level, dirty reads, non-repeatable reads, and phantom reads are impossible.

What are dirty reads, phantom reads, and non-repeatable reads?

  • Dirty read: a transaction reads data that another transaction has modified but not yet committed.

  • Non-repeatable read: a transaction reads the same data multiple times and gets different values on two successive reads.

  • Phantom read: a transaction runs the same query for the number of records matching a condition multiple times, and two successive queries return a different number of records.

How does MySQL's InnoDB avoid non-repeatable reads?

The default isolation level of MySQL's InnoDB is repeatable read. After a transaction starts, a Read View is generated when its first select statement executes, and that same Read View is used by every select for the rest of the transaction. The data read during the transaction is therefore consistent from one read to the next, so non-repeatable reads are avoided.
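A simplified sketch of how a fixed Read View decides which row versions a transaction may see. The field names (creatorTrxId, minTrxId, maxTrxId, the active id set) follow the way InnoDB's read view is commonly described, but this class is my own toy model with assumed names, not InnoDB code.

```java
import java.util.HashSet;
import java.util.Set;

class ReadViewSketch {
    private final long creatorTrxId;                        // transaction that created the view
    private final long minTrxId;                            // smallest id among active transactions
    private final long maxTrxId;                            // id the next new transaction would get
    private final Set<Long> activeTrxIds = new HashSet<>(); // uncommitted when the view was created

    ReadViewSketch(long creatorTrxId, long maxTrxId, long... activeIds) {
        this.creatorTrxId = creatorTrxId;
        this.maxTrxId = maxTrxId;
        long min = maxTrxId;
        for (long id : activeIds) {
            activeTrxIds.add(id);
            min = Math.min(min, id);
        }
        this.minTrxId = min;
    }

    // Is a row version last written by rowTrxId visible through this view?
    boolean isVisible(long rowTrxId) {
        if (rowTrxId == creatorTrxId) return true; // the transaction's own changes
        if (rowTrxId < minTrxId) return true;      // committed before the view was created
        if (rowTrxId >= maxTrxId) return false;    // started after the view was created
        return !activeTrxIds.contains(rowTrxId);   // in between: visible only if already committed
    }

    public static void main(String[] args) {
        // View created by transaction 51 while 51 and 52 were active; the next id would be 53.
        ReadViewSketch view = new ReadViewSketch(51, 53, 51, 52);
        System.out.println(view.isVisible(50)); // true
        System.out.println(view.isVisible(52)); // false
        System.out.println(view.isVisible(53)); // false
        System.out.println(view.isVisible(51)); // true
    }
}
```

Because the view never changes after it is created, a version written by transaction 52 stays invisible even if 52 commits later, which is why repeated reads return the same data.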

Network

What happens after you enter a URL?

Answer: DNS resolution at the application layer, a TCP connection at the transport layer, IP at the network layer, MAC at the data link layer, and finally the physical layer.

The specific process of DNS resolution

DNS domain name resolution, simply put, translates domain names into IP addresses. For example, it translates the domain name www.baidu.com into the corresponding IP 220.181.38.251 (this address is just an example).

 

[Figure: Domain name resolution process]

The above figure describes domain name resolution in 8 steps. Before those steps run, however, the local caches and the hosts file are checked first.

Before any DNS server is asked, two caches are checked:

Browser cache check: The browser first searches its own DNS cache. Entries are kept only briefly, about 1 minute, and the cache holds only about 1000 entries. If a matching entry exists and has not expired, resolution ends here.

Operating system cache check + hosts resolution: If no matching entry is found in the browser cache, the browser searches the operating system's DNS cache for a result for the domain name. If one is found and has not expired, the search stops and resolution ends here. On Linux, the /etc/hosts file can map any domain name to any reachable IP address. If an IP address is specified there for a domain name, the browser uses that address first. When a domain name from this file is resolved, the operating system caches the result, and how long it stays cached depends on the domain name's expiration time and the size of the cache.

Then DNS resolution proper is performed:

Step 1: The client visits the website www.baidu.com through the browser and initiates a DNS request for the domain name's IP address. The request is sent to the local DNS server, which first checks its own cache. If the record is cached, it returns the result directly. If not, the local DNS server queries the DNS root server.

Step 2: The local DNS server sends a DNS request to the root server, asking for the IP address of the domain name www.baidu.com.

Step 3: The root server has no record mapping this domain name to an IP address. Instead, it tells the local DNS server to continue the query at the top-level domain server and gives the address of that server (the .com server).

Step 4: The local DNS server sends a DNS request to the .com server, asking for the IP address of the domain name www.baidu.com.

Step 5: The .com server does not return the mapping between the domain name and the IP address directly either. It tells the local DNS server that the domain name can be resolved by the baidu.com domain name server and gives the address of the baidu.com domain name server.

Step 6: The local DNS server sends a DNS request to the baidu.com domain name server, asking for the IP address of the domain name www.baidu.com.

Step 7: After receiving the request, the baidu.com server finds the mapping between the domain name and the IP address in its records and returns the IP address to the local DNS server.

Step 8: The local DNS server returns the acquired IP address corresponding to the domain name to the client, and saves the corresponding relationship between the domain name and the IP address in the cache for use when other users query next time.
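The eight steps can be imitated offline with a few in-memory "servers". Everything in this sketch is hypothetical (the classes Server and LocalResolver, the referral tables, the trace list); it performs no real DNS traffic and only mirrors the order in which the local DNS server asks root, .com, and baidu.com, then caches the answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DnsWalkSketch {
    // A toy name server: it either knows the final answer or refers the caller elsewhere.
    static final class Server {
        final String name;
        final Map<String, String> answers = new HashMap<>();         // domain -> IP
        final Map<String, Server> referrals = new LinkedHashMap<>(); // domain suffix -> next server
        Server(String name) {
            this.name = name;
        }
    }

    // Plays the role of the local DNS server: checks its cache, then walks down from the root.
    static final class LocalResolver {
        private final Server root;
        private final Map<String, String> cache = new HashMap<>();
        final List<String> trace = new ArrayList<>(); // which servers were asked, in order

        LocalResolver(Server root) {
            this.root = root;
        }

        String resolve(String domain) {
            String cached = cache.get(domain);
            if (cached != null) {
                trace.add("cache");
                return cached;
            }
            Server current = root;
            while (current != null) {
                trace.add(current.name);
                String ip = current.answers.get(domain);
                if (ip != null) {
                    cache.put(domain, ip); // step 8: remember the result for later queries
                    return ip;
                }
                Server next = null;
                for (Map.Entry<String, Server> referral : current.referrals.entrySet()) {
                    if (domain.endsWith(referral.getKey())) {
                        next = referral.getValue();
                        break;
                    }
                }
                current = next;
            }
            return null; // no server could answer
        }
    }

    // Wires up root -> .com -> baidu.com with the example IP used in the text.
    static LocalResolver demoResolver() {
        Server root = new Server("root");
        Server com = new Server(".com");
        Server baidu = new Server("baidu.com");
        root.referrals.put(".com", com);
        com.referrals.put("baidu.com", baidu);
        baidu.answers.put("www.baidu.com", "220.181.38.251");
        return new LocalResolver(root);
    }

    public static void main(String[] args) {
        LocalResolver resolver = demoResolver();
        System.out.println(resolver.resolve("www.baidu.com")); // prints 220.181.38.251
        System.out.println(resolver.trace);                    // prints [root, .com, baidu.com]
        resolver.resolve("www.baidu.com");
        System.out.println(resolver.trace);                    // prints [root, .com, baidu.com, cache]
    }
}
```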

Finally, a figure borrowed from another author summarizes the process: [Figure: DNS resolution summary]

 

Causes of TCP packet splitting and sticking

TCP packet splitting and sticking are caused by the characteristics of the TCP protocol and by various factors in network transmission.

1. TCP is a byte-stream transport layer protocol with no fixed message boundaries. The sender may divide the data into multiple segments for transmission, and the receiver has to reassemble complete messages from the stream. Splitting and sticking can occur in this process.

2. Delay and congestion in the network affect when data is sent and when it reaches the receiver. The receiver may therefore get several messages in a single read, or one message spread across several reads.

3. The receiver's buffer size is limited. When the buffer is not large enough to hold a complete message, one read returns only part of it, which results in splitting.

The following methods can be used to solve TCP packet splitting and sticking:

1. Mark message boundaries at the application layer, for example by adding a header that contains information such as the message length, so that the receiver can reassemble messages accurately.

2. Use fixed-length messages or special delimiters so that the receiver can identify message boundaries.

3. Use a higher-level protocol on top of TCP, such as WebSocket, which adds the concept of data frames and so handles splitting and sticking for you.
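The length-header approach can be sketched as follows. I assume a 4-byte big-endian length prefix followed by a UTF-8 body; this format and the names LengthPrefixFraming, encode, decodeAll, and roundTrip are choices made for this example, not a standard.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthPrefixFraming {
    // Sender side: a 4-byte length header followed by the message body.
    static byte[] encode(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Receiver side: read frames from the byte stream until it ends.
    static List<String> decodeAll(InputStream stream) {
        List<String> messages = new ArrayList<>();
        DataInputStream in = new DataInputStream(stream);
        try {
            while (true) {
                int length;
                try {
                    length = in.readInt(); // header tells us where this message ends
                } catch (EOFException endOfStream) {
                    break;
                }
                byte[] payload = new byte[length];
                in.readFully(payload); // keeps reading until the whole body has arrived
                messages.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    // Simulates sticking: all frames are glued into one byte array before decoding.
    static List<String> roundTrip(String... messages) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        for (String message : messages) {
            byte[] frame = encode(message);
            wire.write(frame, 0, frame.length);
        }
        return decodeAll(new ByteArrayInputStream(wire.toByteArray()));
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", "", "tcp")); // prints [hello, , tcp]
    }
}
```

Even though the three frames arrive as one undivided run of bytes, the receiver recovers the original boundaries, including the empty message in the middle.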

Algorithm

Handwritten LRU
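The post does not record the solution that was written in the interview, so the following is only one common way to hand-write an LRU cache: a HashMap for O(1) lookup plus a doubly linked list that keeps entries in recency order. The class name LruCache and its methods are my own choices.

```java
import java.util.HashMap;
import java.util.Map;

class LruCache<K, V> {
    private static final class Node<K, V> {
        K key;
        V value;
        Node<K, V> prev;
        Node<K, V> next;
        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final int capacity;
    private final Map<K, Node<K, V>> map = new HashMap<>();
    private final Node<K, V> head = new Node<>(null, null); // sentinel on the most recent side
    private final Node<K, V> tail = new Node<>(null, null); // sentinel on the least recent side

    LruCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    // Returns the value or null, and marks the entry as most recently used.
    V get(K key) {
        Node<K, V> node = map.get(key);
        if (node == null) {
            return null;
        }
        unlink(node);
        linkFirst(node);
        return node.value;
    }

    // Inserts or updates an entry, evicting the least recently used one when full.
    void put(K key, V value) {
        Node<K, V> node = map.get(key);
        if (node != null) {
            node.value = value;
            unlink(node);
            linkFirst(node);
            return;
        }
        if (map.size() == capacity) {
            Node<K, V> oldest = tail.prev;
            unlink(oldest);
            map.remove(oldest.key);
        }
        node = new Node<>(key, value);
        map.put(key, node);
        linkFirst(node);
    }

    private void unlink(Node<K, V> node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void linkFirst(Node<K, V> node) {
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }

    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<>(2);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);       // key 1 becomes the most recently used
        cache.put(3, "c");  // evicts key 2
        System.out.println(cache.get(2)); // prints null
        System.out.println(cache.get(1)); // prints a
    }
}
```

A shorter alternative is to extend LinkedHashMap in access order and override removeEldestEntry, but interviewers asking for a handwritten version usually want the list manipulation shown explicitly.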

Interview experience

The interviewer picked standard rote questions based on the sections of my resume, and they were all very basic. No questions about my projects were asked, and I advanced to the second round.

Origin blog.csdn.net/JACK_SUJAVA/article/details/130582650