A ByteDance interview experience, packed with value~

Today I'm sharing a reader's experience from a second-round ByteDance spring-recruitment interview; the position is back-end development.

No programming-language questions were asked; the interview covered networking + projects + MySQL + Redis instead.

Question record

How can message middleware reduce the pressure of message persistence, and why does it work?

Reader's answer: When there is a sudden burst of messages, the middleware achieves traffic peak shaving: the system keeps operating normally even when the consumers' consumption capacity cannot keep up with the rate at which producers generate messages.

How do you deal with a message backlog in the message queue?

Reader's answer:

(1) In the reader's own scenario, a backlog is temporary and only occurs in emergencies; even without extra handling, the messages get consumed as time goes by.

(2) If the backlog persists, consider temporarily adding consumers to increase consumption capacity.

Supplement:

If a sudden problem occurs online, temporarily scale out by adding consumers while downgrading some non-core business. Absorbing the traffic through scaling and downgrading demonstrates your ability to handle emergencies. The second step is to troubleshoot the underlying anomaly: use monitoring, logs, and so on to analyze whether the consumer-side business logic has a problem, and optimize the consumer's processing logic.

In concurrent programming, what problems arise when locks are needed but not used?

Reader's answer: Two threads using the same global variable can become inconsistent. For example, if thread A adds 1 to the global variable and thread B then reads a stale cached value, B will not see the update, producing an inconsistency.
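A quick sketch of the lost-update problem the reader describes, in Python (illustrative only; the variable names are invented):

```python
import threading

# Deterministic simulation of a lost update: both "threads" read the
# shared value before either writes back, so one increment disappears.
shared = 0
read_by_a = shared       # thread A reads 0
read_by_b = shared       # thread B reads the same stale 0
shared = read_by_a + 1   # A writes 1
shared = read_by_b + 1   # B also writes 1 -- A's update is lost
print(shared)            # 1, not 2

# With a lock, the read-modify-write becomes atomic and no update is lost.
counter = 0
lock = threading.Lock()

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)           # 200000, every increment counted
```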

How to avoid deadlock?

Reader's answer: Before using locks, consider the conditions for deadlock: mutual exclusion, hold and wait, and circular wait.

Addressing those conditions, you can: allocate all resources at once, allow held resources to be preempted, and impose a fixed order on resource allocation.
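The fixed-allocation-order idea can be sketched in a few lines of Python (names invented for illustration): each thread names the locks in opposite order, but always acquiring them in one global order breaks the circular-wait condition.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
done = []

def with_both(first, second, label):
    # Always acquire the two locks in one fixed global order (here, by
    # id), so no circular wait -- and hence no deadlock -- can form.
    lo, hi = sorted((first, second), key=id)
    for _ in range(10_000):
        with lo:
            with hi:
                pass  # critical section using both resources
    done.append(label)

# Each caller names the locks in the opposite order; the sorted
# acquisition order still prevents deadlock.
t1 = threading.Thread(target=with_both, args=(lock_a, lock_b, "t1"))
t2 = threading.Thread(target=with_both, args=(lock_b, lock_a, "t2"))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(done))  # ['t1', 't2'] -- both threads finished
```

Without the `sorted` step, the two threads could each hold one lock while waiting for the other and hang forever.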

What are the HTTP status codes?

Reader's answer:

1xx: informational responses; for example, 100 indicates a provisional response.

2xx: success.

3xx: redirection; 301 permanent move, 308 temporary move (is it 308? Oh no, it's 307, I don't remember it clearly).

4xx: client-side problems; 404 means the resource was not found, 403 means the client lacks access rights.

5xx: server-side problems; 502 is a gateway problem.

Interviewer: 307 is indeed the temporary redirect, but it is relatively new; the older code is 302, and the industry still mostly uses the old one.

Supplement:

The five classes of HTTP status codes

1xx status codes are informational: intermediate states during protocol processing, rarely used in practice.

2xx status codes indicate that the server successfully processed the client's request, the status we most want to see.

" 200 OK " is the most common success status code, indicating that everything is OK. If it is a non-  HEAD request, the response header returned by the server will have body data.

" 204 No Content " is also a common success status code, which is basically the same as 200 OK, but there is no body data in the response header.

" 206 Partial Content " is applied to HTTP chunked download or resumable upload, indicating that the body data returned by the response is not all of the resource, but a part of it, and it is also the status of the server's successful processing.

3xx status codes indicate that the resource the client requested has changed and the client must resend the request to a new URL, i.e., a redirect.

" 301 Moved Permanently " means permanent redirection, indicating that the requested resource no longer exists, and a new URL needs to be used to visit again.

" 302 Found " indicates a temporary redirection, indicating that the requested resource is still there, but it needs to be accessed by another URL temporarily.

Both 301 and 302 use the Location field in the response header to indicate the URL to redirect to, and the browser automatically jumps to the new URL.

" 304 Not Modified " does not have the meaning of jumping, indicating that the resource has not been modified, redirecting the existing buffer file, also known as cache redirection, that is, telling the client that the cache resource can continue to be used for cache control.

4xx status codes mean the message sent by the client was wrong and the server cannot process it; these are client error codes.

" 400 Bad Request " indicates that there is an error in the message requested by the client, but it is only a general error.

" 403 Forbidden " means that the server forbids access to resources, not that the client's request is wrong.

" 404 Not Found " indicates that the requested resource does not exist or is not found on the server, so it cannot be provided to the client.

5xx status codes mean the client's request was correct but an internal error occurred while the server processed it; these are server error codes.

" 500 Internal Server Error " and 400 types are general and common error codes. We don't know what error occurred on the server.

" 501 Not Implemented " indicates that the function requested by the client is not yet supported, similar to "opening soon, please look forward to it".

" 502 Bad Gateway " is usually an error code returned by the server as a gateway or proxy, indicating that the server itself is working normally, and an error occurred when accessing the back-end server.

" 503 Service Unavailable " means that the server is currently busy and cannot respond to the client temporarily, similar to "the network service is busy, please try again later".

What are the differences between HTTP/1.0 and 2.0, or between 2.0 and 3.0?

Reader's answer:

(1) In HTTP/1.0, each request establishes a TCP connection with the server and disconnects immediately after the request is handled: short-lived TCP connections. HTTP/1.1 improved on this slightly.

(2) HTTP/2.0 provides multiplexing, plus binary framing and header compression.

Supplement:

HTTP/2 performance improvements over HTTP/1.1:

Header compression

Binary format

Concurrent transmission

Server push

1. Header compression

HTTP/2 compresses headers. If you send multiple requests whose headers are identical or similar, the protocol eliminates the duplicated parts for you.

This is the HPACK algorithm: the client and server each maintain a table of header fields. Every field stored in the table is assigned an index number; afterwards the same field is not sent again, only its index, which improves speed.
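A toy sketch of the HPACK indexing idea (not the real wire format, which also has a predefined static table and Huffman coding; all names here are invented):

```python
def make_encoder():
    # Both sides keep a table of header fields seen so far; a repeated
    # field is replaced by its index instead of being re-sent.
    table = {}
    def encode(headers):
        out = []
        for field in headers:                  # field = (name, value)
            if field in table:
                out.append(("indexed", table[field]))
            else:
                table[field] = len(table) + 1  # assign the next index
                out.append(("literal", *field))
        return out
    return encode

encode = make_encoder()
first = encode([(":method", "GET"), ("user-agent", "demo")])
second = encode([(":method", "GET"), ("user-agent", "demo")])
print(first)   # [('literal', ':method', 'GET'), ('literal', 'user-agent', 'demo')]
print(second)  # [('indexed', 1), ('indexed', 2)]
```

The second request's headers shrink to two small index references, which is exactly the duplicated-part elimination described above.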

2. Binary format

HTTP/2 is no longer a plain-text protocol like HTTP/1.1 but adopts a binary format throughout. Both the header and the body are binary, and collectively they are called frames: HEADERS frames and DATA frames.

[figure: HTTP/1 vs HTTP/2 message format]

Although this is unfriendly to humans, it is very friendly to computers, which only understand binary: after receiving a message there is no need to convert plain text into binary, the binary message is parsed directly, which improves transmission efficiency.

3. Concurrent transmission

As we know, HTTP/1.1 is based on the request-response model: within one connection, HTTP must complete one transaction (a request and its response) before processing the next. While a request is out and awaiting its response, nothing else can happen on the connection; if the response is delayed, subsequent requests cannot be sent. This causes head-of-line blocking.

HTTP/2, by contrast, introduces the concept of the Stream: multiple Streams are multiplexed over a single TCP connection.

[figure: multiple Streams multiplexed over one TCP connection]

As the figure shows, one TCP connection contains multiple Streams, and a Stream can contain one or more Messages. A Message corresponds to a request or response in HTTP/1 and consists of HTTP headers and a body. A Message contains one or more Frames; the Frame is the smallest unit of HTTP/2 and stores HTTP/1's content (header and body) in a binary compressed format.

Unique Stream IDs distinguish different HTTP requests, and the receiver can reassemble HTTP messages in order by Stream ID. Frames belonging to different Streams can be sent out of order, so different Streams can be sent concurrently: HTTP/2 can interleave requests and responses in parallel.

For example, in the figure below, the server sends two responses in parallel and interleaved: Stream 1 and Stream 3. Both Streams run over one TCP connection; after receiving them, the client reassembles the HTTP messages in order by Stream ID.

[figure: server interleaving Stream 1 and Stream 3 over one TCP connection]

4. Server push

HTTP/2 also improves on the traditional request-response model to some extent: the server no longer only responds passively but can actively send messages to the client.

Both the client and the server can create Streams, with different Stream IDs: Streams created by the client must be odd-numbered, while Streams created by the server must be even-numbered.

For example, in the figure below, Stream 1 is a resource the client requested from the server; it was established by the client, so its ID is odd (1). Streams 2 and 4 are resources the server actively pushed to the client; they were created by the server, so their IDs are even (2 and 4).

[figure: odd-numbered client Stream 1; even-numbered server-push Streams 2 and 4]

As another example, suppose the client fetches an HTML file from the server over HTTP/1.1, and the HTML depends on a CSS file to render the page. The client then has to issue a second request for the CSS file, requiring two round trips, as in the left part of the figure below:

[figure: two round trips in HTTP/1.1 (left) vs server push in HTTP/2 (right)]

As shown in the right part of the figure, with HTTP/2, when the client requests the HTML the server can proactively push the CSS file as well, reducing the number of message exchanges.

What's wrong with HTTP/2?

HTTP/2 solves HTTP/1's head-of-line blocking through the concurrency of Streams, which looks perfect. Yet HTTP/2 still has a head-of-line blocking problem, just not at the HTTP level: it is at the TCP layer.

HTTP/2 transmits data over TCP, which is a byte-stream protocol. The TCP layer must ensure the received bytes are complete and contiguous before the kernel hands the buffered data to the HTTP application. If the first byte of a sequence has not arrived, the bytes received after it can only sit in the kernel buffer; only once that byte arrives can the HTTP/2 application layer read the data from the kernel. This is HTTP/2's head-of-line blocking problem.

[figure: HTTP/2 head-of-line blocking at the TCP layer]

For example, as shown below:

[figure: packet 3 lost in the network; packets 4-6 held in the kernel buffer until 3 is retransmitted]

In the figure, the sender transmitted many packets, each with its own sequence number (think of it as a TCP sequence number). Packet 3 was lost in the network; even after packets 4-6 were received, the TCP data in the kernel was not contiguous, so the receiver's application layer could not read it. Only after packet 3 is retransmitted can the application layer read the data from the kernel. This is HTTP/2's head-of-line blocking problem, occurring at the TCP layer.

Therefore, once packet loss occurs, the TCP retransmission mechanism kicks in, and all HTTP requests on that TCP connection must wait for the lost packet to be retransmitted.
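This in-order buffering behavior can be sketched with a few lines of Python (an illustrative model, not real kernel code):

```python
def deliver_in_order(arrivals):
    # Simulate TCP's in-order delivery: a packet is handed to the
    # application only once every packet with a smaller sequence
    # number has arrived; later packets wait in the buffer.
    buffered, delivered, expected = set(), [], 1
    for seq in arrivals:
        buffered.add(seq)
        while expected in buffered:
            buffered.remove(expected)
            delivered.append(expected)
            expected += 1
    return delivered

# Packet 3 is lost at first; packets 4-6 sit in the kernel buffer...
print(deliver_in_order([1, 2, 4, 5, 6]))      # [1, 2]
# ...until 3 is retransmitted, then everything unblocks at once.
print(deliver_in_order([1, 2, 4, 5, 6, 3]))   # [1, 2, 3, 4, 5, 6]
```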

So far we have seen that both HTTP/1.1 and HTTP/2 suffer head-of-line blocking:

HTTP/1.1's pipelining solves head-of-line blocking for requests, but not for responses, because the server must respond to received requests in order. If handling one request takes a long time, the next request can only be processed after that response finishes. This is HTTP-layer head-of-line blocking.

HTTP/2 solves HTTP-layer head-of-line blocking by multiplexing many requests over one TCP connection, but once a packet is lost, all HTTP requests on the connection block. This is TCP-layer head-of-line blocking.

Since HTTP/2's head-of-line blocking comes from TCP, HTTP/3 swaps HTTP's underlying transport from TCP to UDP!

[figure: protocol stacks from HTTP/1 to HTTP/3]

UDP transmission does not care about ordering or packet loss, so it avoids HTTP/2-style head-of-line blocking. UDP itself is unreliable, but the UDP-based QUIC protocol achieves reliable transmission comparable to TCP.

QUIC has the following three characteristics.

No head-of-line blocking

Faster connection establishment

Connection migration

1. No head-of-line blocking

QUIC also has concepts similar to HTTP/2's Streams and multiplexing: it can transmit multiple Streams concurrently over the same connection, and a Stream can be thought of as one HTTP request.

QUIC has its own mechanisms to guarantee reliable transmission. When packet loss occurs in one Stream, only that Stream is blocked; other Streams are unaffected, so there is no head-of-line blocking problem. This differs from HTTP/2, where a lost packet in one Stream stalls all Streams.

Thus the Streams on a QUIC connection have no dependencies on one another; they are all independent. If one Stream loses a packet, only that Stream is affected.

[figure: packet loss in one QUIC Stream does not block the others]

2. Faster connection establishment

For HTTP/1 and HTTP/2, TCP and TLS are layered separately: the transport layer implemented by the kernel and the presentation layer implemented by the openssl library. They are difficult to merge, so the handshakes happen in sequence: first the TCP handshake, then the TLS handshake.

HTTP/3 does require a QUIC handshake before transmitting data, but that handshake takes only 1 RTT. Its purpose is to confirm both parties' connection IDs, on which connection migration is based.

Moreover, HTTP/3's QUIC is not layered separately from TLS: QUIC contains TLS internally and carries TLS "records" inside its own frames. Since QUIC uses TLS 1.3, a single RTT can complete connection establishment and key negotiation "simultaneously", as shown below:

[figure: TCP + HTTPS (TLS 1.3) vs QUIC HTTPS handshakes]

Better still, on a subsequent connection the application data packets can be sent together with the QUIC handshake information (connection info + TLS info), achieving the effect of 0-RTT.

In the right part of the figure below, when an HTTP/3 session resumes, the payload is sent together with the first packet, achieving 0-RTT (bottom-right of the figure):

[figure: QUIC 1-RTT handshake vs 0-RTT session resumption]

3. Connection Migration

HTTP over TCP identifies a connection by the four-tuple (source IP, source port, destination IP, destination port).

[figure: the TCP four-tuple]

When a mobile device's network switches from 4G to WiFi, its IP address changes, so the connection must be torn down and re-established. Re-establishment includes the latency of the TCP three-way handshake and the TLS handshake, plus the slowdown of TCP slow start; the user experiences a sudden freeze. The cost of migrating a connection is therefore very high.

QUIC does not use the four-tuple to "bind" a connection; instead it marks the two communication endpoints with connection IDs. The client and server each choose a set of IDs to identify themselves, so even if a mobile device's network change gives it a new IP address, as long as the context (connection ID, TLS keys, and so on) is retained, the original connection can be reused seamlessly with no perceptible lag. This is the connection migration feature.

Verbal question: MySQL composite indexes

At first the interviewer gave a question about writing a SQL statement. I said I usually access the database through an ORM framework and rarely write raw SQL, so the question was changed to one about indexing.

Question:

index(a, b, c)

1. select * from T where a=x and b=y and c=z

2. select * from T where a=x and b>y and c=z

3. select * from T where c=z and a=x and b=y

4. select a, b from T where a=x and b>y

5. select count(*) from T where a=x

6. select count(*) from T where b=y

7. select count(*) from T

Reader's answer:

(1) By the leftmost-prefix rule, the query uses the index: first matching a, then b, then c.

(2) a uses the index; b, being ordered within the index, can still use it; c requires going back to the table.

(3) Because MySQL's query optimizer reorders the conditions, this is the same as (1).

(4) a uses the index and so does b; since a and b are both in the index, no table lookup is needed at the end.

(5) Uses the index.

(6) With only b, the composite index cannot be used, and the query has to go back to the table.

(7) No table lookup is needed.

Supplement:

1. All three fields a, b, and c can use the composite index.

2. Both a and b use the composite index, but because of the leftmost-prefix rule, fields after a range condition cannot use it for navigation. Since MySQL 5.6, however, index condition pushdown lets the InnoDB layer filter records against the c condition before returning them to the server layer for table lookups; compared with no pushdown, this reduces the number of table lookups.

3. The order of query conditions does not matter; the optimizer rearranges them, so all three fields a, b, and c can use the composite index.

4. Both a and b use the composite index; the query is a covering index, so no table lookup is needed.

5. a can use the composite index.

6. With only b, the composite index cannot be used for a seek. But since the table has a composite index, count(*) chooses to scan that index to count rows; the scan type is type=index.

7. Since the table has a composite index, count(*) chooses to scan the composite index to count rows; the scan type is type=index.

Why count(*) scans the composite (secondary) index rather than the clustered index: the same number of secondary-index records occupies less storage than clustered-index records, so the secondary index tree is smaller than the clustered index tree. Traversing it therefore costs less I/O, which is why the optimizer prefers the secondary index.
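The seven answers above follow from the leftmost-prefix rule, which can be modeled roughly in Python (a deliberate simplification: it ignores index condition pushdown and covering-index effects):

```python
def usable_prefix(index_cols, eq_cols, range_cols):
    # Toy model of the leftmost-prefix rule for a composite index:
    # walk the index columns left to right; an equality condition keeps
    # the prefix going, a range condition is used but stops it, and a
    # missing column stops it immediately.
    used = []
    for col in index_cols:
        if col in eq_cols:
            used.append(col)
        elif col in range_cols:
            used.append(col)   # range column is used for the scan...
            break              # ...but nothing after it can be used
        else:
            break
    return used

idx = ["a", "b", "c"]
print(usable_prefix(idx, {"a", "b", "c"}, set()))  # ['a', 'b', 'c']  (queries 1 and 3)
print(usable_prefix(idx, {"a", "c"}, {"b"}))       # ['a', 'b']       (query 2)
print(usable_prefix(idx, {"b"}, set()))            # []               (query 6)
```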

How Redis implements distributed locks

Reader's answer: The application gets a Redis connection and then sends a command to acquire the lock. (Interviewer: Which command?) SETNX. This command is atomic in Redis and sets a key to a value only if it does not exist. On success, the application enters the critical section and performs its operations. If the command fails, the lock is already held by another client, so the application must wait. After acquiring the lock and finishing the critical-section work, the DEL command deletes the key in Redis, releasing the lock.

Supplement:

Redis's SET command has an NX option meaning "insert only if the key does not exist", so it can implement a distributed lock:

1. If the key does not exist, the insertion succeeds, which can represent a successful lock acquisition;

2. If the key exists, the insertion fails, which can represent a failed lock acquisition.

When implementing a distributed lock on a single Redis node, the locking operation must satisfy three conditions:

1. Locking involves three steps: reading the lock variable, checking its value, and setting it. These must happen as one atomic operation, so we use the SET command with the NX option;

2. The lock variable needs an expiration time so that if a client crashes after acquiring the lock, the lock can still be released. So we add the EX/PX option to the SET command to set an expiry;

3. The lock variable's value must distinguish locking operations from different clients, to avoid accidentally releasing someone else's lock. So each client sets a unique value, identifying itself.

A SET command satisfying these three conditions looks like this:

SET lock_key unique_value NX PX 10000

  • lock_key is the key of the lock;

  • unique_value is a unique identifier generated by the client, used to distinguish lock operations from different clients;

  • NX means set lock_key only when lock_key does not already exist;

  • PX 10000 sets lock_key's expiration time to 10 s, preventing a client that crashes from holding the lock forever.

Unlocking means deleting the lock_key key (DEL lock_key), but it cannot be deleted blindly: we must ensure the client performing the deletion is the one that acquired the lock. So when unlocking, we first check whether the lock's unique_value belongs to this client, and only then delete the key.

Unlocking is thus two operations (check, then delete), so a Lua script is needed to keep it atomic: Redis executes Lua scripts atomically, which guarantees the atomicity of the release.

-- When releasing the lock, first compare unique_value to avoid releasing someone else's lock
if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end

With the SET command and this Lua script, locking and unlocking a distributed lock on a single Redis node is complete.
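The acquire/release pattern can be sketched in Python. To keep it self-contained, a plain dict stands in for the Redis node here (expiry is omitted); a real deployment would use a Redis client and the Lua script above, and all names in this sketch are invented:

```python
import uuid

class FakeRedis:
    # A dict standing in for a Redis node, supporting just enough of
    # SET NX / GET / DEL to demonstrate the locking pattern.
    def __init__(self):
        self.data = {}
    def set_nx(self, key, value):
        if key in self.data:
            return False          # key exists -> lock held by someone
        self.data[key] = value
        return True
    def get(self, key):
        return self.data.get(key)
    def delete(self, key):
        self.data.pop(key, None)

def acquire(r, key):
    token = str(uuid.uuid4())     # unique value identifying this client
    return token if r.set_nx(key, token) else None

def release(r, key, token):
    # Check-then-delete; in real Redis this pair must run as one Lua
    # script so the two steps stay atomic.
    if r.get(key) == token:
        r.delete(key)
        return True
    return False                  # not our lock -> refuse to release

r = FakeRedis()
t1 = acquire(r, "lock_key")
print(t1 is not None)             # True: first client gets the lock
print(acquire(r, "lock_key"))     # None: second client fails
print(release(r, "lock_key", "wrong-token"))  # False: wrong owner
print(release(r, "lock_key", t1))             # True: owner releases
```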

Brain teaser: Two players take turns tossing a coin, and the first to toss heads wins. What is the probability that the player who tosses first wins?

Reader's answer: ABAB... Consider one round of two tosses: (heads, -), (tails, heads), (tails, tails). (heads, -) expands into the two equally likely outcomes (heads, heads) and (heads, tails), while (tails, tails) is equivalent to starting the game over. So (heads, -) carries 2 shares and (tails, heads) carries 1 share, giving the first player a winning probability of 2/3.
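The 2/3 answer checks out as a geometric series; a quick verification with Python's fractions module:

```python
from fractions import Fraction

# The first player wins on toss 1, 3, 5, ...; summing the geometric
# series gives P = (1/2) / (1 - 1/4) = 2/3.
p = sum(Fraction(1, 2) ** (2 * k + 1) for k in range(50))
print(float(p))  # approximately 0.6667

# Equivalently, solve P = 1/2 + (1/4) * P: both players tossing tails
# (probability 1/4) restarts the game.
exact = Fraction(1, 2) / (1 - Fraction(1, 4))
print(exact)  # 2/3
```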

Algorithm question: Find the peak

Given an input array nums where nums[i] != nums[i+1], find a peak element and return its index.

The array may contain multiple peaks, in which case it is sufficient to return the location of any one of the peaks.

You can assume nums[-1] = nums[n] = -∞.

Reader's answer:

(1) Traverse the increasing sequence and return the first position that stops increasing; time complexity O(n).

(2) Find the maximum value; time complexity is still O(n).

(3) A binary-search variant: take the middle position m and compare nums[m] with nums[m+1]. If nums[m] is greater, a peak lies on the left side (including m); otherwise one lies on the right. Move the corresponding boundary each step.
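A sketch of the binary-search variant from answer (3):

```python
def find_peak(nums):
    # Compare mid with mid+1: if nums[m] > nums[m+1], a peak exists at
    # m or to its left; otherwise one exists strictly to the right.
    # Runs in O(log n).
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        m = (lo + hi) // 2
        if nums[m] > nums[m + 1]:
            hi = m          # peak at m or on the left
        else:
            lo = m + 1      # peak strictly on the right
    return lo

print(find_peak([1, 2, 3, 1]))  # 2
```

Because nums[-1] and nums[n] are treated as negative infinity, the loop always narrows toward some peak, and any peak index is an acceptable answer.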

Questions for the interviewer

(1) What problems did you see in my interview, and which areas should I strengthen? Or, if I want to do back-end development, what knowledge am I still missing?

Interviewer: You could study the database part more deeply. The microservice part is also worth studying; for instance, when I asked just now whether you could talk about microservices, I actually hoped you understood them. If you had, I would have asked questions in that area.

(2) If I get the chance to intern, what kind of work would I be doing?

Interviewer: On the back end, we do B2B work, and our product is somewhat similar in form to a database. It would help if you could dig into and understand database implementation and design.

Interview summary

Impressions

The overall communication felt smooth, and the interviewer was very kind: when I answered questions or got stuck, he guided me well, and I felt very comfortable.

Shortcomings

1. My answers to some unprepared questions were poor, such as comparing HTTP/1.0 and 2.0; those answers were very disorganized.

2. I used too many unnecessary filler words and rambled while answering, which leaves a bad impression on questions I was unsure about.


Source: blog.csdn.net/JACK_SUJAVA/article/details/130153851