ByteDance grilled him for a full hour, and the guy still got the offer. Brutal! (ByteDance interview questions)

Foreword:

In Nien's reader community (50+ groups), there are often members preparing to interview at big companies such as Toutiao, Meituan, Alibaba, and JD.com.

Below is the experience of one member who successfully landed a ByteDance Feishu offer after a one-hour grilling. In two words, the interview was:

  • deep: very deep
  • wide: very wide

In short, the Toutiao interviewer's skills were quite sharp.

However, our candidate was no pushover either.

So, judging from this candidate's interview experience, what do you need to learn to land a Feishu offer? This member is an intermediate developer, but these interview questions are a useful reference for intermediate and senior developers alike.

The questions and reference answers have also been included in V69 of our "Nien's Java Interview Collection", to help everyone improve their high-concurrency, high-performance, high-availability architecture, design, and development skills.

Note: this article is continuously updated as a PDF. For the latest PDF of Nien's architecture notes and interview questions, see the official account [Technical Freedom Circle] at the end of the article.

Feishu interview topic:

1. What is Spring circular dependency? How does Spring solve it?

Spring circular dependency refers to the situation where two or more beans depend on each other to form a circular dependency. For example, Bean A depends on Bean B, and Bean B depends on Bean A, thus forming a circular dependency.

Spring resolves circular dependencies among singleton beans (with setter or field injection) through "early exposure" and a "three-level cache":

  • Level 1 (singletonObjects): fully initialized singleton beans.
  • Level 2 (earlySingletonObjects): early references to beans that have been instantiated but not yet fully initialized.
  • Level 3 (singletonFactories): ObjectFactory instances that can produce an early reference to a bean (applying an AOP proxy if necessary).

Specifically, when Spring creates Bean A, it instantiates A and puts an ObjectFactory for it into the third-level cache before populating A's properties. While injecting A's dependencies, Spring finds that A needs Bean B and starts creating B. If B in turn depends on A, Spring obtains A's early reference via the third-level cache (promoting it to the second-level cache) and injects it into B. B can then finish initializing, after which A finishes as well; fully initialized beans are moved into the first-level cache.

In addition, Spring also provides a variety of solutions to avoid circular dependency problems, such as using constructor injection, using setter method injection, using @Lazy annotation, etc.
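As a minimal sketch (class names are illustrative, not from the interview), the following shows a field-injection cycle that Spring can resolve, and how @Lazy breaks a constructor-injection cycle that it cannot:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Lazy;
import org.springframework.stereotype.Component;

@Component
class BeanA {

    @Autowired
    private BeanB beanB; // field injection: resolvable via the three-level cache
}

@Component
class BeanB {

    private final BeanA beanA;

    // Constructor injection on both sides would fail with
    // BeanCurrentlyInCreationException; @Lazy injects a deferred proxy
    // and breaks the cycle.
    BeanB(@Lazy BeanA beanA) {
        this.beanA = beanA;
    }
}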

It should be noted that Spring's mechanism for resolving circular dependencies is not a cure-all. It only works for singleton beans that use setter or field injection; a cycle made entirely of constructor injection, or one involving prototype-scoped beans, cannot be resolved and fails with a BeanCurrentlyInCreationException. The three-level cache also occupies some extra memory during container startup.

Therefore, when writing code, you should try to avoid circular dependencies in the first place.

2. How to compute the longest common substring with high performance

The longest common substring problem is a classic computational problem: given two strings, find the longest string that occurs as a contiguous substring of both. The problem can be solved with a variety of algorithms, some of which achieve high performance.

One high-performance approach is the suffix tree algorithm.

A suffix tree is a data structure that represents all suffixes of a string; every substring of the string appears as a path starting at the root. (A simplified, uncompressed variant is the suffix trie, where each edge represents a single character.)

The longest common substring can be found by building the structure for one string and then searching it with the other.

Another approach is dynamic programming.

This algorithm uses a two-dimensional table dp, where dp[i][j] records the length of the longest common substring ending at position i of the first string and position j of the second. If the characters match, dp[i][j] = dp[i-1][j-1] + 1, otherwise 0; the answer is the maximum value in the table, as the sketch below shows.
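A minimal Java sketch of the O(m*n) dynamic-programming solution (the method name is illustrative):

static String longestCommonSubstring(String s1, String s2) {
    int[][] dp = new int[s1.length() + 1][s2.length() + 1];
    int maxLen = 0;
    int endIdx = 0; // exclusive end index of the best match in s1
    for (int i = 1; i <= s1.length(); i++) {
        for (int j = 1; j <= s2.length(); j++) {
            if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                dp[i][j] = dp[i - 1][j - 1] + 1; // extend the common suffix
                if (dp[i][j] > maxLen) {
                    maxLen = dp[i][j];
                    endIdx = i;
                }
            }
        }
    }
    return s1.substring(endIdx - maxLen, endIdx);
}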

Regardless of the algorithm used, performance can be improved by using parallel computing.

For example, a string can be split into substrings and the common string between the substrings computed in parallel on multiple processors. This approach can greatly increase the computational speed and scales well to the case of dealing with large amounts of data.

At a high level, the dynamic-programming solution works in the following steps:

  1. Build an (m+1) x (n+1) table dp, where dp[i][j] is the length of the longest common substring ending at s1[i-1] and s2[j-1].
  2. For each pair of positions, if s1[i-1] == s2[j-1], set dp[i][j] = dp[i-1][j-1] + 1; otherwise dp[i][j] = 0.
  3. While filling the table, track the maximum value and the position where it occurs.
  4. The longest common substring is the substring of that maximum length ending at the recorded position.

The following is a Java code example that uses a suffix trie (a simplified suffix tree) to find the longest common substring:

import java.util.HashMap;
import java.util.Map;

public class SuffixTree {

    private final Node root;

    // Build a suffix trie containing every suffix of s. Every substring of s
    // is then a path in the trie starting at the root.
    public SuffixTree(String s) {
        root = new Node();
        for (int i = 0; i < s.length(); i++) {
            insertSuffix(s.substring(i));
        }
    }

    private void insertSuffix(String suffix) {
        Node node = root;
        for (char ch : suffix.toCharArray()) {
            if (!node.containsKey(ch)) {
                node.put(ch, new Node());
            }
            node = node.get(ch);
        }
    }

    // For each start position in the other string, walk down the trie as far
    // as the characters keep matching; the longest such walk is the longest
    // common substring.
    public String findLCS(String other) {
        String lcs = "";
        for (int start = 0; start < other.length(); start++) {
            Node node = root;
            int end = start;
            while (end < other.length() && node.containsKey(other.charAt(end))) {
                node = node.get(other.charAt(end));
                end++;
            }
            if (end - start > lcs.length()) {
                lcs = other.substring(start, end);
            }
        }
        return lcs;
    }

    private static class Node {

        private final Map<Character, Node> children = new HashMap<>();

        public void put(char ch, Node node) {
            children.put(ch, node);
        }

        public boolean containsKey(char ch) {
            return children.containsKey(ch);
        }

        public Node get(char ch) {
            return children.get(ch);
        }
    }
}

Using this suffix trie implementation, the longest common substring of two strings can be found like this:

String s1 = "abcdefg";
String s2 = "bcdefgh";
SuffixTree suffixTree = new SuffixTree(s1);
String lcs = suffixTree.findLCS(s2);
System.out.println(lcs); // prints "bcdefg"

A true suffix tree built with a linear-time construction such as Ukkonen's algorithm yields an O(m+n) solution, where m and n are the lengths of the two strings. The simplified suffix trie above takes O(m^2) time and space to build and roughly O(n*L) to query (L being the answer length), which is still fast for typical inputs; for very large strings, use a compressed suffix tree or a suffix automaton.

3. RPC framework

1) What is an RPC framework? In the RPC framework, how is the RPC communication protocol designed?

RPC (Remote Procedure Call) is a communication protocol that lets a program invoke a procedure in another process, usually on another machine, as if it were a local call, without the programmer explicitly writing the network code. An RPC framework is a software framework that implements the RPC protocol and provides an easy way to communicate across the network.

Complete RPC Architecture Diagram

In the RPC framework, the design of the RPC communication protocol usually includes the following aspects:

  1. Transmission protocol: The RPC framework needs to select a reliable transmission protocol to ensure data transmission. Commonly used transport protocols are TCP and UDP.
  2. Serialization protocol: The RPC framework needs to choose a serialization protocol to serialize data into a binary format for transmission over the network. Commonly used serialization protocols include JSON, XML, Protobuf, etc.
  3. Service registration and discovery: The RPC framework needs to provide service registration and discovery functions so that clients can find available service providers. Commonly used service registration and discovery tools include Zookeeper, Consul, etc.
  4. Load balancing: The RPC framework needs to provide load balancing functions so that requests can be distributed to different service providers in a balanced manner. Commonly used load balancing algorithms include round robin, random, etc.
  5. Security authentication: The RPC framework needs to provide the function of security authentication to ensure the security of communication. Common security authentication methods include SSL and Token authentication.
  6. Peer nodes: RPC frameworks often need to support multiple peer nodes, each of which can provide remote services. Nodes can exchange messages through network communication and pass responses between peer nodes.
  7. Call process: The RPC framework usually provides a set of call processes so that programmers can make remote calls conveniently. The call process usually includes parts such as request, response and error handling.

When designing the RPC communication protocol, factors such as communication reliability, efficiency, and security need to be considered, and an appropriate protocol needs to be selected according to the specific application scenario.
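As an illustration of the protocol-design points above, here is a hedged sketch of what a custom RPC message header might contain (all field names are illustrative, not from any specific framework):

public class RpcHeader {

    short magic;        // magic number identifying the protocol
    byte version;       // protocol version, for compatibility checks
    byte serializer;    // codec id, e.g. 0 = JSON, 1 = Protobuf
    byte messageType;   // request / response / heartbeat
    long requestId;     // correlates a response with its request
    int bodyLength;     // length of the serialized payload that follows
}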

2) Designing an RPC framework, what issues need to be considered

Designing an RPC framework requires consideration of the following issues:

  1. Communication protocol: Choose an appropriate communication protocol to realize the communication between the client and the server. Commonly used communication protocols include HTTP, TCP, UDP, etc., and serialization protocols, such as JSON or Protobuf.
  2. Service registration and discovery: It is necessary to implement a service registration and discovery mechanism so that the client can automatically discover available service providers.
  3. Load balancing: It is necessary to implement a load balancing mechanism to ensure that requests can be distributed to different service providers in a balanced manner. Commonly used load balancing algorithms include round robin, random, and least number of connections.
  4. Security authentication: In order to ensure the security of the system, authentication and permission control are required for requests to prevent malicious attacks and illegal operations. Common security authentication methods include token-based authentication and SSL/TLS-based encrypted communication.
  5. Exception handling: It is necessary to consider the exception handling mechanism, such as the handling of network exceptions and timeouts.
  6. High availability: It is necessary to consider how to ensure the high availability of the system, such as implementing mechanisms such as service degradation and fault tolerance.
  7. Performance optimization: In the case of high concurrency, it is necessary to optimize the performance of the RPC framework to ensure that requests can be responded quickly. Commonly used performance optimization methods include mechanisms such as caching, asynchronous processing, and thread pools.
  8. Logging and monitoring: Logging and monitoring mechanisms need to be implemented in order to detect and resolve problems in a timely manner.
  9. Compatibility: Compatibility issues between different languages ​​and platforms need to be considered, such as the mechanism for implementing cross-language calls.
  10. Scalability: It is necessary to consider how to realize the scalability of the system, such as supporting dynamic addition and deletion of service providers.

3) In the RPC framework, the comparison and analysis of advantages and disadvantages of serialization algorithms

In the RPC framework, the serialization algorithm is a very important part, which directly affects the performance and scalability of the system. The following is a comparative analysis of several common serialization algorithms:

1. Java native serialization :

It is the serialization method that comes with Java. It can serialize objects into byte streams and deserialize byte streams into objects.

Advantages:

  • Easy to use, no additional dependencies required

Disadvantages:

  • The serialized byte stream is large, and the performance of serialization and deserialization is poor
  • Can only be used on the Java platform

2. JSON serialization :

It is a lightweight data exchange format that can serialize objects into JSON strings and deserialize JSON strings into objects.

Advantages:

  • JSON is a lightweight data exchange format. The serialized data is small and easy to read and write.
  • JSON supports nested objects and arrays, making it easy to represent complex data structures.
  • JSON can be transmitted via the HTTP protocol and is suitable for web applications.

Disadvantages:

  • Serialization and deserialization are slower than binary formats, binary data is not supported, and it is best suited to relatively simple structures such as JavaBeans.
  • The parsing speed of JSON is relatively slow, and there may be performance problems for the processing of large amounts of data.

3. XML serialization :

XML serialization is a serialization method based on XML format, which can serialize objects into XML strings and deserialize XML strings into objects.

Advantages:

  • XML is a common data format that can be used for data exchange between various applications.
  • Serialized data is easy to read and debug, and supports complex data structures
  • XML supports namespaces and attributes to facilitate the representation of complex data structures.
  • XML can be transmitted through the HTTP protocol, suitable for Web applications.

Disadvantages:

  • Compared with JSON, XML is more verbose, the data is larger, and the efficiency of serialization and deserialization is lower.
  • XML does not support binary data and cannot directly serialize and deserialize binary data.

4. Protobuf serialization :

Protobuf is an efficient binary serialization protocol that can serialize objects into binary data and deserialize binary data into objects.

Advantages:

  • Protobuf supports fast serialization and deserialization of data, the performance is very high, and the serialized data is small.
  • Protobuf supports multiple programming languages, including C++, Java, Python, etc., and supports cross-language calls.
  • Protobuf can define custom message types to facilitate the representation of complex data structures.

Disadvantages:

  • Compared with JSON and XML, Protobuf is more complex, with a higher learning and tooling cost.
  • Protobuf payloads are binary and not human-readable, which makes debugging harder and is less convenient for browser-facing web applications.
  • A .proto IDL file must be defined in advance; the schema is compiled rather than built dynamically at runtime (although new fields can be added in a backward-compatible way).

To sum up, different serialization algorithms have their own advantages and disadvantages, and a suitable serialization algorithm needs to be selected according to the specific application scenario.

If you need high-performance serialization and deserialization of binary data, you can choose Protobuf;

If you need a serialization format that is easy to read and debug, you can choose JSON or XML;

If you need to use the RPC framework in your web application, you can choose JSON;

If data exchange between multiple applications is required, XML may be more suitable;

If you need a serialization method natively supported by the Java platform, you can choose Java native serialization.

4) Have you learned about gRPC? What is the principle of gRPC

gRPC is a high-performance, open-source and general-purpose RPC framework developed by Google.

It uses Protocol Buffers as an interface description language, which can be used in many languages, including Java, Python, C++, etc. gRPC supports multiple transport protocols and serialization protocols, and can be used in different environments, such as cloud, mobile devices, browsers, etc.

gRPC is built on HTTP/2 and the Protobuf protocol, using Protobuf serialization and deserialization to implement remote procedure calls. HTTP/2 contributes multiplexed streams over a single connection, header compression, and bidirectional streaming.

protobuf is a lightweight, efficient, and extensible data serialization format developed by Google and open sourced. It allows transferring and parsing data between different platforms and languages, supports type definition and version control, and has the advantages of high data compression efficiency and fast serialization and deserialization speed.

gRPC implements a remote procedure call by establishing a Protobuf serialization/deserialization channel between client and server: the client serializes the request into a byte stream and sends it to the server; the server deserializes the request, executes the method, serializes the response, and sends it back to the client, which deserializes it. gRPC also supports load balancing, service discovery, authentication and authorization, and monitoring and logging, which improve the reliability and scalability of the RPC framework.
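A minimal Java client sketch, assuming a Greeter service whose GreeterGrpc / HelloRequest / HelloReply classes have been generated by protoc from a hypothetical greeter.proto (those generated names are assumptions and are not defined here):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GrpcClientDemo {

    public static void main(String[] args) {
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("localhost", 50051)
                .usePlaintext() // HTTP/2 without TLS, for local testing only
                .build();

        // Blocking stub generated by protoc; one RPC rides on one HTTP/2 stream.
        GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);
        HelloReply reply = stub.sayHello(HelloRequest.newBuilder().setName("Nien").build());
        System.out.println(reply.getMessage());

        channel.shutdown();
    }
}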

4. Have you heard about Dubbo? What is the principle of Dubbo

Dubbo is a high-performance, lightweight distributed service framework developed by Alibaba Group.

It follows the core ideas of distributed service frameworks, provides an RPC-based service governance solution, supports multiple protocols and registries, makes it easy to implement a microservice architecture, and helps developers quickly build distributed applications.

The principle of Dubbo is based on Java's remote call framework, using Java's reflection mechanism and dynamic proxy technology.

It follows the SOA (Service-Oriented Architecture) idea: business logic is encapsulated as services, which are then invoked remotely over an RPC protocol. Dubbo's default protocol uses a compact binary serialization (Hessian2 by default) over a single long-lived TCP connection. Requests are distributed across multiple providers by a client-side load balancer, and high availability and stability are ensured through timeout mechanisms and retry strategies.

Dubbo also provides a variety of load balancing strategies and routing strategies, which can be configured according to different scenarios.

Dubbo also supports a variety of registration centers, including Zookeeper, Redis, and Multicast, etc., which can realize automatic registration and discovery of services. In addition, Dubbo also provides a wealth of monitoring and management functions, which can easily monitor and manage services.

The principle of Dubbo can be summarized as:

  1. Define the interface: Use the Java interface to define the service interface, including information such as request and response message formats, parameters, and return value types.
  2. Create proxies: Dubbo generates a dynamic proxy for the service interface at runtime; unlike gRPC, no separate code-generation step is required.
  3. Implement the service: implement the service interface on the server side, and add error handling and other functions as needed.
  4. Configure the registration center: configure Dubbo's registration center, such as Zookeeper or Nacos, to manage the registration and discovery of services.
  5. Start the service provider: start the Dubbo service on the service provider, listen to the specified port, and wait for the client's request.
  6. Send request: the client calls a method on the proxy interface, which serializes the invocation and sends the request message to a provider.
  7. Load balancing: Dubbo's client-side cluster layer selects an appropriate provider according to the configured load-balancing strategy.
  8. Decode the request: the provider decodes the message into the corresponding method invocation.
  9. Execute the method: the provider executes the implementation and serializes the result into a response message.
  10. Decode the response: the client decodes the response message and returns the result to the caller.

Since Dubbo uses efficient communication protocols and load balancing algorithms, it has high performance and reliability. In addition, Dubbo also supports cluster deployment, dynamic proxy and other functions, which can meet the needs of different scenarios.
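A minimal sketch of exposing and referencing a Dubbo service with annotations (Dubbo 3.x style; the service and class names are illustrative):

import org.apache.dubbo.config.annotation.DubboReference;
import org.apache.dubbo.config.annotation.DubboService;

// A plain Java interface defines the service contract.
interface GreetingService {
    String greet(String name);
}

// Provider side: @DubboService exposes the implementation and registers it
// with the configured registry (e.g. Zookeeper or Nacos).
@DubboService
class GreetingServiceImpl implements GreetingService {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Consumer side: @DubboReference injects a dynamic proxy that performs the
// remote call; serialization, load balancing, and retries are handled by Dubbo.
class GreetingClient {
    @DubboReference
    private GreetingService greetingService;
}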

5. Briefly introduce the principle of Hystrix, how does he realize the fusing

Hystrix is an open-source fault-tolerance and latency-tolerance library developed by Netflix.

It protects high-concurrency systems through isolation (thread pools or semaphores), timeouts, circuit breaking, and fallbacks, preventing a single slow or failing dependency from exhausting resources and dragging down the whole system. It helps developers deal with latency and failures in distributed systems and improves system availability and stability.

Hystrix high-level diagram

Hystrix is based on the circuit breaker pattern: it monitors the latency and error rate of service calls and trips the breaker automatically when a preset threshold is reached.

When a service call fails or times out, Hystrix automatically switches to a fallback (a standby service or a preset default value), avoiding cascading failures. Hystrix also provides real-time monitoring and statistics to help developers understand the system's running state and performance bottlenecks.

Hystrix implements circuit breaking as follows:

  1. When a service call fails or times out, Hystrix records the event and evaluates it against preset thresholds over a rolling window.
  2. If failures or timeouts reach the threshold, Hystrix opens the circuit breaker and stops calling the service.
  3. While the breaker is open, Hystrix short-circuits requests to the fallback (a standby service or a preset default value).
  4. After a sleep window elapses, Hystrix lets a trial request through (half-open); if the call succeeds the breaker closes, otherwise it stays open.

Through the circuit breaker mechanism, Hystrix avoids cascading service failures and improves the availability and stability of the system.
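A minimal HystrixCommand sketch (the remote call remoteUserService is a hypothetical placeholder):

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class UserCommand extends HystrixCommand<String> {

    private final String userId;

    public UserCommand(String userId) {
        super(HystrixCommandGroupKey.Factory.asKey("UserGroup"));
        this.userId = userId;
    }

    @Override
    protected String run() {
        // The guarded remote call; failures and timeouts here feed the breaker.
        return remoteUserService(userId);
    }

    @Override
    protected String getFallback() {
        // Returned when run() fails, times out, or the circuit is open.
        return "default-user";
    }

    private String remoteUserService(String userId) {
        throw new RuntimeException("remote service unavailable"); // placeholder
    }
}

// Usage: new UserCommand("42").execute() returns "default-user" on failure.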

6. If you were asked to design a circuit breaker, how would you design its tripping logic?

Designing a circuit breaker's tripping logic needs to consider the following aspects:

1. Trip threshold : the threshold should be configurable and set according to the actual situation of the system, and it should be dynamically adjustable under changing load.

2. Trip conditions : define exactly what trips the breaker, for example the error rate reaching a certain threshold, or the request timeout rate reaching a certain threshold, measured over a sliding window.

3. Breaker states : the breaker needs three states: closed, open, and half-open. In the closed state requests pass through normally; in the open state requests are rejected immediately; in the half-open state the breaker lets part of the traffic through, closing again if those requests succeed and reopening otherwise.

4. Open (sleep) window : define how long the breaker stays open. During this window all requests are intercepted and failed requests are recorded for later analysis, and the window should be short enough that the system can return to normal quickly.

5. Recovery phase : define a recovery period during which the breaker is half-open and sends a portion of requests through; success closes the breaker, failure reopens it. After the recovery period, the breaker returns to the closed state.

6. Fast-fail check : when the server receives a request, it first checks whether the breaker is currently open. If so, it immediately returns a preset error message or default response; if not, it continues with normal processing.

7. Trip triggers : the breaker should support tripping on multiple signals, such as exceeding an error threshold or exceeding a specified response time.

8. Monitoring and alerting : the monitoring system watches service availability and performance in real time. Once the service becomes abnormal or overloaded, the breaker is tripped immediately and an alert is sent to the administrator.

9. Retry after tripping : the breaker should support automatic retries after tripping so the system can return to normal as soon as possible. If the service itself is healthy but a request failed for transient reasons such as the network, a retry mechanism can resend the request until it succeeds.

10. Error handling : the breaker should handle errors gracefully; for example, when an exception occurs it should degrade cleanly rather than simply rerouting the request to another system.

11. Reliability : the breaker itself must be reliable. Even when the system misbehaves, the breaker should keep working to avoid a crash.

12. Dynamic tuning : adjust parameters such as thresholds and retry policy dynamically based on the system's actual behavior and user feedback, to improve stability and reliability.

A circuit breaker is only a protection mechanism; it is not a complete replacement for fault tolerance and resilience.

Therefore, alongside the breaker you should also consider other measures, such as adding caching and optimizing algorithms, to improve the performance and reliability of the system. A minimal state-machine sketch follows.
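A minimal, hedged sketch of the three-state logic described above (thresholds and timings are illustrative; a production breaker would use a sliding window of statistics rather than a plain counter):

import java.util.concurrent.atomic.AtomicInteger;

public class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;    // consecutive failures before opening
    private final long openTimeoutMillis;  // how long to reject before probing
    private final AtomicInteger failures = new AtomicInteger();
    private volatile State state = State.CLOSED;
    private volatile long openedAt;

    public SimpleCircuitBreaker(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    public synchronized boolean allowRequest() {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= openTimeoutMillis) {
                state = State.HALF_OPEN; // let one trial request through
                return true;
            }
            return false; // fail fast while open
        }
        return true; // CLOSED or HALF_OPEN
    }

    public synchronized void onSuccess() {
        failures.set(0);
        state = State.CLOSED;
    }

    public synchronized void onFailure() {
        if (state == State.HALF_OPEN
                || failures.incrementAndGet() >= failureThreshold) {
            state = State.OPEN;
            openedAt = System.currentTimeMillis();
        }
    }
}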

7. What are Redis cache penetration, breakdown, and avalanche, and how to solve them?

Redis cache penetration, cache breakdown, and cache avalanche are all common cache problems, and different solutions need to be adopted for different problems.

1. Cache penetration

Cache penetration refers to querying data that does not exist. Since the data is in neither the cache nor the database, every such request penetrates to the database and puts pressure on it. The methods to solve the cache penetration problem are:

  • Bloom filter: Use the Bloom filter to filter the request. If the requested data does not exist, it will be returned directly to avoid the request from penetrating to the database.
  • Cache empty objects: Cache non-existent data, set expiration time, return empty data directly from the cache in the next request, and avoid request penetration to the database.

2. Cache breakdown

Cache breakdown means that a hot data expires or is cleared in the cache, causing a large number of requests to access the database at the same time, thus putting pressure on the database. The methods to solve the cache breakdown problem are:

  • Hotspot data never expires: Set hotspot data to never expire to prevent cache invalidation from causing requests to penetrate to the database.
  • Locking: When the cache is invalid, use distributed locks or mutex locks to avoid a large number of requests from accessing the database at the same time.
  • Current limiting: Use the current limiting algorithm to limit requests to avoid a large number of requests from accessing the database at the same time.

3. Cache Avalanche

A cache avalanche means that a large amount of data in the cache becomes invalid at the same time, causing a large number of requests to access the database at the same time, thereby putting pressure on the database. The methods to solve the cache avalanche problem are:

  • Random data expiration time: set the expiration time of cached data to a randomized value to avoid a large number of keys becoming invalid simultaneously (see the sketch after this list).
  • Data preheating: when the system starts, load hotspot data into the cache to avoid a large number of requests to access the database when the cache is invalid.
  • Distributed deployment: Deploy the cache on multiple nodes to avoid a cache avalanche caused by a node failure.
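A hedged Jedis sketch combining two of the fixes above, caching empty objects and randomizing TTLs (loadFromDb is a hypothetical database loader):

import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class CacheHelper {

    private static final String NULL_MARKER = "<null>";
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String get(String key) {
        String cached = jedis.get(key);
        if (cached != null) {
            // Penetration fix: a cached empty marker means "known missing".
            return NULL_MARKER.equals(cached) ? null : cached;
        }
        String dbValue = loadFromDb(key); // hypothetical DB loader
        // Avalanche fix: base TTL plus a random offset so keys don't
        // all expire at the same moment.
        int ttlSeconds = 600 + ThreadLocalRandom.current().nextInt(120);
        jedis.setex(key, ttlSeconds, dbValue == null ? NULL_MARKER : dbValue);
        return dbValue;
    }

    private String loadFromDb(String key) {
        return null; // placeholder for the real database query
    }
}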

In short, solving Redis cache failure problems requires weighing multiple factors, including cache design, access patterns, data structures, and algorithms. In practice, choose appropriate techniques and tune them according to the specific situation.

8. How to implement Redis distributed lock? What issues need to be considered

A Redis distributed lock can be implemented with the SETNX (set if not exists) semantics plus an expiration time.

Specifically, relying on Redis's single-threaded command execution, a client sets the lock key with SETNX; a return value of 1 means the lock was acquired, otherwise another client holds it. An expiration time must be attached so that a crashed holder cannot keep the lock forever. Note that issuing SETNX and EXPIRE as two separate commands is not atomic: if the client dies between them, the lock never expires. The standard approach is the single atomic command SET key value NX PX <millis>, storing a unique token as the value so that only the owner can release the lock, as sketched below.
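A minimal Jedis sketch of atomic acquisition and safe release (key and token values are illustrative):

import java.util.Collections;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLock {

    // Check-and-delete in one Lua script so we never delete a lock that has
    // expired and been re-acquired by another client.
    private static final String RELEASE_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else " +
            "  return 0 " +
            "end";

    public static boolean tryLock(Jedis jedis, String key, String token, long ttlMillis) {
        // NX: only set if absent; PX: auto-expire to avoid deadlock.
        return "OK".equals(jedis.set(key, token, SetParams.setParams().nx().px(ttlMillis)));
    }

    public static boolean unlock(Jedis jedis, String key, String token) {
        Object result = jedis.eval(RELEASE_SCRIPT,
                Collections.singletonList(key), Collections.singletonList(token));
        return Long.valueOf(1L).equals(result);
    }
}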

Questions to consider include:

  1. Lock granularity: The lock granularity should be as small as possible to avoid lock competition and deadlock.
  2. Lock timeout period: The lock timeout period should be set according to business needs, so as to avoid the lock being occupied and not released.
  3. Lock reentrancy: If the same lock is acquired multiple times in the same thread, it should be guaranteed that the lock can be successfully acquired instead of waiting all the time.
  4. Lock release: When releasing the lock, you should first determine whether the lock belongs to the current client, so as to avoid releasing the locks of other clients by mistake.
  5. Lock fault tolerance: When acquiring a lock, factors such as network delay should be considered to avoid the failure of the entire business process due to a failure to acquire a lock once.
  6. Implementation methods of locks: There are many implementation methods of Redis distributed locks, such as using Lua scripts and Redlock algorithms, etc. You need to choose the most appropriate implementation method according to the actual situation.
  7. Data consistency in a distributed environment: multiple nodes may request the lock at the same time, so the lock must still work correctly across the cluster and keep the lock's data consistent in the distributed environment.
  8. High concurrency: Distributed locks need to support high concurrency access and ensure that they can still work correctly under high concurrency conditions.
  9. Deadlock risk: in a distributed environment, nodes may hold some locks while waiting for others, so it is necessary to consider how to avoid deadlocks.
  10. Authorization control: Distributed locks need to support authorization control, that is, to be able to control which nodes can obtain locks, and to implement lock revocation on the basis of authorization control.
  11. Fault recovery: Distributed locks need to support fault recovery, that is, in the case of node failure or network abnormality, the lock can automatically resume normal work.
  12. Performance: Distributed locks need to consider performance issues, such as lock response time, lock granularity, etc.

To sum up, implementing Redis distributed locks needs to consider many aspects, including data consistency, high concurrency, deadlock risk, reentrancy, authorization control, failure recovery, and performance.

9. What other uses does Zookeeper have besides the registration center?

In addition to being a registration center, Zookeeper also has the following uses:

  1. Configuration management: Zookeeper can be used to store and manage the configuration information of the application. When the configuration changes, the application can be updated accordingly through Zookeeper.
  2. Distributed locks: Zookeeper can be used to implement distributed locks to avoid the problem of multiple clients accessing shared resources at the same time.
  3. Distributed queue: Zookeeper can be used to implement a distributed queue for coordinating task scheduling among multiple nodes.
  4. Cluster management: Zookeeper can be used to manage node information in the cluster, such as node status and health status.
  5. Distributed coordination: Zookeeper can be used to achieve distributed coordination, such as election algorithms, distributed transactions, etc.
  6. Storing status information: Zookeeper can be used to store status information, such as the status of orders, the quantity of inventory, etc.
  7. Realize service discovery: Zookeeper can be used to realize service discovery, for example, in a distributed system, Zookeeper can be used to find the address of the service.
  8. Supports dynamic addition and removal of nodes: Zookeeper can support dynamic addition and removal of nodes, which makes it a very flexible and extensible tool.
  9. Support load balancing: Zookeeper can be used to achieve simple load balancing. For example, in a distributed system, Zookeeper can be used to achieve load balancing for different services.
  10. Naming service: Zookeeper can provide a globally unique naming service, so that different applications can access the same resource by name.

In short, Zookeeper, as a distributed coordination service, can be used to solve various coordination problems in distributed systems and improve the availability, reliability and scalability of the system.

10. How is Zookeeper's distributed lock implemented?

Zookeeper's distributed lock is built on ephemeral sequential znodes combined with Zookeeper's Watch mechanism.

In Zookeeper, a client can register a Watch on a znode to be notified when that node changes or is deleted, so all interested clients learn about changes in a timely manner.

In the lock recipe, each client watches the znode whose sequence number immediately precedes its own. When that node is deleted (its owner released the lock or its session expired), the watcher is notified and re-checks whether it now holds the smallest sequence number.

Only the client whose znode has the smallest sequence number holds the lock; all other clients wait.

The process of Zookeeper implementing distributed locks can be divided into the following steps:

  1. Create an ephemeral sequential node: each client creates an ephemeral sequential node under the lock path on Zookeeper, for example /lock/lock-0000000001.
  2. Obtain all child nodes: the client lists all child nodes under /lock through Zookeeper's API and sorts them by sequence number in ascending order.
  3. Determine whether it holds the lock: if the client's node has the smallest sequence number among all children, the client holds the lock and can execute its business logic; otherwise, it watches the delete event of the node with the next smaller sequence number, and when that node is deleted it repeats steps 2 and 3 until the lock is obtained.
  4. Release the lock: after finishing its business logic, the client deletes its own node to release the lock (as sketched with Curator below).
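In practice this recipe is usually used through Apache Curator, whose InterProcessMutex implements exactly these steps; a minimal sketch (connection string and lock path are illustrative):

import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkLockDemo {

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Creates an ephemeral sequential node under /locks/order and waits
        // (watching its predecessor) until it holds the smallest sequence number.
        InterProcessMutex lock = new InterProcessMutex(client, "/locks/order");
        if (lock.acquire(3, TimeUnit.SECONDS)) {
            try {
                // critical section
            } finally {
                lock.release(); // deletes the node, waking the next waiter
            }
        }
        client.close();
    }
}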

It should be noted that Zookeeper also needs to consider the following issues when implementing distributed locks:

  1. Uniqueness of node names: If multiple clients create nodes with the same name at the same time, it may cause lock competition, and the uniqueness of node names needs to be guaranteed.
  2. Timing of node deletion: If a node is accidentally deleted while the client is executing business logic, it may cause other clients to acquire locks, and the timing of node deletion needs to be considered.
  3. Network delay and failure: If the network delay or Zookeeper node failure may cause the client to fail to acquire the lock, it is necessary to consider how to deal with these abnormal situations.

Zookeeper's distributed lock mechanism is very reliable and safe, because in the process of acquiring locks, all nodes need to be verified, and only nodes that meet certain conditions can acquire locks. At the same time, Zookeeper's distributed lock also supports dynamic locking and unlocking, which makes it a very flexible and scalable distributed lock tool.

In short, the process of implementing distributed locks in Zookeeper is relatively complicated, and many situations need to be considered. However, implementing distributed locks through Zookeeper can avoid the problem of multiple clients accessing shared resources at the same time, and improve the availability and stability of the system.

11. Talk about the principle of AQS

AQS (AbstractQueuedSynchronizer) is an abstract class in the java.util.concurrent.locks package, written by Doug Lea and introduced in JDK 5 as the foundation of most JUC locks and synchronizers.

AQS ensures that threads contending for a shared synchronizer are blocked, queued, and woken in an orderly, thread-safe way: in exclusive mode only one thread holds it at a time, while in shared mode several threads may hold it together.

AQS is the basic framework for implementing locks and synchronizers in Java. It provides a general mechanism for implementing blocking locks and related synchronizers, such as ReentrantLock, CountDownLatch, Semaphore, etc.

The core of AQS is a doubly linked list for storing waiting threads. Each node represents a waiting thread, and the node contains information such as the state of the thread, the waiting time, the predecessor node and the successor node. AQS uses CAS (Compare and Swap) operations to implement atomic update of state and thread blocking and wake-up.

The state of AQS is a variable of type int, which represents the state of the synchronizer. In Lock implementations, the state usually represents the lock holder or the number of lock reentries. In synchronizers such as CountDownLatch and Semaphore, the state represents the amount of resources available.

AQS provides two modes: exclusive mode, in which only one thread can acquire the synchronizer (e.g. ReentrantLock), and shared mode, in which multiple threads can acquire it at the same time (e.g. Semaphore and CountDownLatch). Both modes use the same CLH-variant FIFO queue to store waiting threads; a queue node is simply marked as EXCLUSIVE or SHARED.

The implementation of AQS is based on the template method design pattern, which defines some abstract methods, such as tryAcquire, tryRelease, tryAcquireShared, tryReleaseShared, etc., which are implemented by specific synchronizers. When using AQS to implement a synchronizer, we only need to inherit the AQS class and implement these abstract methods.

The basic acquisition and release flow of AQS is:

In exclusive mode, acquire(int) first calls the subclass's tryAcquire. If that fails, the current thread is wrapped in a queue node, appended to the tail of the wait queue with CAS, and suspended with LockSupport.park until its turn comes. release(int) calls the subclass's tryRelease; if the synchronizer is fully released, the head node's successor is woken with LockSupport.unpark so it can retry the acquisition.

Each queue node records the waiting thread, its wait status (for example SIGNAL or CANCELLED), and its predecessor and successor links; the FIFO order of the queue is what makes acquisition orderly.

In shared mode, acquireShared and releaseShared work analogously, except that a successful shared acquisition propagates the wakeup to subsequent shared nodes so multiple threads can proceed at once; this is how Semaphore permits and CountDownLatch's await are implemented.

Because all state transitions go through CAS on the int state and on the queue links, AQS achieves thread-safe blocking and waking without needing a lock of its own.

In short, AQS is the basic framework for implementing locks and synchronizers in Java, providing a general mechanism for blocking locks and related synchronizers. Its core is a doubly linked FIFO queue, with CAS operations providing atomic state updates and safe blocking and waking of threads. AQS offers an exclusive mode and a shared mode, plus a handful of template methods through which custom synchronizers are easily implemented, as the sketch below shows.
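The classic example (adapted from the AQS Javadoc) is a non-reentrant mutex built by overriding just the template methods:

import java.util.concurrent.locks.AbstractQueuedSynchronizer;

public class Mutex {

    private static class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int acquires) {
            // CAS state 0 -> 1: success means we now own the lock.
            return compareAndSetState(0, 1);
        }

        @Override
        protected boolean tryRelease(int releases) {
            setState(0); // free the lock; AQS then wakes the next queued thread
            return true;
        }

        @Override
        protected boolean isHeldExclusively() {
            return getState() == 1;
        }
    }

    private final Sync sync = new Sync();

    public void lock() { sync.acquire(1); }   // enqueue + park on contention
    public void unlock() { sync.release(1); } // unpark the successor
}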

12. Talk about how to implement fair locks and unfair locks in JUC

In JUC (Java Util Concurrent), fair locks and unfair locks are implemented differently.

A fair lock means that multiple threads acquire locks in the order in which they apply for locks. That is, the principle of first come, first served, the order in which threads acquire locks is allocated according to the order in which threads are locked.

The fair lock is implemented by maintaining a FIFO queue. When a thread requests the lock and finds other threads already waiting in the queue, it is added to the tail of the queue and waits for the threads ahead of it to acquire and release the lock before trying to acquire it itself.

Although the implementation of fair locks ensures the fairness of locks, the performance of fair locks will be lower than that of unfair locks in high-concurrency scenarios because the operations of locking and releasing locks require frequent queue operations.

Unfair lock means that the order in which multiple threads acquire locks is uncertain, and it is possible that the thread that applies later acquires the lock before the thread that applies first.

The implementation of unfair lock is to directly assign the lock to the currently applying thread when the lock is released, instead of adding the thread to the waiting queue first. This method can reduce the number of thread context switches and improve the performance of the lock. However, since the order of acquiring unfair locks is uncertain, some threads may not be able to acquire locks all the time, resulting in "starvation".

Java's built-in lock (synchronized) is nonfair: when the lock is released, any contending thread may grab it regardless of arrival order, and a newly arrived thread can barge in ahead of threads that have been waiting longer. It still guarantees mutual exclusion; what it does not guarantee is the order in which threads acquire the lock. Besides synchronized, Java provides other lock mechanisms in JUC, including ReentrantLock (a reentrant lock with an optional fairness mode) and ReadWriteLock (a read-write lock), which give finer control over ordering and concurrent access.
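With JUC's real ReentrantLock, fairness is just a constructor flag:

import java.util.concurrent.locks.ReentrantLock;

public class FairnessDemo {

    // true = fair: waiting threads acquire the lock in FIFO order.
    private static final ReentrantLock fairLock = new ReentrantLock(true);

    // default = nonfair: a newly arrived thread may barge ahead of waiters.
    private static final ReentrantLock nonfairLock = new ReentrantLock();

    public static void main(String[] args) {
        fairLock.lock();
        try {
            // critical section
        } finally {
            fairLock.unlock();
        }
    }
}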

To make the difference concrete, here are two simplified hand-written locks (teaching sketches only; JUC's real ReentrantLock is built on AQS):

Fair lock:

import java.util.LinkedList;
import java.util.Queue;

public class FairLock {

    private boolean isLocked = false;
    private Thread lockedBy = null;
    private final Queue<Thread> waiters = new LinkedList<>();

    public synchronized void lock() throws InterruptedException {
        Thread callingThread = Thread.currentThread();
        waiters.add(callingThread);
        // FIFO fairness: wait until the lock is free AND we are at the
        // head of the queue, so threads acquire in arrival order.
        while (isLocked || waiters.peek() != callingThread) {
            try {
                wait();
            } catch (InterruptedException e) {
                waiters.remove(callingThread); // give up our place in line
                throw e;
            }
        }
        waiters.remove();
        isLocked = true;
        lockedBy = callingThread;
    }

    public synchronized void unlock() {
        if (Thread.currentThread() != lockedBy) {
            throw new IllegalMonitorStateException("calling thread does not hold the lock");
        }
        isLocked = false;
        lockedBy = null;
        // Wake every waiter; only the new queue head passes the while-condition.
        notifyAll();
    }
}

In FairLock, lock() appends the calling thread to a FIFO queue and waits (wait()) until the lock is free and the thread is at the head of the queue; that queue is what enforces first-come-first-served ordering. unlock() verifies ownership, clears the lock, and wakes all waiters (notifyAll()), of which only the new queue head can proceed.

Unfair lock:

public class NonfairLock {

    private boolean isLocked = false;
    private Thread lockedBy = null;

    public synchronized void lock() throws InterruptedException {
        // No queue: whichever thread wins the race after notify() gets the
        // lock, so a newcomer may barge ahead of longer-waiting threads.
        while (isLocked) {
            wait();
        }
        isLocked = true;
        lockedBy = Thread.currentThread();
    }

    public synchronized void unlock() {
        if (Thread.currentThread() != lockedBy) {
            throw new IllegalMonitorStateException("calling thread does not hold the lock");
        }
        isLocked = false;
        lockedBy = null;
        notify(); // wake one waiter; it still races with newly arriving threads
    }
}

In NonfairLock there is no queue: whenever the lock is released, every waiting thread woken by notify() competes for it directly, and a thread that has only just arrived can grab the lock ahead of threads that have waited longer. That barging is exactly what makes the lock nonfair.

It should be noted that it is the nonfair mode that risks starving individual threads, while the fair mode trades some throughput for predictable ordering. The application's needs and performance requirements therefore need to be weighed carefully when choosing between them.

In short, fair locks and unfair locks are implemented differently. A fair lock guarantees fairness by maintaining a FIFO queue, while an unfair lock hands the lock directly to whichever thread requests it at release time, without guaranteeing fairness. Fair locks have lower throughput but guaranteed ordering; unfair locks have higher throughput but may starve some threads. In practice, select the lock type that fits the actual situation.

A final word:

In Nien's reader community (50+ groups), many members want to join a big company and earn a high salary.

The Nien team will keep working through real interview questions from major companies to sort out a learning path for you and show what you need to learn.

I used an article earlier to introduce you to a Didi real question:

" Accept a Didi Offer: From the three experiences of the guy, what do you need to learn? "

These real questions will be included in the most complete and continuously upgraded PDF e-book " Nin's Java Interview Collection " in history.

This article is included in the V69 edition of "Nin's Java Interview Collection".

Basically, if you thoroughly digest Nien's "Nien's Java Interview Collection", it is easy to get offers from big companies.

In addition, if there are big-company interviews you would like covered in the next issue, you can send a message to Nien.

The realization path of technical freedom PDF:

Realize your architectural freedom:

" Have a thorough understanding of the 8-figure-1 template, everyone can do the architecture "

" 10Wqps review platform, how to structure it? This is what station B does! ! ! "

" Alibaba Two Sides: How to optimize the performance of tens of millions and billions of data?" Textbook-level answers are coming "

" Peak 21WQps, 100 million DAU, how is the small game "Sheep a Sheep" structured? "

" How to Scheduling 10 Billion-Level Orders, Come to a Big Factory's Superb Solution "

" Two Big Factory 10 Billion-Level Red Envelope Architecture Scheme "

… more architecture articles, being added

Realize your responsive freedom:

" Responsive Bible: 10W Words, Realize Spring Responsive Programming Freedom "

This is the old version of " Flux, Mono, Reactor Combat (the most complete in history) "

Realize your spring cloud freedom:

" Spring cloud Alibaba Study Bible "

" Sharding-JDBC underlying principle and core practice (the most complete in history) "

" Get it done in one article: the chaotic relationship between SpringBoot, SLF4j, Log4j, Logback, and Netty (the most complete in history) "

Realize your linux freedom:

" Linux Commands Encyclopedia: 2W More Words, One Time to Realize Linux Freedom "

Realize your online freedom:

" Detailed explanation of TCP protocol (the most complete in history) "

" Three Network Tables: ARP Table, MAC Table, Routing Table, Realize Your Network Freedom!" ! "

Realize your distributed lock freedom:

" Redis Distributed Lock (Illustration - Second Understanding - The Most Complete in History) "

" Zookeeper Distributed Lock - Diagram - Second Understanding "

Realize your king component freedom:

" King of the Queue: Disruptor Principles, Architecture, and Source Code Penetration "

" The King of Cache: Caffeine Source Code, Architecture, and Principles (the most complete in history, 10W super long text) "

" The King of Cache: The Use of Caffeine (The Most Complete in History) "

" Java Agent probe, bytecode enhanced ByteBuddy (the most complete in history) "

Realize your interview-question freedom:

4000 pages of "Nin's Java Interview Collection" 40 topics

For PDF updates of the above Nien architecture notes and interview questions, ▼please visit the [Technical Freedom Circle] official account below to get them▼


Origin blog.csdn.net/crazymakercircle/article/details/130956376