2022 Interview: 200 Questions and Answers on Distributed Systems, Microservices, MySQL, Redis, JVM, Spring

200 questions and answers covering distributed systems, microservices, MySQL, Redis, JVM, Spring, etc.

The Markdown version with images is available at https://download.csdn.net/download/m0_47987937/86509554

What are the characteristics of Java object orientation, and how are they applied?

Object-oriented programming is a way of thinking about programs in terms of classes and objects. Everything can be categorized: a class is a high-level abstraction of things in the world, and different things have different relationships: the encapsulation relationship between a class and the outside world, the inheritance relationship between a parent class and its subclasses, and the polymorphic relationship between a class and multiple classes. Everything is an object, and an object is a concrete thing in the world. The three major characteristics of object orientation are encapsulation, inheritance, and polymorphism. Encapsulation describes the relationship between a class's behavior and attributes and other classes (low coupling, high cohesion); inheritance is the relationship between parent classes and subclasses; polymorphism refers to the run-time relationship between classes.

Encapsulation hides the internal implementation of a class, so the internal structure can change without affecting its users, and it also protects the data. Internal details are hidden from the outside world; only access methods are exposed. Encapsulating attributes means users can only access data through predefined methods, which makes it easy to add logic that prevents unreasonable operations on the attributes. Encapsulating methods means users call a method according to its contract without caring about its internal implementation, which makes the class easy to use, easy to modify, and more maintainable.

Inheritance derives new classes from existing classes. A new class can absorb the attributes and behaviors of an existing class and can also add new capabilities. In essence it is a special-to-general relationship, the often-mentioned is-a relationship: a subclass inheriting a parent class means the subclass is a special kind of the parent class and may have attributes or methods the parent class does not have. A base class is abstracted from several implementation classes so that it carries their common characteristics; when an implementation class inherits the base class (parent class) with the extends keyword, it gains those common properties. The class that inherits is called the subclass (or derived class), and the class being inherited is called the parent class (superclass or base class). For example, an Animal class can be abstracted from cats, dogs, and tigers, carrying the characteristics they have in common (eating, running, making sounds, etc.). Java implements inheritance through the extends keyword. Variables and methods declared private in the parent class are not inherited and cannot be manipulated directly in the subclass. Inheritance avoids repeating the description of the features shared by general and special classes; through inheritance, the scope to which each common feature applies is clearly expressed: the attributes and operations defined in the general class apply to the class itself and to all objects of every special class below it. Using inheritance makes the system model more concise and clearer.

Compared with encapsulation and inheritance, polymorphism is the most difficult of the three characteristics; encapsulation and inheritance ultimately serve polymorphism. Polymorphism refers to a relationship between classes: two classes have an inheritance relationship and there is method overriding, so a parent-class reference can point to a subclass object when a method is called. Polymorphism has three essential elements: inheritance, overriding, and a parent-class reference pointing to a subclass object.
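As a minimal illustration of these three elements (the class names here are invented for the example), a parent-class reference pointing to a subclass object picks the overridden method at run time:

```java
class Animal {
    void speak() { System.out.println("Animal speaks"); }
}

class Cat extends Animal {
    @Override
    void speak() { System.out.println("Meow"); }   // method overriding
}

public class PolymorphismDemo {
    public static void main(String[] args) {
        Animal a = new Cat();   // parent-class reference pointing to a subclass object
        a.speak();              // prints "Meow": the overriding method is chosen at run time
    }
}
```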

What is the principle of HashMap, and what is the difference between JDK 1.7 and 1.8?

HashMap stores data according to the hashCode of the key. In most cases a value can be located directly, so access is fast, but the traversal order is not deterministic. HashMap allows at most one record whose key is null and any number of records whose value is null. HashMap is not thread-safe: multiple threads may write to a HashMap at the same time, which can lead to inconsistent data. If you need thread safety, you can use Collections.synchronizedMap to wrap the HashMap, or use ConcurrentHashMap. The following pictures introduce the structure of HashMap.

**Java 7 implementation**

[Figure: Java 7 HashMap structure, an array of singly linked lists]

Broadly speaking, a HashMap is an array in which each element is a singly linked list. In the figure above, each green entity is an instance of the nested class Entry, and an Entry contains four fields: key, value, hash, and next (for the singly linked list).

  1. capacity: The current array capacity, which is always 2^n, can be expanded. After expansion, the array size is twice the current size.

  2. loadFactor: load factor, the default is 0.75.

  3. threshold: the threshold of expansion, equal to capacity * loadFactor
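As a quick sketch of how these three parameters interact (the capacity and load factor values below are just illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapParams {
    public static void main(String[] args) {
        // capacity 16, loadFactor 0.75f -> threshold = 16 * 0.75 = 12
        Map<String, Integer> map = new HashMap<>(16, 0.75f);
        for (int i = 0; i < 13; i++) {
            map.put("key" + i, i);   // the 13th put exceeds the threshold and doubles the table to 32
        }
        System.out.println(map.size());   // 13
    }
}
```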

**Java 8 implementation**

Java 8 made some changes to HashMap. The biggest difference is the use of a red-black tree, so the structure becomes array + linked list + red-black tree.

From the Java 7 HashMap description we know that a lookup can quickly locate the array index from the hash value, but after that it has to walk the linked list and compare elements one by one, so that part of the lookup is O(n) in the length of the list. To reduce this overhead, in Java 8, when the number of elements in a bucket's linked list exceeds 8, the list is converted into a red-black tree, and lookups in such buckets drop to O(log n).

[Figure: Java 8 HashMap structure, array + linked list + red-black tree]

What is the difference between ArrayList and LinkedList

Both ArrayList and LinkedList implement the List interface, and they have the following differences:
ArrayList is an index-based data structure whose underlying storage is an array; it supports random access to elements in O(1) time. LinkedList, by contrast, stores its data as a chain of elements, each linked to its previous and next element, so finding an element takes O(n) time.
Compared with ArrayList, LinkedList's insert, add, and delete operations are faster, because adding an element anywhere in the collection does not require resizing the backing array or updating indexes.
LinkedList uses more memory than ArrayList, because each node stores two extra references, one to the previous element and one to the next.
See also ArrayList vs. LinkedList.

  1. Because Array is an index-based data structure, it is very fast to search and read data in the array using the index. The time complexity of Array to get data is O(1), but to delete data is very expensive, because it needs to rearrange all the data in the array.

  2. Compared to ArrayList, LinkedList insertion is faster. Because LinkedList is not like ArrayList, there is no need to change the size of the array, and there is no need to reload all the data into a new array when the array is full. This is the worst case of ArrayList, and the time complexity is O(n), while the time complexity of inserting or deleting in LinkedList is only O(1). ArrayList also needs to update the index when inserting data (in addition to inserting the tail of the array).

  3. Similar to inserting data, LinkedList is also better than ArrayList when deleting data.

  4. LinkedList requires more memory, because each index position of an ArrayList holds the actual data, while each node of a LinkedList stores the actual data plus the positions of the previous and next nodes (a LinkedList instance stores two references, Node first and Node last, pointing to the first and last nodes of the list, and each Node instance stores three values: E item, Node next, Node prev).
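A simplified sketch of the doubly linked node structure described above; the field names mirror java.util.LinkedList, but this is an illustrative re-creation rather than the actual JDK source:

```java
// Simplified sketch of a LinkedList node: the element plus links to its neighbours
class Node<E> {
    E item;        // the element stored at this position
    Node<E> next;  // the following node (null at the tail)
    Node<E> prev;  // the preceding node (null at the head)

    Node(Node<E> prev, E item, Node<E> next) {
        this.item = item;
        this.next = next;
        this.prev = prev;
    }
}
```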

In what scenario is it more appropriate to use LinkedList instead of ArrayList

  1. Your application does not access data randomly. Because if you need the nth element in LinkedList, you need to count sequentially from the first element to the nth data, and then read the data.

  2. Your application inserts and deletes more elements, and reads less data. Because inserting and deleting elements does not involve rearranging data, it is faster than ArrayList.

In other words, the implementation of ArrayList uses arrays, LinkedList is based on linked lists, ArrayList is suitable for searching, and LinkedList is suitable for adding and deleting

The above are the differences between ArrayList and LinkedList. Whenever you need non-synchronized, index-based data access, try to use ArrayList: it is fast and easy to use. But remember to give it an appropriate initial size to minimize resizing of the backing array.

What are the problems with collections in high concurrency

**The first generation of thread-safe collection classes**

Vector、Hashtable

How thread safety is ensured: their methods are modified with synchronized.

Cons: Inefficient

**Second-generation thread-unsafe collection classes**

ArrayList、HashMap

They are not thread-safe, but performance is good; they are used to replace Vector and Hashtable.

What should I do if I need thread safety when using ArrayList and HashMap?

Use Collections.synchronizedList(list); and Collections.synchronizedMap(m);

Under the hood these wrappers still lock all the code with a synchronized block, but the lock is taken inside the method rather than on the whole method, which can be understood as a slight performance improvement, since the method call itself still has to allocate resources.
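A minimal sketch of wrapping the unsafe collections as described above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SynchronizedWrappers {
    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());

        list.add("a");
        map.put("a", 1);

        // Iteration over the wrapped list must still be synchronized manually
        synchronized (list) {
            for (String s : list) {
                System.out.println(s);
            }
        }
        System.out.println(map.get("a"));   // 1
    }
}
```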

**Third-generation thread-safe collection classes**

How to improve the efficiency and safety of collections in the case of a large number of concurrency?

java.util.concurrent.*

ConcurrentHashMap:

CopyOnWriteArrayList :

CopyOnWriteArraySet (note: it is not a CopyOnWriteHashSet)

Most of them use Lock locks underneath (ConcurrentHashMap in JDK 1.8 no longer uses Lock, relying on CAS plus synchronized instead), which guarantees safety while keeping performance high.
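A small usage sketch of the java.util.concurrent collections listed above:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class ConcurrentCollections {
    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        List<String> log = new CopyOnWriteArrayList<>();

        Runnable task = () -> {
            counts.merge("hits", 1, Integer::sum);   // atomic update, no external lock needed
            log.add(Thread.currentThread().getName());
        };

        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        System.out.println(counts.get("hits"));   // 2
        System.out.println(log.size());           // 2
    }
}
```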

What are the new features of jdk1.8

1. The default method of the interface

Java 8 allows us to add a non-abstract method implementation to the interface, just use the default keyword. This feature is also called an extension method. Examples are as follows:

The code is as follows:

interface Formula {
    double calculate(int a);

    default double sqrt(int a) {
        return Math.sqrt(a);
    }
}

The Formula interface defines the sqrt method in addition to the calculate method. Subclasses that implement the Formula interface only need to implement a calculate method. The default method sqrt can be used directly on the subclass.

The code is as follows:

Formula formula = new Formula() {
    @Override
    public double calculate(int a) {
        return sqrt(a * 100);
    }
};

formula.calculate(100);    // 100.0
formula.sqrt(16);          // 4.0

The formula in this article is implemented as an instance of an anonymous class. The code is very easy to understand, and the calculation of sqrt(a * 100) is realized in 6 lines of code. In the next section, we'll see a simpler way of implementing single-method interfaces.

Translator's Note: There is only single inheritance in Java. If you want to give a class new features, you usually use an interface to implement it. In C++, multiple inheritance is supported, allowing a subclass to have multiple parent class interfaces and functions at the same time. In other languages, the method of making a class have other reusable code at the same time is called a mixin. This feature of the new Java 8 is closer to Scala's traits from the perspective of compiler implementation. There is also a concept called extension method in C#, which allows to extend methods to existing types, which is semantically different from Java 8.

2. Lambda expression

First, let's look at how strings were sorted in older versions of Java:

The code is as follows:

List<String> names = Arrays.asList("peterF", "anna", "mike", "xenia");

Collections.sort(names, new Comparator<String>() {
    @Override
    public int compare(String a, String b) {
        return b.compareTo(a);
    }
});

Just pass a List object and a comparator to the static method Collections.sort to arrange in the specified order. The usual practice is to create an anonymous comparator object and pass it to the sort method.

In Java 8, you don't need to use this traditional anonymous object method. Java 8 provides a more concise syntax, lambda expressions:

code show as below:

Collections.sort(names, (String a, String b) -> { return b.compareTo(a); });

See, the code is shorter and more readable, but it can actually be written even shorter:

code show as below:

Collections.sort(names, (String a, String b) -> b.compareTo(a));

For a function body with only one line of code, you can remove the curly braces {} and the return keyword, but you can also write it shorter:

code show as below:

Collections.sort(names, (a, b) -> b.compareTo(a));

The Java compiler can automatically deduce the parameter type, so you don't have to write the type again. Next, let's see what more convenient things lambda expressions can do:

3. Functional interface

How are lambda expressions represented in java's type system? Every lambda expression corresponds to a type, usually an interface type. And "functional interface" refers to an interface that contains only one abstract method, and every lambda expression of this type will be matched to this abstract method. Since default methods are not considered abstract methods, you can also add default methods to your functional interface.

We can treat a lambda expression as any interface type that contains only one abstract method. To make sure your interface meets this requirement, just add the @FunctionalInterface annotation to it; the compiler will report an error if an interface marked with this annotation contains more than one abstract method.

Examples are as follows:

The code is as follows:

@FunctionalInterface
interface Converter<F, T> {
    T convert(F from);
}

Converter<String, Integer> converter = (from) -> Integer.valueOf(from);
Integer converted = converter.convert("123");
System.out.println(converted);    // 123

It should be noted that if @FunctionalInterface is not specified, the above code is also correct.

Translator's note: mapping a lambda expression onto a single-method interface had already been implemented in other languages before Java 8, for example in the Rhino JavaScript interpreter: if a function parameter expects a single-method interface and you pass a function, the Rhino interpreter automatically creates an adapter from the function to an instance of that interface. A typical application scenario is the second parameter (EventListener) of addEventListener on org.w3c.dom.events.EventTarget.

4. Method and constructor references

The code in the previous section can also be represented by a static method reference:

The code is as follows:

Converter<String, Integer> converter = Integer::valueOf;
Integer converted = converter.convert("123");
System.out.println(converted);    // 123

Java 8 allows you to use the :: keyword to pass method or constructor references, the above code shows how to refer to a static method, we can also refer to an object method:

The code is as follows:

// 'something' is assumed here to be an instance of a class whose startsWith(String) method returns the first character as a String
Converter<String, String> converter = something::startsWith;
String converted = converter.convert("Java");
System.out.println(converted);    // "J"

Next, let's see how constructors are referenced using the :: keyword. First, we define a simple class that contains multiple constructors:

The code is as follows:

class Person {
    String firstName;
    String lastName;

    Person() {}

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

Next we specify an object factory interface for creating Person objects:

The code is as follows:

interface PersonFactory<P extends Person> {
    P create(String firstName, String lastName);
}

Here we use constructor references to link them instead of implementing a full factory:

The code is as follows:

PersonFactory<Person> personFactory = Person::new;
Person person = personFactory.create("Peter", "Parker");

We only need to use Person::new to obtain the reference of the Person class constructor, and the Java compiler will automatically select the appropriate constructor according to the signature of the PersonFactory.create method.

5. Lambda scope

Accessing the outer scope in a lambda expression is very similar to the old anonymous object. You can directly access outer local variables marked final, or instance fields and static variables.

6. Access local variables

We can access the outer local variables directly in the lambda expression:

code show as below:

final int num = 1; Converter<Integer, String> stringConverter = (from) -> String.valueOf(from + num);

stringConverter.convert(2); // 3

But unlike the anonymous object, the variable num here does not need to be declared as final, and the code is also correct:

code show as below:

int num = 1; Converter<Integer, String> stringConverter = (from) -> String.valueOf(from + num);

stringConverter.convert(2); // 3

However, the num here must not be modified by the following code (that is, it has implicit final semantics), for example, the following cannot be compiled:

code show as below:

int num = 1; Converter<Integer, String> stringConverter = (from) -> String.valueOf(from + num); num = 3;

Attempting to modify num in a lambda expression is also not allowed.

7. Access object fields and static variables

Different from local variables, the fields and static variables of the lambda are both readable and writable. The behavior is consistent with anonymous objects:

code show as below:

class Lambda4 { static int outerStaticNum; int outerNum;

void testScopes() { Converter<Integer, String> stringConverter1 = (from) -> { outerNum = 23; return String.valueOf(from); };

Converter<Integer, String> stringConverter2 = (from) -> { outerStaticNum = 72; return String.valueOf(from); }; } }

8. Accessing default interface methods

Remember the formula example in the first section, the interface Formula defines a default method sqrt that can be directly accessed by instances of formula including anonymous objects, but this cannot be done in lambda expressions. Default methods cannot be accessed in Lambda expressions, and the following code will not compile:

code show as below:

Formula formula = (a) -> sqrt(a * 100);

**Built-in functional interfaces**

The JDK 1.8 API contains many built-in functional interfaces. Some, such as the Comparator and Runnable interfaces familiar from older versions of Java, have been annotated with @FunctionalInterface so they can be used with lambdas. The Java 8 API also provides many new functional interfaces to make work more convenient; some of them come from the Google Guava library, and even if you already know them, it is still worth seeing how they have been extended for use with lambdas.

**Predicate interface**

The Predicate interface has only one parameter and returns boolean type. This interface contains a variety of default methods to combine Predicates into other complex logic (such as: and, or, not):

The code is as follows:

Predicate<String> predicate = (s) -> s.length() > 0;

predicate.test("foo");              // true
predicate.negate().test("foo");     // false

Predicate<Object> nonNull = Objects::nonNull;
Predicate<Object> isNull = Objects::isNull;

Predicate<String> isEmpty = String::isEmpty;
Predicate<String> isNotEmpty = isEmpty.negate();

Function interface

The Function interface takes one parameter and returns a result, with some default methods (compose, andThen) that can be combined with other functions:

code show as below:

Function<String, Integer> toInteger = Integer::valueOf; Function<String, String> backToString = toInteger.andThen(String::valueOf);

backToString.apply("123"); // "123"

**Supplier interface**

The Supplier interface produces a value of a given type. Unlike the Function interface, a Supplier takes no arguments.

The code is as follows:

Supplier<Person> personSupplier = Person::new;
personSupplier.get();    // new Person

**Consumer interface**

The Consumer interface represents an operation to be performed on a single input argument.

The code is as follows:

Consumer<Person> greeter = (p) -> System.out.println("Hello, " + p.firstName);
greeter.accept(new Person("Luke", "Skywalker"));

**Comparator interface**

Comparator is a classic interface from earlier versions of Java; Java 8 adds a variety of default methods to it:

The code is as follows:

Comparator<Person> comparator = (p1, p2) -> p1.firstName.compareTo(p2.firstName);

Person p1 = new Person("John", "Doe");
Person p2 = new Person("Alice", "Wonderland");

comparator.compare(p1, p2);             // > 0
comparator.reversed().compare(p1, p2);  // < 0

Optional interface

Optional is not a function but an interface. This is an auxiliary type used to prevent NullPointerException. This is an important concept that will be used in the next session. Now let’s briefly look at what this interface can do:

Optional is defined as a simple container whose value may be null or not. Before Java 8, a function should generally return a non-null object but occasionally it may return null. In Java 8, it is not recommended that you return null but return Optional.

The code is as follows:

Optional<String> optional = Optional.of("bam");

optional.isPresent();           // true
optional.get();                 // "bam"
optional.orElse("fallback");    // "bam"

optional.ifPresent((s) -> System.out.println(s.charAt(0)));    // "b"

Stream interface

A java.util.Stream represents a sequence of operations that can be applied to a set of elements one at a time. Stream operations are divided into intermediate operations or final operations. The final operation returns a specific type of calculation result, while the intermediate operation returns the Stream itself, so that you can chain multiple operations in sequence. The creation of Stream needs to specify a data source, such as a subclass of java.util.Collection, List or Set, which is not supported by Map. Stream operations can be performed serially or in parallel.

First, let's see how Stream is used. First, create the data List used in the example code:

The code is as follows:

List<String> stringCollection = new ArrayList<>();
stringCollection.add("ddd2");
stringCollection.add("aaa2");
stringCollection.add("bbb1");
stringCollection.add("aaa1");
stringCollection.add("bbb3");
stringCollection.add("ccc");
stringCollection.add("bbb2");
stringCollection.add("ddd1");

Java 8 extends the collection class, and a Stream can be created through Collection.stream() or Collection.parallelStream(). The following sections explain the commonly used Stream operations in detail:

**Filter**

Filtering uses a predicate interface to filter and keep only eligible elements. This operation is an intermediate operation, so we can apply other Stream operations (such as forEach) to the filtered results. forEach needs a function to execute sequentially on the filtered elements. forEach is a final operation, so we cannot perform other Stream operations after forEach.

The code is as follows:

stringCollection
    .stream()
    .filter((s) -> s.startsWith("a"))
    .forEach(System.out::println);

// "aaa2", "aaa1"

**Sort**

Sorting is an intermediate operation that returns a sorted Stream. If you don't specify a custom Comparator the default sort will be used.

The code is as follows:

stringCollection
    .stream()
    .sorted()
    .filter((s) -> s.startsWith("a"))
    .forEach(System.out::println);

// "aaa1", "aaa2"

It should be noted that sorting only creates a sorted Stream without affecting the original data source. After sorting, the original data stringCollection will not be modified:

code show as below:

System.out.println(stringCollection); // ddd2, aaa2, bbb1, aaa1, bbb3, ccc, bbb2, ddd1

**Map**

The intermediate operation map converts each element into another object according to the given Function. The following example converts each string into an upper-case string. map can also be used to convert elements into a different type; the element type of the Stream returned by map is determined by the return type of the function passed to map.

The code is as follows:

stringCollection
    .stream()
    .map(String::toUpperCase)
    .sorted((a, b) -> b.compareTo(a))
    .forEach(System.out::println);

// "DDD2", "DDD1", "CCC", "BBB3", "BBB2", "AAA2", "AAA1"

**Match**

Stream provides a variety of matching operations that allow checking whether a specified Predicate matches the entire Stream. All matching operations are final and return a boolean value.

The code is as follows:

boolean anyStartsWithA = stringCollection
    .stream()
    .anyMatch((s) -> s.startsWith("a"));

System.out.println(anyStartsWithA);    // true

boolean allStartsWithA = stringCollection
    .stream()
    .allMatch((s) -> s.startsWith("a"));

System.out.println(allStartsWithA);    // false

boolean noneStartsWithZ = stringCollection
    .stream()
    .noneMatch((s) -> s.startsWith("z"));

System.out.println(noneStartsWithZ);   // true

**Count**

Count is a final operation that returns the number of elements in the Stream; the return type is long.

The code is as follows:

long startsWithB = stringCollection
    .stream()
    .filter((s) -> s.startsWith("b"))
    .count();

System.out.println(startsWithB);    // 3

**Reduce**

This is a final operation that allows multiple elements in the stream to be reduced to one element through the specified function, and the result after the reduction is expressed through the Optional interface:

The code is as follows:

Optional<String> reduced = stringCollection
    .stream()
    .sorted()
    .reduce((s1, s2) -> s1 + "#" + s2);

reduced.ifPresent(System.out::println);    // "aaa1#aaa2#bbb1#bbb2#bbb3#ccc#ddd1#ddd2"

**Parallel Streams**

As mentioned earlier, there are two types of Stream, serial and parallel. The operations on the serial Stream are completed sequentially in one thread, while the parallel Stream is executed on multiple threads at the same time.

The following example shows how to improve performance by parallelizing Streams:

First we create a large list with no duplicate elements:

The code is as follows:

int max = 1000000;
List<String> values = new ArrayList<>(max);
for (int i = 0; i < max; i++) {
    UUID uuid = UUID.randomUUID();
    values.add(uuid.toString());
}

Then we calculate how long it takes to sort the Stream, serial sorting:

The code is as follows:

long t0 = System.nanoTime();

long count = values.stream().sorted().count();
System.out.println(count);

long t1 = System.nanoTime();

long millis = TimeUnit.NANOSECONDS.toMillis(t1 - t0);
System.out.println(String.format("sequential sort took: %d ms", millis));

// Serial sort time: 899 ms

Parallel sorting:

The code is as follows:

long t0 = System.nanoTime();

long count = values.parallelStream().sorted().count();
System.out.println(count);

long t1 = System.nanoTime();

long millis = TimeUnit.NANOSECONDS.toMillis(t1 - t0);
System.out.println(String.format("parallel sort took: %d ms", millis));

// Parallel sort time: 472 ms

The two snippets above are almost identical, yet the parallel version is roughly 50% faster. The only change needed is replacing stream() with parallelStream().

Map

As mentioned earlier, the Map type does not support streams, but Map provides some new and useful methods to handle some daily tasks.

The code is as follows:

Map<Integer, String> map = new HashMap<>();

for (int i = 0; i < 10; i++) {
    map.putIfAbsent(i, "val" + i);
}

map.forEach((id, val) -> System.out.println(val));

The code above is easy to understand: putIfAbsent saves us from writing an explicit existence check, and forEach takes a Consumer that is applied to each key-value pair.

The following example shows other useful functions on map:

The code is as follows:

map.computeIfPresent(3, (num, val) -> val + num);
map.get(3);             // val33

map.computeIfPresent(9, (num, val) -> null);
map.containsKey(9);     // false

map.computeIfAbsent(23, num -> "val" + num);
map.containsKey(23);    // true

map.computeIfAbsent(3, num -> "bam");
map.get(3);             // val33

Next, show how to delete an item whose key and value all match in the Map:

The code is as follows:

map.remove(3, "val3");
map.get(3);             // val33

map.remove(3, "val33");
map.get(3);             // null

Another useful method:

The code is as follows:

map.getOrDefault(42, "not found");    // not found

It is also easy to merge the elements of the Map:

The code is as follows:

map.merge(9, "val9", (value, newValue) -> value.concat(newValue));
map.get(9);    // val9

map.merge(9, "concat", (value, newValue) -> value.concat(newValue));
map.get(9);    // val9concat

What Merge does is to insert if the key name does not exist, otherwise, merge the value corresponding to the original key and reinsert it into the map.

9. Date API

Java 8 includes a new set of date and time APIs under the package java.time. The new date API is similar to the open source Joda-Time library, but not exactly the same. The following example shows some of the most important parts of this new API:

**Clock**

The Clock class provides access to the current date and time. Clock is time-zone aware and can be used in place of System.currentTimeMillis() to obtain the current time in milliseconds. A specific point on the timeline can also be represented by the Instant class, which in turn can be used to create legacy java.util.Date objects.

code show as below:

Clock clock = Clock.systemDefaultZone(); long millis = clock.millis();

Instant instant = clock.instant(); Date legacyDate = Date.from(instant); // legacy java.util.Date

**Timezones**

In the new API a time zone is represented by ZoneId, which can easily be obtained with the static of() method. A time zone defines the offset from UTC, which is extremely important when converting between Instant point-in-time objects and local date objects.

The code is as follows:

System.out.println(ZoneId.getAvailableZoneIds());    // prints all available timezone ids

ZoneId zone1 = ZoneId.of("Europe/Berlin");
ZoneId zone2 = ZoneId.of("Brazil/East");
System.out.println(zone1.getRules());
System.out.println(zone2.getRules());

// ZoneRules[currentStandardOffset=+01:00]
// ZoneRules[currentStandardOffset=-03:00]

**LocalTime**

LocalTime defines a time without time zone information, such as 10 PM, or 17:30:15. The following example creates two local times using the time zone created by the previous code. Then compare the times and calculate the time difference between the two times in hours and minutes:

code show as below:

LocalTime now1 = LocalTime.now(zone1); LocalTime now2 = LocalTime.now(zone2);

System.out.println(now1.isBefore(now2)); // false

long hoursBetween = ChronoUnit.HOURS.between(now1, now2); long minutesBetween = ChronoUnit.MINUTES.between(now1, now2);

System.out.println(hoursBetween); // -3 System.out.println(minutesBetween); // -239

LocalTime provides several factory methods to simplify object creation, including parsing time strings.

The code is as follows:

LocalTime late = LocalTime.of(23, 59, 59);
System.out.println(late);    // 23:59:59

DateTimeFormatter germanFormatter = DateTimeFormatter
    .ofLocalizedTime(FormatStyle.SHORT)
    .withLocale(Locale.GERMAN);

LocalTime leetTime = LocalTime.parse("13:37", germanFormatter);
System.out.println(leetTime);    // 13:37

**LocalDate**

LocalDate represents an exact date, such as 2014-03-11. The object value is immutable, and it is basically the same as LocalTime in use. The following example shows how to add and subtract days/months/years to a Date object. Also note that these objects are immutable and operations always return a new instance.

code show as below:

LocalDate today = LocalDate.now(); LocalDate tomorrow = today.plus(1, ChronoUnit.DAYS); LocalDate yesterday = tomorrow.minusDays(2);

LocalDate independenceDay = LocalDate.of(2014, Month.JULY, 4); DayOfWeek dayOfWeek = independenceDay.getDayOfWeek();

System.out.println(dayOfWeek);    // FRIDAY

Parsing a LocalDate from a string is just as easy as parsing a LocalTime:

The code is as follows:

DateTimeFormatter germanFormatter = DateTimeFormatter
    .ofLocalizedDate(FormatStyle.MEDIUM)
    .withLocale(Locale.GERMAN);

LocalDate xmas = LocalDate.parse("24.12.2014", germanFormatter);
System.out.println(xmas);    // 2014-12-24

**LocalDateTime**

LocalDateTime represents time and date at the same time, which is equivalent to merging the contents of the previous two sections into one object. LocalDateTime, like LocalTime and LocalDate, are immutable. LocalDateTime provides some methods to access specific fields.

code show as below:

LocalDateTime sylvester = LocalDateTime.of(2014, Month.DECEMBER, 31, 23, 59, 59);

DayOfWeek dayOfWeek = sylvester.getDayOfWeek(); System.out.println(dayOfWeek); // WEDNESDAY

Month month = sylvester.getMonth(); System.out.println(month); // DECEMBER

long minuteOfDay = sylvester.getLong(ChronoField.MINUTE_OF_DAY); System.out.println(minuteOfDay); // 1439

As long as the time zone information is attached, it can be converted into a point-in-time Instant object, and the Instant point-in-time object can be easily converted into an old-fashioned java.util.Date.

code show as below:

Instant instant = sylvester .atZone(ZoneId.systemDefault()) .toInstant();

Date legacyDate = Date.from(instant); System.out.println(legacyDate); // Wed Dec 31 23:59:59 CET 2014

Formatting LocalDateTime is the same as formatting time and date. In addition to using predefined formats, we can also define formats ourselves:

The code is as follows:

DateTimeFormatter formatter = DateTimeFormatter
    .ofPattern("MMM dd, yyyy - HH:mm");

LocalDateTime parsed = LocalDateTime.parse("Nov 03, 2014 - 07:13", formatter);
String string = formatter.format(parsed);
System.out.println(string);    // Nov 03, 2014 - 07:13

Unlike java.text.NumberFormat, the new DateTimeFormatter is immutable, so it is thread-safe.

10. Annotations

Multiple annotations are supported in Java 8, let's look at an example to understand what it means. First define a wrapper class Hints annotation to place a set of specific Hint annotations:

code show as below:

@interface Hints { Hint[] value(); }

@Repeatable(Hints.class) @interface Hint { String value(); }

Java 8 allows us to use the same type of annotation multiple times, just mark the annotation with @Repeatable.

Example 1: Using a wrapper class as a container to store multiple annotations (old method)

The code is as follows:

@Hints({@Hint("hint1"), @Hint("hint2")})
class Person {}

Example 2: Using multiple annotations (new method)

The code is as follows:

@Hint("hint1")
@Hint("hint2")
class Person {}

In the second example, the java compiler will implicitly define the @Hints annotation for you. Knowing this will help you use reflection to obtain this information:

code show as below:

Hint hint = Person.class.getAnnotation(Hint.class); System.out.println(hint); // null

Hints hints1 = Person.class.getAnnotation(Hints.class); System.out.println(hints1.value().length); // 2

Hint[] hints2 = Person.class.getAnnotationsByType(Hint.class); System.out.println(hints2.length); // 2

Even if we don't define the @Hints annotation on the Person class, we can still get the @Hints annotation through getAnnotation(Hints.class). A more convenient way is to use getAnnotationsByType to directly get all the @Hint annotations. In addition, Java 8 annotations have been added to two new targets:

code show as below:

@Target({ElementType.TYPE_PARAMETER, ElementType.TYPE_USE}) @interface MyAnnotation {}

This is all about the new features of Java 8, and there are definitely more features waiting to be discovered. There are many useful things in JDK 1.8, such as Arrays.parallelSort, StampedLock and CompletableFuture, etc.

What is the difference between overriding and overloading in Java?

Both method overloading and method overriding are ways to achieve polymorphism. The difference is that the former implements compile-time polymorphism, while the latter implements run-time polymorphism. Overloading occurs within a class: methods with the same name but different parameter lists (different parameter types, a different number of parameters, or both) are treated as overloads. Overriding occurs between a subclass and its parent class: the overriding method must have the same return type as the overridden method of the parent class, must not be less accessible than the overridden method, and cannot declare more exceptions than the overridden method (Liskov substitution principle). Overloading places no special requirements on the return type.

Rules for method overloading:

1. The method names are the same, but the order, type and number of parameters in the parameter list are different.

2. Overloading has nothing to do with the method's return value; it can exist between a parent class and a subclass, or within the same class.

3. Different exceptions can be thrown and different modifiers can be used

Rules for method rewriting:

1. The parameter list must be completely consistent with that of the overridden method, and the return type must be completely consistent with the return type of the overridden method.

2. Constructors cannot be overridden, methods declared final cannot be overridden, and methods declared static cannot be overridden, although a static method with the same signature can be declared again (hidden) in the subclass.

3. The access rights cannot be lower than the access rights of the overridden methods in the parent class.

4. The overriding method can throw any unchecked exception (runtime exception), regardless of whether the overridden method throws exceptions. However, the overriding method cannot throw new checked exceptions, or checked exceptions broader than those declared by the overridden method; it may throw narrower ones.
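A short illustrative example of the two rule sets above (the Shape and Circle classes are invented for the example):

```java
class Shape {
    double area() { return 0; }
    // Overloading: same name, different parameter list, in the same class
    double area(double scale) { return area() * scale; }
}

class Circle extends Shape {
    double r = 2;
    // Overriding: same signature as the parent method, resolved at run time
    @Override
    double area() { return Math.PI * r * r; }
}

public class OverloadVsOverride {
    public static void main(String[] args) {
        Shape s = new Circle();
        System.out.println(s.area());       // ~12.57: run-time polymorphism (overriding)
        System.out.println(s.area(2.0));    // ~25.13: the overload is chosen at compile time
    }
}
```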

What is the difference between interface and abstract class

different:

Abstract class:

1. Constructors can be defined in an abstract class

2. It can contain both abstract methods and concrete methods

3. Members of an abstract class can be private, default, protected, or public

4. Member variables can be defined in an abstract class

5. A class with abstract methods must be declared abstract, but an abstract class does not necessarily have abstract methods

6. An abstract class can contain static methods

7. A class can only extend one abstract class

interface:

1. A constructor cannot be defined in an interface

2. All methods are abstract (before Java 8; since Java 8 an interface may also contain default and static methods)

3. All members of an interface are public

4. Member variables defined in an interface are actually constants (public static final)

5. An interface cannot contain static methods (before Java 8)

6. A class can implement multiple interfaces

same:

1. Cannot be instantiated

2. Abstract classes and interface types can be used as reference types

3. If a class inherits an abstract class or implements an interface, it must implement all of the abstract methods in it; otherwise the class itself must also be declared abstract
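A compact sketch contrasting the two (all names are invented for illustration):

```java
interface Flyable {
    int MAX_ALTITUDE = 10000;   // interface fields are implicitly public static final
    void fly();                 // implicitly public abstract
}

abstract class Bird {
    protected String name;      // abstract classes may hold instance state

    Bird(String name) {         // abstract classes may define constructors
        this.name = name;
    }

    abstract void sing();       // abstract method to be implemented by subclasses

    void describe() {           // concrete method
        System.out.println(name);
    }
}

class Sparrow extends Bird implements Flyable {
    Sparrow() { super("sparrow"); }

    @Override
    void sing() { System.out.println("chirp"); }

    @Override
    public void fly() { System.out.println("flying below " + MAX_ALTITUDE); }
}
```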

How to declare that a class will not be inherited, and in what scenarios will it be used

If a class is modified with final, it cannot have subclasses and cannot be inherited by other classes. If none of the methods in a class need to be overridden and the class should have no subclasses, you can declare the class final.

What is the difference between == and equals in Java

The biggest difference between equals and == is that one is a method and the other is an operator.

==: if the operands are primitive types, it compares whether the values are equal; if they are reference types, it compares whether the two references point to the same object, that is, whether the address values are equal.

equals(): used to compare whether the contents of two objects are equal.

Note: the equals method cannot be used on variables of primitive types, and if equals is not overridden, it compares the addresses of the objects pointed to by the reference variables (the default behaves like ==).
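A small illustration of the difference:

```java
public class EqualsDemo {
    public static void main(String[] args) {
        String a = new String("abc");
        String b = new String("abc");

        System.out.println(a == b);        // false: two distinct objects, different addresses
        System.out.println(a.equals(b));   // true: String overrides equals to compare content

        int x = 5;
        int y = 5;
        System.out.println(x == y);        // true: primitive values are compared directly
    }
}
```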

Differences and usage scenarios of String, StringBuffer, and StringBuilder

The Java platform provides two kinds of string classes: String and StringBuffer/StringBuilder, both of which can store and manipulate strings. The differences are as follows.

1) String is a read-only (immutable) string: the content of the string object that a String variable refers to cannot be changed. Beginners often have this misconception:

String str = "abc";
str = "bcd";

As above, str can apparently be changed! In fact, str is just a reference that points to the string object "abc". The second line makes str point to a new string object "bcd"; the "abc" object itself has not changed, it has simply become unreachable.

2) The string object represented by StringBuffer/StringBuilder can be directly modified.

3) StringBuilder was introduced in Java 5. It is almost identical to StringBuffer; the difference is that it is intended for single-threaded use, because none of its methods are modified with synchronized, so in theory it is more efficient than StringBuffer.
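A brief sketch of the practical difference:

```java
public class StringBuilderDemo {
    public static void main(String[] args) {
        String s = "a";
        s = s + "b";                     // creates a new String "ab"; the original "a" is unchanged

        StringBuilder sb = new StringBuilder("a");
        sb.append("b").append("c");      // modifies the internal buffer in place (not thread-safe)
        System.out.println(sb);          // abc

        StringBuffer buf = new StringBuffer("a");
        buf.append("b");                 // same API, but the methods are synchronized (thread-safe)
        System.out.println(buf);         // ab
    }
}
```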

[Figure: images/StringBuilder.png]

Several ways to implement a proxy in Java

The first type: static proxy. It can only statically proxy certain classes or certain methods. It is not recommended because it is relatively weak, but the coding is simple.
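A minimal sketch of a static proxy (the interface and class names are invented for illustration): the proxy implements the same interface and wraps one fixed target, so every enhancement has to be written by hand.

```java
interface OrderService {
    void createOrder(String item);
}

class OrderServiceImpl implements OrderService {
    @Override
    public void createOrder(String item) {
        System.out.println("create order for " + item);
    }
}

class OrderServiceStaticProxy implements OrderService {
    private final OrderService target;

    OrderServiceStaticProxy(OrderService target) {
        this.target = target;
    }

    @Override
    public void createOrder(String item) {
        System.out.println("before createOrder");   // the enhancement is hard-coded per method
        target.createOrder(item);
        System.out.println("after createOrder");
    }
}
```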

The second type: dynamic proxy, which includes the JDK Proxy dynamic proxy and the CGLIB dynamic proxy.

Proxy is the dynamic proxy built into the JDK.

Features: interface-oriented; no third-party dependency is needed; it can enhance multiple different interfaces; when reading annotations through reflection, only the annotations on the interface can be read.

Principle: interface-oriented; only the methods the implementation class defines by implementing the interface can be enhanced.

Define interface and implementation

package com.proxy;

public interface UserService {
    public String getName(int id);

    public Integer getAge(int id);
}
package com.proxy;

public class UserServiceImpl implements UserService {
    @Override
    public String getName(int id) {
        System.out.println("------getName------");
        return "riemann";
    }

    @Override
    public Integer getAge(int id) {
        System.out.println("------getAge------");
        return 26;
    }
}
package com.proxy;

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;

public class MyInvocationHandler implements InvocationHandler {

    public Object target;

    MyInvocationHandler() {
        super();
    }

    MyInvocationHandler(Object target) {
        super();
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ("getName".equals(method.getName())) {
            System.out.println("++++++before " + method.getName() + "++++++");
            Object result = method.invoke(target, args);
            System.out.println("++++++after " + method.getName() + "++++++");
            return result;
        } else {
            Object result = method.invoke(target, args);
            return result;
        }
    }
}
package com.proxy;

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class Main1 {
    public static void main(String[] args) {
        UserService userService = new UserServiceImpl();
        InvocationHandler invocationHandler = new MyInvocationHandler(userService);
        UserService userServiceProxy = (UserService) Proxy.newProxyInstance(userService.getClass().getClassLoader(),
                userService.getClass().getInterfaces(),invocationHandler);
        System.out.println(userServiceProxy.getName(1));
        System.out.println(userServiceProxy.getAge(1));
    }
}

CGLIB dynamic proxy

Features: proxies by subclassing the target class; a third-party dependency must be imported.

Principle: class-oriented; under the hood, enhancement is implemented by generating a subclass that extends the target class and overrides its methods.

JDK Proxy and CGLIB are both very important proxy approaches; they are the two main ways the underlying Spring AOP is implemented.

The core classes of CGLIB:
net.sf.cglib.proxy.Enhancer – the main enhancement class.
net.sf.cglib.proxy.MethodInterceptor – the main method-interception interface; it is a sub-interface of the Callback interface and must be implemented by the user.
net.sf.cglib.proxy.MethodProxy – CGLIB's proxy for JDK's java.lang.reflect.Method class, which makes it convenient to invoke the original method of the source object, for example:
Object o = methodProxy.invokeSuper(proxy, args); // although the first parameter is the proxied object, this does not cause an infinite loop.

The net.sf.cglib.proxy.MethodInterceptor interface is the most commonly used callback type; it is frequently used by proxy-based AOP to intercept method calls. This interface defines only one method:
public Object intercept(Object object, java.lang.reflect.Method method,
    Object[] args, MethodProxy proxy) throws Throwable;

The first parameter is the proxy object, and the second and third parameters are the intercepted method and the method arguments. The original method can be invoked either through normal reflection with the java.lang.reflect.Method object or with the net.sf.cglib.proxy.MethodProxy object; net.sf.cglib.proxy.MethodProxy is usually preferred because it is faster.

package com.proxy.cglib;

import net.sf.cglib.proxy.MethodInterceptor;
import net.sf.cglib.proxy.MethodProxy;
import java.lang.reflect.Method;
 
public class CglibProxy implements MethodInterceptor {
    @Override
    public Object intercept(Object o, Method method, Object[] args, MethodProxy methodProxy) throws Throwable {
        System.out.println("++++++before " + methodProxy.getSuperName() + "++++++");
        System.out.println(method.getName());
        Object o1 = methodProxy.invokeSuper(o, args);
        System.out.println("++++++before " + methodProxy.getSuperName() + "++++++");
        return o1;
    }
}

package com.proxy.cglib;
 
import com.proxy.UserService;
import com.proxy.UserServiceImpl;
import net.sf.cglib.proxy.Enhancer;
 
public class Main2 {
    public static void main(String[] args) {
        CglibProxy cglibProxy = new CglibProxy();
 
        Enhancer enhancer = new Enhancer();
        enhancer.setSuperclass(UserServiceImpl.class);
        enhancer.setCallback(cglibProxy);
 
        UserService o = (UserService)enhancer.create();
        o.getName(1);
        o.getAge(1);
    }
}

How to use hashcode and equals

equals() comes from java.lang.Object and is used to verify the equality of two objects. The default implementation defined in Object simply compares the two object references. By overriding this method you can define your own rules for deciding whether objects are equal. If you use an ORM to manage objects, make sure hashCode() and equals() access state through getters and setters rather than referencing member variables directly.

hashCode() is derived from java.lang.Object, this method is used to obtain the unique integer (hash code) of a given object. When this object needs to be stored in a data structure such as a hash table, this integer is used to determine the location of the bucket. By default, an object's hashCode() method returns an integer representation of the memory address where the object resides. hashCode() is used by HashTable, HashMap and HashSet. By default, the hashCode() method of the Object class returns the number of the memory address where this object is stored.

The hash algorithm makes finding a record in a hash table O(1). Each record has its own hashcode, and the hash algorithm places the record in the appropriate bucket according to that hashcode. When looking up a record, the hashcode is used first to quickly locate the bucket, and then equals is used to check for equality. If no matching hashcode is found, the element does not exist in the table; even when the bucket is found, equals only has to be run against the few elements that share the same hashcode, and if none of them is equal the element still does not exist in the table.
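A typical equals/hashCode pair following the contract described above (the Point class is illustrative):

```java
import java.util.Objects;

public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                // same reference
        if (!(o instanceof Point)) return false;   // different type
        Point p = (Point) o;
        return x == p.x && y == p.y;               // compare content
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);                 // equal objects must produce equal hash codes
    }
}
```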

[Figure: images/HashMap1.7hashcodequals.png]

The difference between HashMap and HashTable and the underlying implementation

Comparison between HashMap and HashTable

  1. HashTable thread synchronization, HashMap non-thread synchronization.
  2. HashTable does not allow <key, value> to have null values, and HashMap allows <key, value> to have null values.
  3. HashTable uses Enumeration, and HashMap uses Iterator.
  4. The default size of the hash array in HashTable is 11, and the increase method is old*2+1. The default size of the hash array in HashMap is 16, and the increase method is an exponential multiple of 2.

5. Before JDK 1.8, HashMap was array + linked list; since JDK 1.8 it is array + linked list + red-black tree: when the length of a linked list reaches 8, it is converted into a red-black tree.

6. How nodes are inserted into a HashMap bucket: in Java 1.7 new nodes are inserted at the head of the linked list; in Java 1.8 this changed to tail insertion.

7. In Java 1.8's hash(), the high 16 bits of the hash value are mixed into the index calculation, which spreads the hash bits more evenly and reduces the probability of hash collisions.

HashMap expansion optimization:

After expansion, 1.7 applies the rehash algorithm to every element and recalculates its position in the new table. 1.8 takes advantage of the two-fold expansion mechanism, so elements do not need their positions fully recomputed.

When resizing, JDK 1.8 does not recalculate the hash of every element as JDK 1.7 does; instead it uses the high-order bit check **(e.hash & oldCap)** to decide whether an element needs to move. For example, the information of key1 is as follows:

[Figure: images/1621414916379-1621752756248.png]

For key1 the result of e.hash & oldCap has the extra high bit equal to 0; when the result is 0, the element's position does not change during expansion. The information of key2 is as follows:

[Figure: images/1621414931120-1621752756248.png]

When the extra high bit is 1, i.e. the result is non-zero, the element's position changes during expansion, and the new index equals the original index plus the original array length. Unlike 1.7, the HashMap therefore does not have to recompute every position from scratch.
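A tiny sketch of the bit check described above (plain arithmetic, not HashMap internals):

```java
public class ResizeCheck {
    public static void main(String[] args) {
        int oldCap = 16;           // old table length, a power of two
        int hash1 = 0b00101;       // 5:  (hash & oldCap) == 0 -> index stays the same
        int hash2 = 0b10101;       // 21: (hash & oldCap) != 0 -> new index = old index + oldCap

        System.out.println((hash1 & oldCap) == 0);        // true
        System.out.println((hash2 & oldCap) == 0);        // false
        System.out.println(hash1 & (oldCap - 1));         // old index: 5
        System.out.println(hash2 & (2 * oldCap - 1));     // new index: 21 = 5 + 16
    }
}
```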

Why does HashMap double its capacity when expanding?

Looking at the source code:

When an element is stored, its slot is computed with an (n - 1) & hash calculation (and hash & (newCap - 1) after resizing), which uses the & bitwise operator.

When the capacity of the HashMap is 16, its binary representation is 10000 and (n - 1) is 01111; ANDing this with the hash value keeps only the low four bits of the hash.

Now look at a capacity that is not a power of two: when the capacity is 10, the binary is 01010 and (n - 1) is 01001. Adding the same elements, several different hash values produce the same result after the & operation, which is a serious hash collision.

Only when n is a power of two does the & operation amount to simply keeping the low bits of the hash, so no full calculation over all the bits is needed.
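A short sketch of the index calculation for a power-of-two capacity versus one that is not (hash values chosen to make the collision obvious):

```java
public class IndexSpread {
    public static void main(String[] args) {
        int[] hashes = {1, 3, 5, 7};   // four different hash values

        int goodCap = 16;              // power of two: n - 1 = 0b1111, every low bit is usable
        int badCap = 10;               // not a power of two: n - 1 = 0b1001, bits 1 and 2 are always 0

        for (int h : hashes) {
            System.out.println(h + " -> cap16 slot " + (h & (goodCap - 1))
                    + ", cap10 slot " + (h & (badCap - 1)));
        }
        // capacity 16 spreads them over slots 1, 3, 5, 7; capacity 10 maps all four to slot 1
    }
}
```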

How can a HashMap be used in a thread-safe way?

HashMap is not thread-safe, and when writing programs we often need to work around this. The JDK natively provides two ways to give HashMap thread safety.

Method 1: Collections.synchronizedMap() returns a new Map that is thread-safe. This requires you to be used to programming against interfaces, because what is returned is not a HashMap but an implementation of Map.

Method 2: a rewritten HashMap; see java.util.concurrent.ConcurrentHashMap. This is a big improvement over method 1.

Characteristics of method 1:

Collections.synchronizedMap() wraps all the unsafe HashMap methods; even toString and hashCode are wrapped. There are two key points: 1) the classic synchronized keyword is used for mutual exclusion; 2) the proxy pattern is used to create a new class that also implements the Map interface. The synchronized lock is taken on an object, so the first thread to request it gets the lock and the other threads block until they are woken up. Advantage: the implementation is very simple and obvious at a glance. Disadvantage: from the locking point of view, it directly locks the largest possible block of code, so performance is relatively poor.

Characteristics of method 2:

HashMap is reimplemented with several major changes. A new locking mechanism is used: the map is split into multiple independent segments, which reduces the chance of lock contention under high concurrency; NonfairSync is used, which relies on CAS instructions to guarantee atomicity and mutual exclusion. If multiple threads happen to operate on the same segment, only one of them gets to run.

Advantages: the code that requires mutual exclusion is small, so performance is better; ConcurrentHashMap splits the whole map into multiple segments, which greatly lowers the probability of lock contention. Disadvantage: the code is more complicated.

How exceptions are handled in Java

Java handles exceptions in an object-oriented way. Once a method throws an exception, the system automatically looks for a suitable exception handler to deal with it based on the exception object; the various exceptions are classified, and a good interface is provided. In Java, every exception is an object, an instance of Throwable or one of its subclasses. When an exception occurs in a method, an exception object is thrown; it contains the exception information, and the caller can catch the exception through this object and handle it. Java's exception handling is implemented through five keywords: try, catch, throw, throws and finally.

In a Java application, the exception handling mechanism consists of declaring exceptions, throwing exceptions and catching exceptions.

The difference between throw and throws:
(1) The location is different:
throw: inside the method
throws: at the signature of the method, at the declaration of the method

(2) The content is different:
throw + an exception object (a checked exception or a runtime exception)
throws + exception types (multiple types can be declared, separated by commas)

(3) The functions are different:
throw: the point where the exception originates; it creates and throws the exception.
throws: appears in the method declaration and tells the caller which exceptions may occur in this method; the caller then handles them,
either dealing with them itself or continuing to throw them upward.

1. Declaring exceptions with throws

In general, you should catch the exceptions you know how to handle and pass on the ones you do not know how to handle. To pass an exception on, declare it with the throws keyword in the method signature. Note that unchecked exceptions (Error, RuntimeException, or their subclasses) do not need to be declared with throws.

If a checked (compile-time) exception can occur in a method, it must either be handled with try-catch or declared with throws; otherwise a compilation error results.

2. Throwing exceptions with throw

If you feel that an exceptional problem cannot be resolved here and does not need to be handled by the caller either, you can throw it. The throw keyword is used inside a method to throw an exception of type Throwable; any Java code can raise an exception with a throw statement.

3. Catching exceptions with try-catch

A program usually compiles without errors, but unknown errors may still occur at run time. If you do not want to simply propagate them to the upper layer, you need to catch the exception with try...catch... and then handle each kind of exception accordingly.

How to choose the exception handling approach

You can decide whether to catch, declare, or throw an exception according to the following figure.
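A small sketch tying the three mechanisms together (the method name and messages are invented for illustration):

```java
public class ExceptionDemo {

    // throws: declares to the caller that this method may raise an exception
    static int parsePositive(String s) throws IllegalArgumentException {
        int value = Integer.parseInt(s);   // may itself throw NumberFormatException
        if (value < 0) {
            // throw: create and raise the exception where the problem is detected
            throw new IllegalArgumentException("value must be positive: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // try-catch: the caller handles the exceptions it knows how to deal with
        try {
            System.out.println(parsePositive("-3"));
        } catch (IllegalArgumentException e) {
            System.out.println("bad input: " + e.getMessage());
        } finally {
            System.out.println("always runs");
        }
    }
}
```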

How custom exceptions are applied in production

Although Java provides a rich set of exception classes, custom exceptions are still often used in projects, mainly because the exception classes Java provides cannot cover every actual need. For example:

1. Some errors in the system conform to Java syntax but violate business rules.

2. In a layered software structure, exceptions at other levels of the system are usually captured and processed at the presentation layer.
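A typical pattern in practice (the class names are invented): a business exception carrying an error code is thrown by the service layer and translated into a response at the presentation layer.

```java
class BusinessException extends RuntimeException {
    private final int code;

    public BusinessException(int code, String message) {
        super(message);
        this.code = code;
    }

    public int getCode() {
        return code;
    }
}

class AccountService {
    void withdraw(double balance, double amount) {
        if (amount > balance) {
            // valid Java, but a violation of the business rules
            throw new BusinessException(1001, "insufficient balance");
        }
    }
}

public class CustomExceptionDemo {
    public static void main(String[] args) {
        try {
            new AccountService().withdraw(100, 200);
        } catch (BusinessException e) {
            // the presentation layer turns the code and message into a user-facing response
            System.out.println(e.getCode() + ": " + e.getMessage());
        }
    }
}
```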

How to implement an IOC container?

IOC (Inversion of Control) is not a technology but a design idea: you hand over the objects you design to the container to control, instead of controlling them directly inside your objects as in traditional programming.

In traditional programming we create objects directly with new inside the object, so the program actively creates its dependencies; with IOC there is a dedicated container that creates the objects, that is, the IOC container controls object creation.

In traditional applications we actively obtain the dependent objects ourselves; that is the "control". The inversion is that the container creates and injects the dependencies for us: during this process the container looks up the dependent objects and injects them, while our object just passively accepts them.

1. First prepare a basic container object, including a collection of map structures, to facilitate the storage of specific objects in the subsequent process

2. Read the configuration file or analyze the annotation, encapsulate the bean objects that need to be created into BeanDefinition objects and store them in the container

3. The container instantiates the encapsulated BeanDefinition object through reflection to complete the instantiation of the object

4. Perform object initialization operations, that is, set the corresponding attribute values ​​in the class, that is, perform dependency injection, complete the creation of the entire object, and become a complete bean object, which is stored in a map structure of the container

5. Obtain objects through container objects, and perform object acquisition and logical processing

​ 6. Provide a destruction operation, when the object is not used or the container is closed, the useless object will be destroyed

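A very rough sketch of these six steps, written only to make the idea concrete (Spring's real container is far more elaborate; the class names below are invented):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BeanDefinition {                       // step 2: metadata describing a bean to create
    final Class<?> beanClass;
    BeanDefinition(Class<?> beanClass) { this.beanClass = beanClass; }
}

class MiniContainer {
    private final Map<String, BeanDefinition> definitions = new ConcurrentHashMap<>(); // step 1
    private final Map<String, Object> singletons = new ConcurrentHashMap<>();

    void register(String name, Class<?> clazz) {           // step 2: register the definition
        definitions.put(name, new BeanDefinition(clazz));
    }

    Object getBean(String name) throws Exception {         // steps 3-5
        Object bean = singletons.get(name);
        if (bean == null) {
            BeanDefinition bd = definitions.get(name);
            bean = bd.beanClass.getDeclaredConstructor().newInstance(); // step 3: reflection
            // step 4 would populate properties / inject dependencies here
            singletons.put(name, bean);
        }
        return bean;
    }

    void close() { singletons.clear(); }                   // step 6: destroy managed objects
}
```
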
Tell me about your understanding of Spring?

Official website address: https://spring.io/projects/spring-framework#overview

Distribution (zip) download address: https://repo.spring.io/release/org/springframework/spring/

Source code address: https://github.com/spring-projects/spring-framework

Spring makes it easy to create Java enterprise applications. It provides everything you need to embrace the Java language in an enterprise environment, with support for Groovy and Kotlin as alternative languages on the JVM, and with the flexibility to create many kinds of architectures depending on an application’s needs. As of Spring Framework 5.1, Spring requires JDK 8+ (Java SE 8+) and provides out-of-the-box support for JDK 11 LTS. Java SE 8 update 60 is suggested as the minimum patch release for Java 8, but it is generally recommended to use a recent patch release.

Spring supports a wide range of application scenarios. In a large enterprise, applications often exist for a long time and have to run on a JDK and application server whose upgrade cycle is beyond developer control. Others may run as a single jar with the server embedded, possibly in a cloud environment. Yet others may be standalone applications (such as batch or integration workloads) that do not need a server.

Spring is open source. It has a large and active community that provides continuous feedback based on a diverse range of real-world use cases. This has helped Spring to successfully evolve over a very long time.


What do you think is the core of Spring?

​ Spring is an open-source framework.

​ Spring was born to simplify enterprise development, making development more elegant and concise.

​ Spring is an IOC/AOP container framework.

​ IOC: Inversion of Control

​ AOP: Aspect-Oriented Programming

​ Container: it contains application objects and manages their life cycle, much like holding water in a bucket; Spring is the bucket and the objects are the water.

What are the advantages of using Spring?

​ 1. Spring simplifies enterprise Java development through DI, AOP and the elimination of boilerplate code.

​ 2. Beyond the core framework there is a large ecosystem built on top of it, extending Spring into areas such as web services, REST, mobile development and NoSQL.

​ 3. Low-invasive design; pollution of application code is minimal.

​ 4. Independence from application servers: applications built on Spring can truly deliver the Write Once, Run Anywhere promise.

​ 5. Spring's IoC container reduces the complexity of replacing business objects and improves decoupling between components.

​ 6. Spring's AOP support allows common concerns such as security, transactions and logging to be handled centrally, which improves reuse.

​ 7. Spring's ORM and DAO support integrates well with third-party persistence frameworks and simplifies low-level database access.

​ 8. Spring is highly open and does not force the application to depend on it completely; developers may choose to use part or all of the framework.

How does Spring simplify development?

​ Lightweight and minimally invasive programming based on POJOs

​ Loose coupling through dependency injection and programming to interfaces

​ Declarative programming based on aspects and conventions

​ Less boilerplate code thanks to aspects and templates

Tell me about your understanding of AOP?

​ The full name of AOP is Aspect-Oriented Programming. It was born for decoupling, the state programmers constantly pursue while coding, and in terms of isolating cross-cutting concerns from business classes AOP certainly achieves decoupling. It has several core concepts:

  • Aspect: the modularization of a concern that may cut across multiple objects. Transaction management is a classic example of a cross-cutting concern in enterprise Java applications. In Spring AOP an aspect can be implemented either with a regular class (the schema-based approach) or with an ordinary class annotated with @Aspect (the @AspectJ style).

  • Join point: A specific point in the execution of a program, such as when a method is called or when an exception is handled. In Spring AOP, a join point always represents a method execution.

  • Advice: the action taken by an aspect at a particular join point. There are several kinds of advice, including "around", "before" and "after", discussed further below. Many AOP frameworks, including Spring, model advice as interceptors and maintain a chain of interceptors around the join point.

  • Pointcut: An assertion that matches a join point. Advice is associated with a pointcut expression and operates on join points that satisfy this pointcut (for example, when a method of a particular name is executed). How pointcut expressions match join points is at the heart of AOP: Spring uses AspectJ pointcut semantics by default.

  • Introduction: declaring additional methods or fields on behalf of a type. Spring allows new interfaces (and a corresponding implementation) to be introduced to any advised object. For example, an introduction can be used to make a bean implement an IsModified interface in order to simplify caching (in the AspectJ community, introductions are also known as inter-type declarations).

  • Target object: The object to be advised by one or more aspects. Also known as the advised object. Since Spring AOP is implemented through a runtime proxy, this object is always a proxied object.

  • AOP proxy (AOP proxy): The object created by the AOP framework is used to implement the aspect contract (including functions such as notification method execution). In Spring, AOP proxies can be JDK dynamic proxies or CGLIB proxies.

  • Weaving: The process of connecting aspects to other application types or objects and creating an object to be advised. This process can be done at compile time (eg using the AspectJ compiler), classload time or runtime. Spring, like other pure Java AOP frameworks, is weaved at runtime.

    These concepts sound rather academic; explained more simply, it is actually quite straightforward:

    Any system is composed of different components, each responsible for a specific function, and many of them have nothing to do with the business itself, such as logging, transactions and permission checks. These core services often end up woven into the concrete business logic; if we repeated such code around every business operation, the code would become extremely redundant. So we abstract these common pieces of logic into an aspect and then inject it into the target objects (the concrete business). AOP is implemented on exactly this idea: through dynamic proxies, the objects that need the aspect are proxied, and when a call is made the common logic is applied around it, without modifying the original business code; the original business logic is simply enhanced on top.
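
As a small illustration of these concepts, the sketch below declares one aspect with one around advice; the pointcut expression and package name are assumptions for the example:

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

@Aspect        // the aspect: a module for the cross-cutting timing/logging concern
@Component
public class TimingAspect {

    // pointcut: matches join points, here every method in an assumed service package
    @Around("execution(* com.example.service..*(..))")
    public Object logTime(ProceedingJoinPoint pjp) throws Throwable {   // around advice
        long start = System.currentTimeMillis();
        try {
            return pjp.proceed();                    // invoke the target (advised) object
        } finally {
            System.out.println(pjp.getSignature() + " took "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }
}
```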

Tell me about your understanding of IOC?

	IoC is also known as dependency injection (DI). It is a process whereby objects define their dependencies (that is, the other objects they work with) only through constructor arguments, arguments to a factory method, or properties that are set on the object instance after it is constructed or returned from a factory method. The container then injects those dependencies when it creates the bean. This process is fundamentally the inverse (hence the name, Inversion of Control) of the bean itself controlling the instantiation or location of its dependencies by using direct construction of classes or a mechanism such as the Service Locator pattern.

​ If this process is difficult to understand, then you can imagine the process of finding a girlfriend by yourself and a matchmaking company. If this process can be understood, then we now answer the above questions:

1. Who controls whom: previously we created whatever objects we needed ourselves, so the programmer controlled the objects; with an IOC container, it is the container that controls the objects.
2. What is controlled: the objects required during execution and the objects they depend on.
3. What does "inversion" mean: before IOC we actively created dependent objects inside our own objects, which is the "forward" direction; with IOC the dependencies are created by the container and then injected into the object, so active creation becomes passive acceptance, which is the inversion.
4. Which aspect is inverted: the acquisition of the dependent objects.
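
A small sketch of what this inversion looks like in code; the package and class names are invented for illustration:

```java
package com.example.demo;   // assumed package so component scanning has a base package

import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

@Repository
class UserDao { }

@Service
class UserService {
    private final UserDao userDao;            // the dependency is declared, not created with new

    UserService(UserDao userDao) {            // the container injects it through the constructor
        this.userDao = userDao;
    }
}

@Configuration
@ComponentScan
class AppConfig { }

class IocDemo {
    public static void main(String[] args) {
        try (AnnotationConfigApplicationContext ctx =
                     new AnnotationConfigApplicationContext(AppConfig.class)) {
            UserService service = ctx.getBean(UserService.class);   // passively receives UserDao
            System.out.println(service);
        }
    }
}
```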

What is the difference between BeanFactory and ApplicationContext

Similarities:

  • Spring provides two kinds of IOC container, BeanFactory and ApplicationContext, and both are Java interfaces. ApplicationContext ultimately extends BeanFactory (ApplicationContext extends ListableBeanFactory, which in turn extends BeanFactory).
  • Both can be configured from XML and both support automatic injection of properties.
  • Both provide beans in the same way, via getBean("beanName").

Differences:

  • BeanFactory instantiates a bean only when getBean() is called, whereas ApplicationContext instantiates singleton beans eagerly when the container starts, without waiting for getBean().
  • BeanFactory does not support internationalization (i18n); ApplicationContext does.
  • ApplicationContext can publish events to beans registered as listeners; BeanFactory cannot.
  • A typical implementation of BeanFactory is XmlBeanFactory, and a typical implementation of ApplicationContext is ClassPathXmlApplicationContext; in a web container we use WebApplicationContext, which adds a getServletContext method.
  • With BeanFactory, enabling annotation-driven autowiring requires registering AutowiredAnnotationBeanPostProcessor programmatically; with ApplicationContext it can simply be switched on in the XML configuration.
  • In short, BeanFactory provides basic IOC and DI functionality, while ApplicationContext provides more advanced features. BeanFactory can be used for testing or non-production scenarios, but ApplicationContext is the richer container implementation and should generally be preferred over BeanFactory.
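
A hedged sketch of the lazy-versus-eager difference; the beans.xml file and the userService bean name are assumptions:

```java
import org.springframework.beans.factory.support.DefaultListableBeanFactory;
import org.springframework.beans.factory.xml.XmlBeanDefinitionReader;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.core.io.ClassPathResource;

class ContainerDemo {
    public static void main(String[] args) {
        // ApplicationContext: singleton beans are created as soon as the context starts
        ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext("beans.xml");
        Object a = ctx.getBean("userService");

        // BeanFactory: nothing is instantiated until getBean() is called
        DefaultListableBeanFactory factory = new DefaultListableBeanFactory();
        new XmlBeanDefinitionReader(factory).loadBeanDefinitions(new ClassPathResource("beans.xml"));
        Object b = factory.getBean("userService");

        ctx.close();
    }
}
```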

Briefly describe the life cycle of spring beans?

[Figure: Spring bean life cycle]

1. Instantiate the bean object

​ The object is created by reflection; at this point only heap space has been allocated, and all properties still hold their default values.

2. Set object properties

​ Set the value of the attribute in the object

3. Check Aware-related interfaces and set related dependencies

​ If the bean needs references to objects inside the container, the corresponding Aware interface callbacks (for example BeanNameAware or BeanFactoryAware) are invoked to set them uniformly.

4. Pre-processing of BeanPostProcessor

​ Perform pre-processing work on the generated bean object

5. Check whether the bean implements InitializingBean to decide whether to call the afterPropertiesSet method

​ If the current bean implements the InitializingBean interface, its afterPropertiesSet method is called once the properties have been set.

6. Check whether there is a custom init-method configured

​ If the current bean object defines an initialization method, then call the initialization method here

7. BeanPostProcessor post-processing

​ Perform post-processing work on the generated bean object

8. Register the necessary destruction-related callback interfaces

​ The destruction callbacks are registered here so that the object can be cleaned up conveniently later.

9. Get and use the bean object

​ Obtain objects through containers and use them

10. Whether to implement the DisposableBean interface

​ Determine whether the DisposableBean interface is implemented, and call the specific method to destroy the object

11. Is there a custom destroy method configured?

​ If the current bean object defines a destruction method, then call the destruction method here

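A hedged sketch of a bean that touches several of these hooks (the log messages are illustrative, and the custom init/destroy methods would still have to be wired up via init-method/destroy-method):

```java
import org.springframework.beans.factory.BeanNameAware;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.InitializingBean;

public class LifecycleBean implements BeanNameAware, InitializingBean, DisposableBean {

    private String name;

    public void setName(String name) {                 // 2. property population
        this.name = name;
    }

    @Override
    public void setBeanName(String beanName) {         // 3. Aware callback
        System.out.println("aware of bean name: " + beanName);
    }

    @Override
    public void afterPropertiesSet() {                 // 5. InitializingBean
        System.out.println("afterPropertiesSet, name=" + name);
    }

    public void customInit() {                         // 6. init-method, if configured
        System.out.println("custom init-method");
    }

    @Override
    public void destroy() {                            // 10. DisposableBean
        System.out.println("destroy");
    }

    public void customDestroy() {                      // 11. destroy-method, if configured
        System.out.println("custom destroy-method");
    }
}
```
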
What are the bean scopes supported by spring?

① singleton

When using this attribute to define a bean, the IOC container only creates one bean instance, and the IOC container returns the same bean instance each time.

② prototype

When using this attribute to define a bean, the IOC container can create multiple bean instances, each returning a new instance.

③ request

This attribute only works on HTTP requests. When using this attribute to define a bean, each HTTP request will create a new bean, which is suitable for the WebApplicationContext environment.

④ session

This attribute is only used for HTTP Session, and the same Session shares a Bean instance. Different Sessions use different instances.

⑤ global-session

This attribute is only used for HTTP Session. Unlike the session scope, all Sessions share a Bean instance.
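
Declared with annotations, the first two scopes look roughly like this sketch (the configuration class and bean methods are invented for illustration):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Scope;

@Configuration
class ScopeConfig {

    @Bean                               // singleton by default: the container keeps one shared instance
    public StringBuilder sharedBuffer() {
        return new StringBuilder();
    }

    @Bean
    @Scope("prototype")                 // a brand-new instance for every request for the bean
    public StringBuilder perUseBuffer() {
        return new StringBuilder();
    }

    // in a web environment, request/session scopes are declared the same way,
    // e.g. @Scope(WebApplicationContext.SCOPE_REQUEST)
}
```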

Are singleton beans in the Spring framework thread-safe?

Bean objects in Spring are singletons by default, and the framework does not wrap beans with any multi-threading protection.

​ If a bean is stateful, the developer has to ensure thread safety. The simplest way is to change the bean's scope from singleton to prototype, so that every request for the bean effectively creates a new object, which guarantees thread safety.

​ "Stateful" means the bean stores data.

​ "Stateless" means the bean stores no data. Our controllers, services and DAOs are not thread-safe in themselves; they only call methods, and when multiple threads invoke a method on the same instance, the method's local variables live in each thread's own working memory (its stack), which is why such calls are safe.

​ Therefore, when using beans, do not declare stateful instance variables or class variables in them. If such a variable is really necessary, it is recommended to use ThreadLocal to make it private to each thread. If an instance variable or class variable of the bean must be shared between multiple threads, then you can only use synchronized, lock, CAS and similar means to achieve thread synchronization.

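A minimal sketch of the ThreadLocal suggestion; the date-formatting use case is just one example of non-thread-safe state:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import org.springframework.stereotype.Service;

@Service
public class DateService {

    // SimpleDateFormat is not thread-safe, so each thread gets its own copy
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public String today() {
        return FORMAT.get().format(new Date());
    }
}
```
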
What design patterns are used in the Spring framework, and where?

1. Factory pattern, used when creating the various BeanFactory and ApplicationContext implementations

2. Template method pattern, also used across the BeanFactory and ApplicationContext implementations

3. Proxy pattern: Spring AOP applies aspects through proxies; its AspectJ-style support is implemented with dynamic proxies underneath

​ 4. Strategy pattern: resources are loaded in different ways, e.g. ClassPathResource, FileSystemResource, ServletContextResource, UrlResource, all of which share the common interface Resource; in the AOP implementation two different strategies are used as well, JDK dynamic proxies and CGLIB proxies

5. Singleton pattern, for example when creating beans.

6. Observer pattern: ApplicationEvent, ApplicationListener and ApplicationEventPublisher in Spring

7. Adapter pattern: MethodBeforeAdviceAdapter, ThrowsAdviceAdapter, AfterReturningAdviceAdapter

8. Decorator pattern: the types whose names contain Wrapper or Decorator in the source code

What is the principle behind Spring's transaction implementation?

​ When using the Spring framework there are two ways to implement transactions: programmatic transactions, where the user controls the transaction logic in code, and declarative transactions, implemented through the @Transactional annotation.

​ Transaction handling is really controlled by the database, but to make business logic easier to write Spring wraps and extends this capability. Programmatic transactions are rarely used in practice; most of the time we add the @Transactional annotation. Once the annotation is applied, auto-commit is switched off and the Spring framework takes over control of commit and rollback.

​ Transactions are in fact a core application of AOP. When a method is annotated with @Transactional, Spring generates a proxy object for the class and uses that proxy as the bean. When a method on the proxy is called and transaction handling applies, auto-commit is first turned off and then the actual business logic runs; if the logic completes without an exception the proxy commits directly, and if any exception occurs it rolls back. The user can of course configure which exceptions trigger rollback.

The core advice class behind this is TransactionInterceptor.
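
A hedged sketch of the declarative style; the service, repository and rollback rule are illustrative:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransferService {

    private final AccountRepository accounts;      // assumed repository

    public TransferService(AccountRepository accounts) {
        this.accounts = accounts;
    }

    // rollback is extended to checked exceptions as well
    @Transactional(rollbackFor = Exception.class)
    public void transfer(long fromId, long toId, long amount) throws Exception {
        accounts.debit(fromId, amount);
        accounts.credit(toId, amount);              // if this throws, the debit is rolled back
    }
}

interface AccountRepository {                       // assumed interface, e.g. backed by MyBatis/JPA
    void debit(long accountId, long amount);
    void credit(long accountId, long amount);
}
```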

What are the transaction isolation levels in Spring?

​ The transaction isolation levels in Spring are simply the database isolation levels:

​ read uncommitted

​ read committed

​ repeatable read

​ serializable

​ When configuring, if the isolation level in the database and the one in the Spring code differ, the Spring configuration takes precedence.

What is Spring's transaction propagation mechanism?

​ When several transactional methods call one another, the propagation behaviour determines how the transaction is propagated between them. Spring provides 7 propagation behaviours to keep transactions working correctly:

​ REQUIRED: the default. If there is no current transaction, start a new one; if a transaction already exists, join it.

​ SUPPORTS: if a current transaction exists, join it; otherwise execute non-transactionally.

​ MANDATORY: if a current transaction exists, join it; if not, throw an exception.

​ REQUIRES_NEW: create a new transaction; if a current transaction exists, suspend it.

​ NOT_SUPPORTED: execute non-transactionally; if a current transaction exists, suspend it.

​ NEVER: do not use a transaction; if a current transaction exists, throw an exception.

​ NESTED: if a current transaction exists, execute within a nested transaction; otherwise behave like REQUIRED.

​ The difference between NESTED and REQUIRES_NEW:

​ REQUIRES_NEW creates a brand-new transaction that has nothing to do with the original one, while NESTED starts a nested transaction inside the current one. With NESTED, when the parent transaction rolls back, the child transaction rolls back too; with REQUIRES_NEW, rolling back the original transaction does not affect the newly started one.

​ The difference between NESTED and REQUIRED:

​ With REQUIRED, when the caller already has a transaction, callee and caller share the same transaction, so when the callee throws an exception the shared transaction is rolled back whether or not the exception is caught. With NESTED, when the callee throws an exception the caller can catch it, so that only the child (nested) transaction is rolled back and the parent transaction is not affected.

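A hedged sketch of the REQUIRES_NEW behaviour (the audit scenario and class names are invented; note the call goes through another injected bean, so the transactional proxy is not bypassed):

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class AuditedOrderService {

    private final AuditService auditService;

    public AuditedOrderService(AuditService auditService) {
        this.auditService = auditService;
    }

    @Transactional  // REQUIRED by default
    public void placeOrder() {
        // ... business writes in the outer transaction ...
        auditService.writeAuditLog();   // runs in its own, independent transaction
    }
}

@Service
class AuditService {
    // even if placeOrder() later rolls back, the audit record stays committed
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void writeAuditLog() {
        // ... insert an audit record ...
    }
}
```
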
When will the spring transaction fail?

1. The bean object is not managed by the spring container

2. The access modifier of the method is not public

3. Self-calling problem

4. The data source is not configured with a transaction manager

5. The database does not support transactions

6. The exception is caught

7. Exception type error or configuration error

What is bean autowiring and what are its modes?

​ Bean autowiring means that when a bean is injected, its property values are looked up in the container according to a specific rule and set onto the object automatically. There are five main modes, and a sketch of the annotation-side counterparts follows this list:

​ no – the default; autowiring is off and references are set manually through the "ref" attribute, which is the most common approach in projects
​ byName – autowire by property name: if a bean's name matches the name of another bean's property, it is wired in
​ byType – autowire by data type: if a bean's type is compatible with the type of another bean's property, it is wired in
​ constructor – the byType mode applied to constructor arguments
​ autodetect – if a default constructor is found, use constructor autowiring; otherwise autowire by type
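
These are the XML autowire modes; on the annotation side the closest counterparts look like the sketch below (@Autowired matches by type, @Qualifier narrows by bean name; the interface and qualifier value are assumptions):

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

interface UserDao { }                    // assumed dependency with several possible implementations

@Service
public class UserManager {

    @Autowired                           // resolved by type, like byType
    @Qualifier("mysqlUserDao")           // disambiguated by bean name, in the spirit of byName
    private UserDao userDao;
}
```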

What is the difference between Spring, Spring MVC, and Spring Boot?

​ Spring and Spring MVC:

  1. Spring is a one-stop, lightweight Java development framework whose core is Inversion of Control (IOC) and Aspect-Oriented Programming (AOP). It provides a variety of configuration solutions for the web layer (Spring MVC), the business layer (IoC), the persistence layer (JdbcTemplate) and so on;

  2. Spring MVC is an MVC framework built on top of Spring. It mainly handles URL mapping and view rendering for web development and is the web-layer part of the Spring framework;

Spring MVC and Spring Boot:

1. Spring MVC is an MVC framework for enterprise web development. It covers front-end views, file configuration, back-end interface logic and so on, and its XML and config setup is relatively tedious and complex;

2. Compared with Spring MVC, Spring Boot focuses more on developing microservice back-end interfaces rather than front-end views. It follows convention over configuration and simplifies plug-in setup; no XML is required, which greatly streamlines configuration compared with Spring MVC;

Summary:

1. The Spring framework is like a family with many derivative products such as Boot, Security, JPA and so on, but their common foundation is Spring's IOC and AOP: IOC provides the dependency-injection container, AOP handles cross-cutting concerns, and the advanced features of the other products are built on top of these two;

2. Spring MVC mainly solves web development: it is a Servlet-based MVC framework that uses XML configuration and develops the front-end views and back-end logic together;

3. Because Spring configuration is very complex (XML, JavaConfig and servlet handling are all tedious), Spring Boot was created to simplify things for developers, using convention over configuration to streamline Spring MVC's setup. The difference from Spring MVC is that Spring Boot focuses on single-service (microservice) interface development, decoupled from the front end; although Spring Boot can also be used to develop front end and back end together in the Spring MVC style, that somewhat goes against its original intent;

What is the Spring MVC workflow?

​ When a request is initiated, it is intercepted by the front controller, which looks up the actual controller that should handle it based on the request. The controller processes the request, builds a data model (accessing the database if necessary) and returns the model to the front controller. The front controller then has the view resolver and view render the result with the model, and the rendered result is returned through the front controller to the requester.

[Figure: Spring MVC request processing flow]

1. DispatcherServlet is the front controller and the control centre of the whole Spring MVC. The user issues a request, and DispatcherServlet receives and intercepts it.
2. HandlerMapping is the handler mapping. DispatcherServlet calls HandlerMapping, which looks up the Handler according to the request URL.
3. HandlerMapping returns the handler execution chain (the controller found for the URL) and passes this information back to DispatcherServlet.
4. HandlerAdapter is the handler adapter, which executes the Handler according to specific rules.
5. The HandlerAdapter invokes the specific handler (controller).
6. The Controller returns its execution result, such as a ModelAndView, to the HandlerAdapter.
7. The HandlerAdapter passes the logical view name and model to the DispatcherServlet.
8. DispatcherServlet calls the view resolver (ViewResolver) to resolve the logical view name passed by the HandlerAdapter.
9. The view resolver returns the resolved view to DispatcherServlet.
10. DispatcherServlet invokes the concrete view according to the result from the view resolver and renders it.
11. The response data is returned to the client.
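
A tiny handler that fits into this flow might look like the following sketch (the URL, attribute and view name are invented):

```java
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;

@Controller
public class HelloController {

    @GetMapping("/hello")                        // HandlerMapping finds this method by URL
    public String hello(Model model) {           // HandlerAdapter invokes it
        model.addAttribute("msg", "hello spring mvc");
        return "hello";                          // logical view name, resolved by the ViewResolver
    }
}
```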

What are the nine components of springmvc?

1. HandlerMapping
Finds the handler corresponding to a request. A Handler (Controller) has two forms: class-based handlers and method-based handlers (the latter is what we commonly use).

2. HandlerAdapter
The adapter that invokes the Handler. If the Handler (Controller) is thought of as a tool, the HandlerAdapter is the worker who actually uses it.

3. HandlerExceptionResolver
Handles exceptions.

4. ViewResolver
Resolves a String view name together with a Locale into a View object.

5. RequestToViewNameTranslator
Some handlers (Controllers) finish without setting a view name, for example void methods; in that case the view name has to be derived from the request.

6. LocaleResolver
Resolves the Locale from the request. A Locale represents a region, such as zh-cn; showing different results to users from different regions is i18n (Spring MVC provides a dedicated interceptor, LocaleChangeInterceptor, for switching it).

7. ThemeResolver
Theme resolution, similar to changing the theme (different UI, CSS, etc.) on a phone.

8. MultipartResolver
Handles upload requests by wrapping an ordinary request into a MultipartHttpServletRequest.

9. FlashMapManager
Manages FlashMap, which is used to pass parameters across a redirect.

What is the principle of Spring Boot auto-configuration?

In an earlier lesson we went through the Spring Boot startup process. In interviews, however, what is asked most often is the principle of auto-configuration. Auto-configuration happens during startup (we deliberately skipped over it at first), so the process is explained in detail below.

1. In the startup process of springboot, there is a step to create a context. If you don’t remember, you can see the following code:

public ConfigurableApplicationContext run(String... args) {
		StopWatch stopWatch = new StopWatch();
		stopWatch.start();
		ConfigurableApplicationContext context = null;
		Collection<SpringBootExceptionReporter> exceptionReporters = new ArrayList<>();
		configureHeadlessProperty();
		SpringApplicationRunListeners listeners = getRunListeners(args);
		listeners.starting();
		try {
			ApplicationArguments applicationArguments = new DefaultApplicationArguments(args);
			ConfigurableEnvironment environment = prepareEnvironment(listeners, applicationArguments);
			configureIgnoreBeanInfo(environment);
			Banner printedBanner = printBanner(environment);
			context = createApplicationContext();
			exceptionReporters = getSpringFactoriesInstances(SpringBootExceptionReporter.class,
					new Class[] { ConfigurableApplicationContext.class }, context);
			// the auto-configuration process is completed inside prepareContext
			prepareContext(context, environment, listeners, applicationArguments, printedBanner);
			refreshContext(context);
			afterRefresh(context, applicationArguments);
			stopWatch.stop();
			if (this.logStartupInfo) {
				new StartupInfoLogger(this.mainApplicationClass).logStarted(getApplicationLog(), stopWatch);
			}
			listeners.started(context);
			callRunners(context, applicationArguments);
		}
		catch (Throwable ex) {
			handleRunFailure(context, ex, exceptionReporters, listeners);
			throw new IllegalStateException(ex);
		}

		try {
			listeners.running(context);
		}
		catch (Throwable ex) {
			handleRunFailure(context, ex, exceptionReporters, null);
			throw new IllegalStateException(ex);
		}
		return context;
	}

2. Find the load method in the prepareContext method, click inward layer by layer, and find the final load method

//the prepareContext method
	private void prepareContext(ConfigurableApplicationContext context, ConfigurableEnvironment environment,
			SpringApplicationRunListeners listeners, ApplicationArguments applicationArguments, Banner printedBanner) {
    
    
		context.setEnvironment(environment);
		postProcessApplicationContext(context);
		applyInitializers(context);
		listeners.contextPrepared(context);
		if (this.logStartupInfo) {
    
    
			logStartupInfo(context.getParent() == null);
			logStartupProfileInfo(context);
		}
		// Add boot specific singleton beans
		ConfigurableListableBeanFactory beanFactory = context.getBeanFactory();
		beanFactory.registerSingleton("springApplicationArguments", applicationArguments);
		if (printedBanner != null) {
    
    
			beanFactory.registerSingleton("springBootBanner", printedBanner);
		}
		if (beanFactory instanceof DefaultListableBeanFactory) {
    
    
			((DefaultListableBeanFactory) beanFactory)
					.setAllowBeanDefinitionOverriding(this.allowBeanDefinitionOverriding);
		}
		if (this.lazyInitialization) {
    
    
			context.addBeanFactoryPostProcessor(new LazyInitializationBeanFactoryPostProcessor());
		}
		// Load the sources
		Set<Object> sources = getAllSources();
		Assert.notEmpty(sources, "Sources must not be empty");
        //the load method below does this work
		load(context, sources.toArray(new Object[0]));
		listeners.contextLoaded(context);
	}


	/**
	 * Load beans into the application context.
	 * @param context the context to load beans into
	 * @param sources the sources to load
	 * (load bean objects into the context)
	 */
	protected void load(ApplicationContext context, Object[] sources) {
    
    
		if (logger.isDebugEnabled()) {
    
    
			logger.debug("Loading source " + StringUtils.arrayToCommaDelimitedString(sources));
		}
        //obtain the loader for bean definitions
		BeanDefinitionLoader loader = createBeanDefinitionLoader(getBeanDefinitionRegistry(context), sources);
		if (this.beanNameGenerator != null) {
    
    
			loader.setBeanNameGenerator(this.beanNameGenerator);
		}
		if (this.resourceLoader != null) {
    
    
			loader.setResourceLoader(this.resourceLoader);
		}
		if (this.environment != null) {
    
    
			loader.setEnvironment(this.environment);
		}
		loader.load();
	}

	/**
	 * Load the sources into the reader.
	 * @return the number of loaded beans
	 */
	int load() {
    
    
		int count = 0;
		for (Object source : this.sources) {
    
    
			count += load(source);
		}
		return count;
	}

3. It is the load method in BeanDefinitionLoader that actually executes the load, as follows:

	//the method that actually loads a bean
	private int load(Object source) {
    
    
		Assert.notNull(source, "Source must not be null");
        //a Class source: use annotation-based loading
		if (source instanceof Class<?>) {
    
    
			return load((Class<?>) source);
		}
        //a Resource source: use XML parsing
		if (source instanceof Resource) {
    
    
			return load((Resource) source);
		}
        //a Package source: scan the package, e.g. @ComponentScan
		if (source instanceof Package) {
    
    
			return load((Package) source);
		}
        //a CharSequence source: load it directly
		if (source instanceof CharSequence) {
    
    
			return load((CharSequence) source);
		}
		throw new IllegalArgumentException("Invalid source type " + source.getClass());
	}

4. The following method will be used to determine whether the type of resource is loaded by groovy or by annotation

	private int load(Class<?> source) {
    
    
        //check whether a Groovy script is being used
		if (isGroovyPresent() && GroovyBeanDefinitionSource.class.isAssignableFrom(source)) {
    
    
			// Any GroovyLoaders added in beans{} DSL can contribute beans here
			GroovyBeanDefinitionSource loader = BeanUtils.instantiateClass(source, GroovyBeanDefinitionSource.class);
			load(loader);
		}
        //load via annotations
		if (isComponent(source)) {
    
    
			this.annotatedReader.register(source);
			return 1;
		}
		return 0;
	}

5. The following method checks whether the startup class carries the @Component annotation. Curiously, our startup class does not declare @Component directly; however, MergedAnnotations is created with the parameter SearchStrategy.TYPE_HIERARCHY, so the whole annotation hierarchy is searched: @SpringBootApplication → @SpringBootConfiguration → @Configuration → @Component. Once @Component is found, the class is registered through the AnnotatedBeanDefinitionReader object.

private boolean isComponent(Class<?> type) {
    
    
   // This has to be a bit of a guess. The only way to be sure that this type is
   // eligible is to make a bean definition out of it and try to instantiate it.
   if (MergedAnnotations.from(type, SearchStrategy.TYPE_HIERARCHY).isPresent(Component.class)) {
    
    
      return true;
   }
   // Nested anonymous classes are not eligible for registration, nor are groovy
   // closures
   return !type.getName().matches(".*\\$_.*closure.*") && !type.isAnonymousClass()
         && type.getConstructors() != null && type.getConstructors().length != 0;
}

	/**
	 * Register a bean from the given bean class, deriving its metadata from
	 * class-declared annotations.
	 * (register a bean from the given class, deriving its metadata from annotations)
	 */
	private <T> void doRegisterBean(Class<T> beanClass, @Nullable String name,
			@Nullable Class<? extends Annotation>[] qualifiers, @Nullable Supplier<T> supplier,
			@Nullable BeanDefinitionCustomizer[] customizers) {
    
    

		AnnotatedGenericBeanDefinition abd = new AnnotatedGenericBeanDefinition(beanClass);
		if (this.conditionEvaluator.shouldSkip(abd.getMetadata())) {
    
    
			return;
		}

		abd.setInstanceSupplier(supplier);
		ScopeMetadata scopeMetadata = this.scopeMetadataResolver.resolveScopeMetadata(abd);
		abd.setScope(scopeMetadata.getScopeName());
		String beanName = (name != null ? name : this.beanNameGenerator.generateBeanName(abd, this.registry));

		AnnotationConfigUtils.processCommonDefinitionAnnotations(abd);
		if (qualifiers != null) {
    
    
			for (Class<? extends Annotation> qualifier : qualifiers) {
    
    
				if (Primary.class == qualifier) {
    
    
					abd.setPrimary(true);
				}
				else if (Lazy.class == qualifier) {
    
    
					abd.setLazyInit(true);
				}
				else {
    
    
					abd.addQualifier(new AutowireCandidateQualifier(qualifier));
				}
			}
		}
		if (customizers != null) {
    
    
			for (BeanDefinitionCustomizer customizer : customizers) {
    
    
				customizer.customize(abd);
			}
		}

		BeanDefinitionHolder definitionHolder = new BeanDefinitionHolder(abd, beanName);
		definitionHolder = AnnotationConfigUtils.applyScopedProxyMode(scopeMetadata, definitionHolder, this.registry);
		BeanDefinitionReaderUtils.registerBeanDefinition(definitionHolder, this.registry);
	}

	/**
	 * Register the given bean definition with the given bean factory.
	 * (registers the main class; aliases are registered if present)
	 */
	public static void registerBeanDefinition(
			BeanDefinitionHolder definitionHolder, BeanDefinitionRegistry registry)
			throws BeanDefinitionStoreException {
    
    

		// Register bean definition under primary name.
		String beanName = definitionHolder.getBeanName();
		registry.registerBeanDefinition(beanName, definitionHolder.getBeanDefinition());

		// Register aliases for bean name, if any.
		String[] aliases = definitionHolder.getAliases();
		if (aliases != null) {
    
    
			for (String alias : aliases) {
    
    
				registry.registerAlias(beanName, alias);
			}
		}
	}

//@SpringBootApplication
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@SpringBootConfiguration
@EnableAutoConfiguration
@ComponentScan(excludeFilters = { @Filter(type = FilterType.CUSTOM, classes = TypeExcludeFilter.class),
		@Filter(type = FilterType.CUSTOM, classes = AutoConfigurationExcludeFilter.class) })
public @interface SpringBootApplication {
}

//@SpringBootConfiguration
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Configuration
public @interface SpringBootConfiguration {
}

//@Configuration
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Component
public @interface Configuration {
}

So far the code above has only registered the startup class; the actual auto-configuration has not begun yet, so let's move on to it.

6. Automatically assemble the entry, starting from refreshing the container

@Override
	public void refresh() throws BeansException, IllegalStateException {
    
    
		synchronized (this.startupShutdownMonitor) {
    
    
			// Prepare this context for refreshing.
			prepareRefresh();

			// Tell the subclass to refresh the internal bean factory.
			ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();

			// Prepare the bean factory for use in this context.
			prepareBeanFactory(beanFactory);

			try {
    
    
				// Allows post-processing of the bean factory in context subclasses.
				postProcessBeanFactory(beanFactory);

				// Invoke factory processors registered as beans in the context.
                // this is the entry point of auto-configuration
				invokeBeanFactoryPostProcessors(beanFactory);
            }

7. Complete the instantiation and execution of the bean in the invokeBeanFactoryPostProcessors method

/**
	 * Instantiate and invoke all registered BeanFactoryPostProcessor beans,
	 * respecting explicit order if given.
	 * <p>Must be called before singleton instantiation.
	 */
	protected void invokeBeanFactoryPostProcessors(ConfigurableListableBeanFactory beanFactory) {
    
    
        //start executing the BeanFactoryPostProcessor implementations; note that BeanFactoryPostProcessor is a Spring extension interface that can modify bean metadata before the container is refreshed
		PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(beanFactory, getBeanFactoryPostProcessors());

		// Detect a LoadTimeWeaver and prepare for weaving, if found in the meantime
		// (e.g. through an @Bean method registered by ConfigurationClassPostProcessor)
		if (beanFactory.getTempClassLoader() == null && beanFactory.containsBean(LOAD_TIME_WEAVER_BEAN_NAME)) {
    
    
			beanFactory.addBeanPostProcessor(new LoadTimeWeaverAwareProcessor(beanFactory));
			beanFactory.setTempClassLoader(new ContextTypeMatchClassLoader(beanFactory.getBeanClassLoader()));
		}
	}

8. View the specific execution method of invokeBeanFactoryPostProcessors

	public static void invokeBeanFactoryPostProcessors(
			ConfigurableListableBeanFactory beanFactory, List<BeanFactoryPostProcessor> beanFactoryPostProcessors) {
    
    

		// Invoke BeanDefinitionRegistryPostProcessors first, if any.
		Set<String> processedBeans = new HashSet<>();

		if (beanFactory instanceof BeanDefinitionRegistry) {
    
    
			BeanDefinitionRegistry registry = (BeanDefinitionRegistry) beanFactory;
			List<BeanFactoryPostProcessor> regularPostProcessors = new ArrayList<>();
			List<BeanDefinitionRegistryPostProcessor> registryProcessors = new ArrayList<>();
			//iterate over the passed-in post-processors: BeanDefinitionRegistryPostProcessor instances go into the registry collection, the rest into regularPostProcessors
			for (BeanFactoryPostProcessor postProcessor : beanFactoryPostProcessors) {
    
    
				if (postProcessor instanceof BeanDefinitionRegistryPostProcessor) {
    
    
					BeanDefinitionRegistryPostProcessor registryProcessor =
							(BeanDefinitionRegistryPostProcessor) postProcessor;
					registryProcessor.postProcessBeanDefinitionRegistry(registry);
					registryProcessors.add(registryProcessor);
				}
				else {
    
    
					regularPostProcessors.add(postProcessor);
				}
			}

			// Do not initialize FactoryBeans here: We need to leave all regular beans
			// uninitialized to let the bean factory post-processors apply to them!
			// Separate between BeanDefinitionRegistryPostProcessors that implement
			// PriorityOrdered, Ordered, and the rest.
			List<BeanDefinitionRegistryPostProcessor> currentRegistryProcessors = new ArrayList<>();

			// First, invoke the BeanDefinitionRegistryPostProcessors that implement PriorityOrdered.
            //obtain the processor registered under the name "org.springframework.context.annotation.internalConfigurationAnnotationProcessor"; note that this name is not found in Spring Boot's own code: it is registered when the AnnotatedBeanDefinitionReader is created during context initialization. internalConfigurationAnnotationProcessor is only the bean name; the actual class in the container is ConfigurationClassPostProcessor
			String[] postProcessorNames =
					beanFactory.getBeanNamesForType(BeanDefinitionRegistryPostProcessor.class, true, false);
            //first, run the BeanDefinitionRegistryPostProcessors that implement PriorityOrdered
            //PriorityOrdered means they are executed with priority
			for (String ppName : postProcessorNames) {
    
    
				if (beanFactory.isTypeMatch(ppName, PriorityOrdered.class)) {
    
    
                    //fetch the corresponding bean
					currentRegistryProcessors.add(beanFactory.getBean(ppName, BeanDefinitionRegistryPostProcessor.class));
                    //record the BeanDefinitionRegistryPostProcessors that have already been executed
					processedBeans.add(ppName);
				}
			}
			sortPostProcessors(currentRegistryProcessors, beanFactory);
			registryProcessors.addAll(currentRegistryProcessors);
            //start executing the configuration (assembly) logic
			invokeBeanDefinitionRegistryPostProcessors(currentRegistryProcessors, registry);
			currentRegistryProcessors.clear();

			// Next, invoke the BeanDefinitionRegistryPostProcessors that implement Ordered.
            //next, run the BeanDefinitionRegistryPostProcessors that implement Ordered
            //Ordered means they are executed in order
			postProcessorNames = beanFactory.getBeanNamesForType(BeanDefinitionRegistryPostProcessor.class, true, false);
			for (String ppName : postProcessorNames) {
    
    
				if (!processedBeans.contains(ppName) && beanFactory.isTypeMatch(ppName, Ordered.class)) {
    
    
					currentRegistryProcessors.add(beanFactory.getBean(ppName, BeanDefinitionRegistryPostProcessor.class));
					processedBeans.add(ppName);
				}
			}
			sortPostProcessors(currentRegistryProcessors, beanFactory);
			registryProcessors.addAll(currentRegistryProcessors);
			invokeBeanDefinitionRegistryPostProcessors(currentRegistryProcessors, registry);
			currentRegistryProcessors.clear();

			// Finally, invoke all other BeanDefinitionRegistryPostProcessors until no further ones appear.
            //finally, loop over the remaining BeanDefinitionRegistryPostProcessors that implement neither PriorityOrdered nor Ordered
			boolean reiterate = true;
			while (reiterate) {
    
    
				reiterate = false;
				postProcessorNames = beanFactory.getBeanNamesForType(BeanDefinitionRegistryPostProcessor.class, true, false);
				for (String ppName : postProcessorNames) {
    
    
					if (!processedBeans.contains(ppName)) {
    
    
						currentRegistryProcessors.add(beanFactory.getBean(ppName, BeanDefinitionRegistryPostProcessor.class));
						processedBeans.add(ppName);
						reiterate = true;
					}
				}
				sortPostProcessors(currentRegistryProcessors, beanFactory);
				registryProcessors.addAll(currentRegistryProcessors);
				invokeBeanDefinitionRegistryPostProcessors(currentRegistryProcessors, registry);
				currentRegistryProcessors.clear();
			}

			// Now, invoke the postProcessBeanFactory callback of all processors handled so far.	
            //invoke the parent callback: the registry post-processors run first
			invokeBeanFactoryPostProcessors(registryProcessors, beanFactory);
            //then the regular post-processors
			invokeBeanFactoryPostProcessors(regularPostProcessors, beanFactory);
		}

		else {
    
    
			// Invoke factory processors registered with the context instance.
			invokeBeanFactoryPostProcessors(beanFactoryPostProcessors, beanFactory);
		}

		// Do not initialize FactoryBeans here: We need to leave all regular beans
		// uninitialized to let the bean factory post-processors apply to them!
		String[] postProcessorNames =
				beanFactory.getBeanNamesForType(BeanFactoryPostProcessor.class, true, false);

		// Separate between BeanFactoryPostProcessors that implement PriorityOrdered,
		// Ordered, and the rest.
		List<BeanFactoryPostProcessor> priorityOrderedPostProcessors = new ArrayList<>();
		List<String> orderedPostProcessorNames = new ArrayList<>();
		List<String> nonOrderedPostProcessorNames = new ArrayList<>();
		for (String ppName : postProcessorNames) {
    
    
			if (processedBeans.contains(ppName)) {
    
    
				// skip - already processed in first phase above
			}
			else if (beanFactory.isTypeMatch(ppName, PriorityOrdered.class)) {
    
    
				priorityOrderedPostProcessors.add(beanFactory.getBean(ppName, BeanFactoryPostProcessor.class));
			}
			else if (beanFactory.isTypeMatch(ppName, Ordered.class)) {
    
    
				orderedPostProcessorNames.add(ppName);
			}
			else {
    
    
				nonOrderedPostProcessorNames.add(ppName);
			}
		}

		// First, invoke the BeanFactoryPostProcessors that implement PriorityOrdered.
		sortPostProcessors(priorityOrderedPostProcessors, beanFactory);
		invokeBeanFactoryPostProcessors(priorityOrderedPostProcessors, beanFactory);

		// Next, invoke the BeanFactoryPostProcessors that implement Ordered.
		List<BeanFactoryPostProcessor> orderedPostProcessors = new ArrayList<>(orderedPostProcessorNames.size());
		for (String postProcessorName : orderedPostProcessorNames) {
    
    
			orderedPostProcessors.add(beanFactory.getBean(postProcessorName, BeanFactoryPostProcessor.class));
		}
		sortPostProcessors(orderedPostProcessors, beanFactory);
		invokeBeanFactoryPostProcessors(orderedPostProcessors, beanFactory);

		// Finally, invoke all other BeanFactoryPostProcessors.
		List<BeanFactoryPostProcessor> nonOrderedPostProcessors = new ArrayList<>(nonOrderedPostProcessorNames.size());
		for (String postProcessorName : nonOrderedPostProcessorNames) {
    
    
			nonOrderedPostProcessors.add(beanFactory.getBean(postProcessorName, BeanFactoryPostProcessor.class));
		}
		invokeBeanFactoryPostProcessors(nonOrderedPostProcessors, beanFactory);

		// Clear cached merged bean definitions since the post-processors might have
		// modified the original metadata, e.g. replacing placeholders in values...
		beanFactory.clearMetadataCache();
	}

9. Now the configuration logic specified by the startup class (not yet the default auto-configuration) starts to execute. Stepping inward with the debugger, you will eventually land in the ConfigurationClassParser class, which is the parser for all configuration classes; all of the parsing logic is in parser.parse(candidates).

public void parse(Set<BeanDefinitionHolder> configCandidates) {
    
    
		for (BeanDefinitionHolder holder : configCandidates) {
    
    
			BeanDefinition bd = holder.getBeanDefinition();
			try {
    
    
                //is it an annotated bean definition?
				if (bd instanceof AnnotatedBeanDefinition) {
    
    
					parse(((AnnotatedBeanDefinition) bd).getMetadata(), holder.getBeanName());
				}
				else if (bd instanceof AbstractBeanDefinition && ((AbstractBeanDefinition) bd).hasBeanClass()) {
    
    
					parse(((AbstractBeanDefinition) bd).getBeanClass(), holder.getBeanName());
				}
				else {
    
    
					parse(bd.getBeanClassName(), holder.getBeanName());
				}
			}
			catch (BeanDefinitionStoreException ex) {
    
    
				throw ex;
			}
			catch (Throwable ex) {
    
    
				throw new BeanDefinitionStoreException(
						"Failed to parse configuration class [" + bd.getBeanClassName() + "]", ex);
			}
		}
    	//process the deferred import selectors (configuration classes)
		this.deferredImportSelectorHandler.process();
	}
-------------------
    	protected final void parse(AnnotationMetadata metadata, String beanName) throws IOException {
    
    
		processConfigurationClass(new ConfigurationClass(metadata, beanName));
	}
-------------------
    protected void processConfigurationClass(ConfigurationClass configClass) throws IOException {
    
    
		if (this.conditionEvaluator.shouldSkip(configClass.getMetadata(), ConfigurationPhase.PARSE_CONFIGURATION)) {
    
    
			return;
		}

		ConfigurationClass existingClass = this.configurationClasses.get(configClass);
		if (existingClass != null) {
    
    
			if (configClass.isImported()) {
    
    
				if (existingClass.isImported()) {
    
    
					existingClass.mergeImportedBy(configClass);
				}
				// Otherwise ignore new imported config class; existing non-imported class overrides it.
				return;
			}
			else {
    
    
				// Explicit bean definition found, probably replacing an import.
				// Let's remove the old one and go with the new one.
				this.configurationClasses.remove(configClass);
				this.knownSuperclasses.values().removeIf(configClass::equals);
			}
		}

		// Recursively process the configuration class and its superclass hierarchy.
		SourceClass sourceClass = asSourceClass(configClass);
		do {
    
    
            //process the class in a loop; if it has a superclass, keep processing up the hierarchy until done
			sourceClass = doProcessConfigurationClass(configClass, sourceClass);
		}
		while (sourceClass != null);

		this.configurationClasses.put(configClass, configClass);
	}

10. Continue to follow up the doProcessConfigurationClass method, which is the core logic that supports annotation configuration

/**
	 * Apply processing and build a complete {@link ConfigurationClass} by reading the
	 * annotations, members and methods from the source class. This method can be called
	 * multiple times as relevant sources are discovered.
	 * @param configClass the configuration class being build
	 * @param sourceClass a source class
	 * @return the superclass, or {@code null} if none found or previously processed
	 */
	@Nullable
	protected final SourceClass doProcessConfigurationClass(ConfigurationClass configClass, SourceClass sourceClass)
			throws IOException {
    
    

        //handle nested member classes; the startup class passed in has none, so this is skipped
		if (configClass.getMetadata().isAnnotated(Component.class.getName())) {
    
    
			// Recursively process any member (nested) classes first
			processMemberClasses(configClass, sourceClass);
		}

		// Process any @PropertySource annotations
        //parse property-source configuration
		for (AnnotationAttributes propertySource : AnnotationConfigUtils.attributesForRepeatable(
				sourceClass.getMetadata(), PropertySources.class,
				org.springframework.context.annotation.PropertySource.class)) {
    
    
			if (this.environment instanceof ConfigurableEnvironment) {
    
    
				processPropertySource(propertySource);
			}
			else {
    
    
				logger.info("Ignoring @PropertySource annotation on [" + sourceClass.getMetadata().getClassName() +
						"]. Reason: Environment must implement ConfigurableEnvironment");
			}
		}

		// Process any @ComponentScan annotations
        // scan the project's beans according to the @ComponentScan annotation on the startup class
		Set<AnnotationAttributes> componentScans = AnnotationConfigUtils.attributesForRepeatable(
				sourceClass.getMetadata(), ComponentScans.class, ComponentScan.class);
		if (!componentScans.isEmpty() &&
				!this.conditionEvaluator.shouldSkip(sourceClass.getMetadata(), ConfigurationPhase.REGISTER_BEAN)) {
    
    
            
			for (AnnotationAttributes componentScan : componentScans) {
    
    
				// The config class is annotated with @ComponentScan -> perform the scan immediately
                //iterate over the scanned beans; annotation-defined beans are parsed further
				Set<BeanDefinitionHolder> scannedBeanDefinitions =
						this.componentScanParser.parse(componentScan, sourceClass.getMetadata().getClassName());
				// Check the set of scanned definitions for any further config classes and parse recursively if needed
				for (BeanDefinitionHolder holder : scannedBeanDefinitions) {
    
    
					BeanDefinition bdCand = holder.getBeanDefinition().getOriginatingBeanDefinition();
					if (bdCand == null) {
    
    
						bdCand = holder.getBeanDefinition();
					}
					if (ConfigurationClassUtils.checkConfigurationClassCandidate(bdCand, this.metadataReaderFactory)) {
    
    
                        //recursive parsing: any bean that is itself a configuration class is parsed further
						parse(bdCand.getBeanClassName(), holder.getBeanName());
					}
				}
			}
		}

		// Process any @Import annotations
        //recursively resolve imported configuration classes; in many cases an imported configuration class carries @Import annotations itself
		processImports(configClass, sourceClass, getImports(sourceClass), true);

		// Process any @ImportResource annotations
        //parse @ImportResource configuration
		AnnotationAttributes importResource =
				AnnotationConfigUtils.attributesFor(sourceClass.getMetadata(), ImportResource.class);
		if (importResource != null) {
    
    
			String[] resources = importResource.getStringArray("locations");
			Class<? extends BeanDefinitionReader> readerClass = importResource.getClass("reader");
			for (String resource : resources) {
    
    
				String resolvedResource = this.environment.resolveRequiredPlaceholders(resource);
				configClass.addImportedResource(resolvedResource, readerClass);
			}
		}

		// Process individual @Bean methods
        //process @Bean-annotated methods
		Set<MethodMetadata> beanMethods = retrieveBeanMethodMetadata(sourceClass);
		for (MethodMetadata methodMetadata : beanMethods) {
    
    
			configClass.addBeanMethod(new BeanMethod(methodMetadata, configClass));
		}

		// Process default methods on interfaces
        // process default methods on interfaces
		processInterfaces(configClass, sourceClass);

		// Process superclass, if any
        //if the class has a superclass, return it; the caller sees a non-null result and recurses
		if (sourceClass.getMetadata().hasSuperClass()) {
    
    
			String superclass = sourceClass.getMetadata().getSuperClassName();
			if (superclass != null && !superclass.startsWith("java") &&
					!this.knownSuperclasses.containsKey(superclass)) {
    
    
				this.knownSuperclasses.put(superclass, configClass);
				// Superclass found, return its annotation metadata and recurse
				return sourceClass.getSuperClass();
			}
		}

		// No superclass -> processing is complete
		return null;
	}

11. View the logic of obtaining configuration classes

processImports(configClass, sourceClass, getImports(sourceClass), true);

	/**
	 * Returns {@code @Import} class, considering all meta-annotations.
	 */
	private Set<SourceClass> getImports(SourceClass sourceClass) throws IOException {
    
    
		Set<SourceClass> imports = new LinkedHashSet<>();
		Set<SourceClass> visited = new LinkedHashSet<>();
		collectImports(sourceClass, imports, visited);
		return imports;
	}
------------------
    	/**
	 * Recursively collect all declared {@code @Import} values. Unlike most
	 * meta-annotations it is valid to have several {@code @Import}s declared with
	 * different values; the usual process of returning values from the first
	 * meta-annotation on a class is not sufficient.
	 * <p>For example, it is common for a {@code @Configuration} class to declare direct
	 * {@code @Import}s in addition to meta-imports originating from an {@code @Enable}
	 * annotation.
	 * (all of these beans are loaded in by way of imports)
	 */
	private void collectImports(SourceClass sourceClass, Set<SourceClass> imports, Set<SourceClass> visited)
			throws IOException {
    
    

		if (visited.add(sourceClass)) {
    
    
			for (SourceClass annotation : sourceClass.getAnnotations()) {
    
    
				String annName = annotation.getMetadata().getClassName();
				if (!annName.equals(Import.class.getName())) {
    
    
					collectImports(annotation, imports, visited);
				}
			}
			imports.addAll(sourceClass.getAnnotationAttributes(Import.class.getName(), "value"));
		}
	}

12. Go back to the last line in the parse method in ConfigurationClassParser, and continue to follow up the method:

this.deferredImportSelectorHandler.process()
-------------
public void process() {
    
    
			List<DeferredImportSelectorHolder> deferredImports = this.deferredImportSelectors;
			this.deferredImportSelectors = null;
			try {
    
    
				if (deferredImports != null) {
    
    
					DeferredImportSelectorGroupingHandler handler = new DeferredImportSelectorGroupingHandler();
					deferredImports.sort(DEFERRED_IMPORT_COMPARATOR);
					deferredImports.forEach(handler::register);
					handler.processGroupImports();
				}
			}
			finally {
    
    
				this.deferredImportSelectors = new ArrayList<>();
			}
		}
---------------
  public void processGroupImports() {
    
    
			for (DeferredImportSelectorGrouping grouping : this.groupings.values()) {
    
    
				grouping.getImports().forEach(entry -> {
    
    
					ConfigurationClass configurationClass = this.configurationClasses.get(
							entry.getMetadata());
					try {
    
    
						processImports(configurationClass, asSourceClass(configurationClass),
								asSourceClasses(entry.getImportClassName()), false);
					}
					catch (BeanDefinitionStoreException ex) {
    
    
						throw ex;
					}
					catch (Throwable ex) {
    
    
						throw new BeanDefinitionStoreException(
								"Failed to process import candidates for configuration class [" +
										configurationClass.getMetadata().getClassName() + "]", ex);
					}
				});
			}
		}
------------
    /**
		 * Return the imports defined by the group.
		 * @return each import with its associated configuration class
		 */
		public Iterable<Group.Entry> getImports() {
    
    
			for (DeferredImportSelectorHolder deferredImport : this.deferredImports) {
    
    
				this.group.process(deferredImport.getConfigurationClass().getMetadata(),
						deferredImport.getImportSelector());
			}
			return this.group.selectImports();
		}
	}
------------
    public DeferredImportSelector getImportSelector() {
    
    
			return this.importSelector;
		}
------------
    @Override
		public void process(AnnotationMetadata annotationMetadata, DeferredImportSelector deferredImportSelector) {
    
    
			Assert.state(deferredImportSelector instanceof AutoConfigurationImportSelector,
					() -> String.format("Only %s implementations are supported, got %s",
							AutoConfigurationImportSelector.class.getSimpleName(),
							deferredImportSelector.getClass().getName()));
			AutoConfigurationEntry autoConfigurationEntry = ((AutoConfigurationImportSelector) deferredImportSelector)
					.getAutoConfigurationEntry(getAutoConfigurationMetadata(), annotationMetadata);
			this.autoConfigurationEntries.add(autoConfigurationEntry);
			for (String importClassName : autoConfigurationEntry.getConfigurations()) {
    
    
				this.entries.putIfAbsent(importClassName, annotationMetadata);
			}
		}

How do you understand the starter in Spring Boot?

​ When developing with spring+springmvc, introducing a framework such as MyBatis means defining all the required bean objects in XML. That process is obviously tedious, and every additional component needs similarly complicated configuration, which is why Spring Boot introduced starters.

​ A starter is a jar package: you write an @Configuration class that defines these beans, and register that configuration class in the starter's META-INF/spring.factories file; the Spring Boot application then loads the configuration class by convention when it starts.

​ Developers only need to add the corresponding starter as a dependency of the application and configure the related properties to start writing code, without configuring the bean objects one by one.

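A hedged sketch of what such a starter's auto-configuration might contain (the class names, the service and the spring.factories entry are invented for illustration):

```java
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HelloAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean            // back off if the application defines its own bean
    public HelloService helloService() {
        return new HelloService();
    }
}

class HelloService {
    public String hello(String name) {
        return "hello, " + name;
    }
}

// In the starter jar, META-INF/spring.factories would then point to this class, roughly:
// org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
//   com.example.hello.HelloAutoConfiguration
```
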
What is an embedded server and why use one?

​ The Spring Boot framework ships with an embedded Tomcat. In the traditional development process, after writing the code you always had to deploy the project to a separate web server before it could run, which is clearly troublesome. With Spring Boot, the project can be started directly as a plain Java application, with no extra environment or external Tomcat server, because the framework's built-in tomcat jar starts the container from the main method. This enables one-click development and deployment without any additional operations.

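Concretely, starting the embedded server is nothing more than running an ordinary main method, roughly like this sketch (the class name is illustrative; the web starter dependency brings in the embedded Tomcat):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);   // boots the embedded Tomcat, no external deployment
    }
}
```
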
What are the advantages and disadvantages of mybatis?

1. Advantages of MyBatis:

(1) Easy to learn and easy to use (compared with Hibernate); programming is based on SQL;
(2) Compared with raw JDBC it reduces the amount of code by more than 50%, eliminating large amounts of redundant JDBC code, with no need to manage connections manually;
(3) Very good compatibility with all kinds of databases: because MyBatis connects through JDBC, every database that JDBC supports is supported by MyBatis, so as long as the database has a Java driver jar it is compatible, and developers do not need to worry about database differences;
(4) Provides many third-party plug-ins (pagination plug-in / reverse engineering);
(5) Integrates well with Spring;
(6) MyBatis is quite flexible and does not impose anything on the existing design of the application or the database. SQL is written in XML, completely separated from the program code, which decouples SQL from the code, makes unified management and optimization convenient, and allows reuse.
(7) Provides XML tags that support writing dynamic SQL statements.
(8) Provides mapping tags that support ORM field mapping between objects and the database.
(9) Provides object-relational mapping tags that support the creation and maintenance of object relationships.
2. Disadvantages of MyBatis framework:

(1) The workload of writing SQL statements is relatively large, especially when there are many fields and many associated tables, which requires developers to have certain skills in writing SQL statements.
(2) SQL statements depend on the database, resulting in poor database portability, and the database cannot be replaced at will.

What is the difference between mybatis and hibernate?

Advantages of Hibernate:

1. Hibernate is fully automatic. Hibernate can fully realize the operation of the database through the object-relational model, and has a complete mapping structure between JavaBean objects and the database to automatically generate sql.

2. Powerful functions, good database independence, strong O/R mapping ability, less code to write, and fast development speed.

3. There is a better secondary cache mechanism, and third-party caches can be used.

4. The database portability is good.

5. Hibernate has a complete log system. The hibernate log system is very sound and involves a wide range of issues, including sql records, relationship exceptions, optimization warnings, cache prompts, dirty data warnings, etc.

Disadvantages of Hibernate:

1. The threshold for learning is high, and the threshold for proficiency is higher. How programmers design O/R mapping, how to strike a balance between performance and object model, and how to make good use of Hibernate requires strong experience and ability.

2. Much of Hibernate's sql is generated automatically, so the sql cannot be maintained directly; and although HQL exists, it is still not as powerful as sql. Hibernate does support native sql queries, but that departs from the ORM development model and requires a change of mindset, which is somewhat inconvenient. In short, Hibernate is not as flexible as mybatis when it comes to writing sql.

Advantages of Mybatis:

1. It is easy to use and master. It provides the automatic object binding function of database query, and continues the good experience in using SQL. It is quite perfect for projects that do not have such high object model requirements.

2. SQL is written in xml, which is convenient for unified management and optimization, and decoupling of sql and program code.

3. Provide mapping tags, support the orm field relationship mapping between objects and databases

4. Provide object-relational mapping tags to support object-relationship creation and maintenance

5. Provide xml tags and support writing dynamic sql.

6. The speed is faster than that of Hibernate

Disadvantages of Mybatis:

1. When there are many associated tables and fields, the SQL workload is heavy.

2. SQL depends on the database, resulting in poor database portability.

3. Since the tag id in xml must be unique, the method in DAO does not support method overloading.

4. The object-relational mapping label and the field mapping label are only descriptions of the mapping relationship, and the specific implementation still depends on sql.

5. The DAO layer is too simple, and the workload of object assembly is relatively large.

6. Cascade update and cascade delete are not supported.

7. In addition to the basic recording function of the Mybatis log, other functions are much weaker.

8. When writing dynamic sql, it is inconvenient to debug, especially when the logic is complex.

9. The provided xml tag function for writing dynamic sql is simple, but writing dynamic sql is still limited and the readability is low.

What is the difference between #{} and ${} in mybatis?

a. #{} is handled as a precompiled parameter: MyBatis replaces #{} with a ? placeholder and binds the value through a PreparedStatement.
b. ${} is plain string substitution: when MyBatis processes ${}, it splices the value of the variable directly into the SQL text.
c. ${} should therefore only be used for trusted values such as dynamic table or column names.
d. Using #{} can effectively prevent SQL injection and improves system security.
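A small annotation-mapper sketch of the difference (UserMapper and the user table are assumed example names):

import java.util.List;
import java.util.Map;

import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;

public interface UserMapper {

    // #{} -> compiled to "select * from user where name = ?", the value is bound by PreparedStatement
    @Select("select * from user where name = #{name}")
    Map<String, Object> findByName(@Param("name") String name);

    // ${} -> the value is spliced into the SQL text as-is; only safe for trusted values
    // such as a column name used in "order by"
    @Select("select * from user order by ${column}")
    List<Map<String, Object>> findAllOrderBy(@Param("column") String column);
}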

Briefly describe the operating principle and development process of the mybatis plug-in?

Mybatis only supports plug-ins for the four interfaces ParameterHandler, ResultSetHandler, StatementHandler and Executor. Mybatis uses the jdk dynamic proxy to generate proxy objects for the interfaces that need to be intercepted, and so intercepts the interface methods: whenever a method of one of these four interface objects is executed, it enters the interception logic, specifically the invoke method of the InvocationHandler, which intercepts the methods you have specified.

To write a plug-in: implement Mybatis's Interceptor interface and override the intercept method, then annotate the plug-in to specify which methods of which interface to intercept, and finally register the plug-in in the configuration file.

@Intercepts({
    @Signature(type = StatementHandler.class,method = "parameterize",args = Statement.class)})
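A fuller sketch of such a plug-in under the same signature; the logging is only illustrative, and the class would still need to be registered with a <plugin> element in mybatis-config.xml:

import java.sql.Statement;
import java.util.Properties;

import org.apache.ibatis.executor.statement.StatementHandler;
import org.apache.ibatis.plugin.Interceptor;
import org.apache.ibatis.plugin.Intercepts;
import org.apache.ibatis.plugin.Invocation;
import org.apache.ibatis.plugin.Plugin;
import org.apache.ibatis.plugin.Signature;

// Intercepts StatementHandler.parameterize and simply logs before delegating to the original call
@Intercepts({
        @Signature(type = StatementHandler.class, method = "parameterize", args = Statement.class)})
public class LoggingInterceptor implements Interceptor {

    @Override
    public Object intercept(Invocation invocation) throws Throwable {
        System.out.println("before parameterize on " + invocation.getTarget().getClass().getName());
        return invocation.proceed();        // continue with the intercepted method
    }

    @Override
    public Object plugin(Object target) {
        return Plugin.wrap(target, this);   // wrap the target in a JDK dynamic proxy
    }

    @Override
    public void setProperties(Properties properties) {
        // properties configured on the <plugin> element, if any
    }
}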

Fundamentals of Indexing

1. Why is there an index?
In a typical application system the read-to-write ratio is roughly 10:1, and inserts and ordinary updates rarely cause performance problems. In production, the problems we run into most often, and the ones most likely to go wrong, are complex query operations, so optimizing query statements is clearly the top priority. And when we talk about speeding up queries, we have to mention indexes.
2. What is an index?
An index is also called a "key" in MySQL, and it is a data structure used by the storage engine to quickly find records. Indexes are critical to good performance
, especially when the amount of data in the table is getting larger and larger, and the impact of indexes on performance becomes more and more important.
Index optimization should be the most effective means of query performance optimization. Indexes can easily improve query performance by orders of magnitude.
The index is equivalent to the phonetic table of the dictionary. If you want to look up a word, if you don't use the phonetic table, you need to look it up page by page from hundreds of pages.

3. The principle of indexing

The purpose of indexing is to improve query efficiency, which is the same as the table of contents we use to consult books: first locate the chapter, then locate a subsection under the chapter, and then find the number of pages. Similar examples include: looking up dictionaries, looking up train numbers, airplane flights, etc.

The essence is: continuously narrowing the range of data to be scanned in order to filter out the final result, and at the same time turning random events into sequential events; in other words, with an indexing mechanism we can always locate data with the same kind of lookup.

The same is true for a database, but it is obviously much more complicated, because besides equality queries there are range queries (>, <, between, in), fuzzy queries (like), union queries (or) and so on. How should the database handle all of these? Recall the dictionary example: can we divide the data into segments and query segment by segment? In the simplest case, with 1000 rows, rows 1 to 100 form the first segment, 101 to 200 the second, 201 to 300 the third, and so on; to find row 250 we only need to look at the third segment, eliminating 90% of the irrelevant data at once. But with 10 million records, how many segments should there be? Following the search-tree model, the average complexity is lgN, which gives good query performance. However, this ignores a key point: the complexity model assumes that every comparison has the same cost, while a real database is more complicated. On one hand the data is stored on disk; on the other hand, to improve performance, part of the data has to be read into memory for each computation, and since accessing disk costs roughly 100,000 times more than accessing memory, a plain search tree cannot satisfy these complex application scenarios.

4. Index data structure

MySQL mainly uses two structures: B+Tree indexes and Hash indexes.
The InnoDB storage engine uses B+Tree indexes by default; the Memory storage engine uses Hash indexes by default.
In MySQL, only the Memory storage engine (Memory tables live entirely in memory, disappear when power is lost, and are suitable for temporary tables) explicitly supports Hash indexes; Hash is the default index type of Memory tables, although Memory tables can also use B+Tree indexes. A Hash index organizes the data by hash value, so looking up a single record is very fast. But because of the hash structure, each key maps to exactly one value and the keys are scattered by hashing, so range queries, sorting and similar operations are not supported.
B+Tree is the index data structure MySQL uses most frequently and is the index type of the InnoDB and MyISAM storage engines. Compared with a Hash index, a B+Tree is slower at finding a single record, but it is better suited to sorting and similar operations, so it is more widely used; after all, a database is never used only for single-record lookups.
Comparison:
Hash index: fast for single-record lookups, slow for range queries
B+Tree index: as the number of levels grows, the amount of data it can hold grows exponentially (this is what we use, because InnoDB supports it by default)

What is the difference between clustered and non-clustered indexes in mysql?

​ The index type in mysql is tied to the storage engine: the innodb storage engine keeps the data file and the index file together in the ibd file, while myisam keeps the data in the myd file and the index in the myi file. Distinguishing a clustered index from a non-clustered index is actually very simple: just check whether the data is stored together with the index.

​ When the innodb storage engine inserts data, the data must be stored together with an index: the primary key is used if there is one, otherwise a unique key, otherwise a generated 6-byte rowid. The index bound together with the data is the clustered index; and to avoid storing the data redundantly, the leaf nodes of the other indexes store the key value of the clustered index. So InnoDB has both clustered and non-clustered indexes, while myisam has only non-clustered indexes.

What are the mysql index structures, and what are their advantages and disadvantages?

The data structure of the index is related to the implementation of the specific storage engine. The indexes used more in MySQL include hash index and B+ tree index. The index of innodb is realized as B+ tree, and the memory storage engine is hash index.

The B+ tree is a balanced multi-way tree. The height difference between the root node and any leaf node does not exceed 1, and nodes at the same level are linked to each other by pointers. A regular lookup on a B+ tree costs essentially the same from the root down to any leaf, with no large fluctuations; in addition, during an index-based sequential scan, the bidirectional pointers allow fast movement left and right, which is very efficient. Because of this, B+ tree indexes are widely used in databases, file systems and other scenarios.

The hash index uses a certain hash algorithm to convert the key value into a new hash value. When searching, it does not need to search step by step from the root node to the leaf node like a B+ tree. It only needs one hash algorithm to locate immediately To the corresponding position, the speed is very fast.

If it is an equality query, then the hash index obviously has an absolute advantage, because it only takes one pass through the algorithm to find the corresponding key, provided the key is unique. If the key is not unique, the position of the key must be found first and then the linked list scanned until the matching data is found.

If it is a range query, the hash index is useless, because keys that were originally ordered may become discontinuous after hashing, and there is no way to use the index to complete a range query.

The hash index likewise cannot be used for sorting, or for fuzzy queries such as like.

The hash index also does not support the leftmost-prefix matching rule of multi-column composite indexes.

The lookup efficiency of a B+ tree index is fairly stable for all keys, unlike a B-tree, which fluctuates more. And when there are many duplicate key values, the efficiency of a hash index is extremely low because of the hash collision problem.

What are the design principles of the index?

​ When designing indexes, the space taken up by the indexed columns should be as small as possible. That is the general direction; some more specific points to note:

​ 1. Columns that appear in where clauses, or columns specified in join clauses, are good candidates for indexing

​ 2. For tables with a small cardinality, indexes are not very effective and there is no need to create them

​ 3. When choosing index columns, shorter is better; a prefix of a column can be indexed instead of the full value

​ 4. Do not create an index on every column of a table; more indexes are not always better

​ 5. Columns defined as foreign keys should be indexed

​ 6. Frequently updated columns should not be indexed

​ 7. Do not create too many indexes; composite indexes can be used, but the number of columns in a composite index should not be too large

​ 8. Do not index large text or large object columns

What are the types of locks in mysql?

Classified by lock attribute: shared locks and exclusive locks.

Classified by lock granularity: row-level locks (innodb), table-level locks (innodb, myisam), page-level locks (BDB engine), record locks, gap locks, next-key locks.

Classified by lock state: intention shared locks and intention exclusive locks.

Shared lock (share lock): also called a read lock, S lock for short. Once a transaction puts a read lock on a piece of data, other transactions can only put read locks on it, not write locks, until all read locks are released; only then can another transaction acquire a write lock. Shared locks mainly support concurrent reads: while the data is being read it cannot be modified, which avoids repeatable-read problems.

Exclusive lock (exclusive lock): also called a write lock, X lock for short. When a transaction puts a write lock on a piece of data, no other request can put any lock on it until the lock is released. The purpose of an exclusive lock is that while data is being modified, nobody else may modify it, or even read it, which avoids dirty data and dirty reads.

Table lock (table lock): the whole table is locked; when the next transaction accesses the table, it must wait for the previous transaction to release the lock before it can proceed. Characteristics: large granularity, simple to acquire, prone to conflicts.

Row lock: when locking, one or more rows of the table are locked; when other transactions access the same table, only the locked records are blocked, and the other records can be accessed normally. Characteristics: small granularity, more costly to acquire than a table lock, less likely to conflict, and it supports higher concurrency than a table lock.

Record lock: a record lock is also a kind of row lock, but its scope is a single record in the table: after the transaction acquires the lock, only that one record is locked. Once the record lock is added, it prevents the repeatable-read problem of the data being modified while it is being queried, and also prevents the dirty read of other transactions reading the modification before it is committed.

Page lock: a page-level lock is a lock in MySQL whose granularity lies between a row-level lock and a table-level lock. Table-level locks are fast to acquire but conflict a lot, while row-level locks conflict little but are slower to acquire, so a compromise page level is used, locking a group of adjacent records at a time. Characteristics: cost and locking time lie between table locks and row locks; deadlocks can occur; granularity lies between table locks and row locks; concurrency is average.

Gap lock: a kind of row lock. After the transaction acquires it, the gap lock locks a range of table records; when there are gaps between adjacent ids in the table, a range is formed, following a left-open, right-closed convention. It applies to range queries, and to queries whose condition hits the index but does not hit any record; gap locks only appear at the REPEATABLE_READ (repeatable read) transaction level.

Next-Key lock: also a kind of row lock, and the default algorithm for INNODB row locks. In short it is the combination of a record lock and a gap lock: it locks the records the query returns, locks all the gaps within the queried range, and also locks the next adjacent interval.
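A small JDBC sketch of shared versus exclusive row locks on the emp table used elsewhere in this document (connection settings are assumptions); the locks are held until the transaction commits or rolls back:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RowLockDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "root", "root");
             Statement st = conn.createStatement()) {
            conn.setAutoCommit(false);                        // locks live for the whole transaction
            // shared (S) lock: other transactions may still read-lock this row, but not write-lock it
            ResultSet shared = st.executeQuery(
                    "select * from emp where empno = 7369 lock in share mode");
            // exclusive (X) lock: other transactions can neither read-lock nor write-lock this row
            ResultSet exclusive = st.executeQuery(
                    "select * from emp where empno = 7369 for update");
            conn.commit();                                    // releases both row locks
        }
    }
}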

How to see mysql execution plan?

​ In enterprise application scenarios, in order to know how to optimize the execution of SQL statements, it is necessary to view the specific execution process of SQL statements to speed up the execution efficiency of SQL statements.

​ You can use the explain+SQL statement to simulate the optimizer to execute the SQL query statement, so as to know how mysql handles the sql statement.

​ Official website address: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html

1. Information contained in the execution plan

Column Meaning
id The SELECT identifier
select_type The SELECT type
table The table for the output row
partitions The matching partitions
type The join type
possible_keys The possible indexes to choose
key The index actually chosen
key_len The length of the chosen key
ref The columns compared to the index
rows Estimate of rows to be examined
filtered Percentage of rows filtered by table condition
extra Additional information

id

The serial number of the select query, including a set of numbers, indicating the order in which the select clause or the operation table is executed in the query

The id number is divided into three situations:

1. If the IDs are the same, the order of execution is from top to bottom

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal;

2. If the id is different, if it is a subquery, the serial number of the id will be incremented. The larger the id value, the higher the priority, and the earlier it will be executed

explain select * from emp e where e.deptno in (select d.deptno from dept d where d.dname = 'SALES');

​ 3. The same and different ids exist at the same time: the same can be considered as a group, which is executed sequentially from top to bottom. Among all groups, the larger the id value, the higher the priority, and the earlier the execution

explain select * from emp e join dept d on e.deptno = d.deptno join salgrade sg on e.sal between sg.losal and sg.hisal where e.deptno in (select d.deptno from dept d where d.dname = 'SALES');

select_type

It is mainly used to distinguish the type of query, whether it is a normal query, a joint query or a subquery

select_type Value Meaning
SIMPLE Simple SELECT (not using UNION or subqueries)
PRIMARY Outermost SELECT
UNION Second or later SELECT statement in a UNION
DEPENDENT UNION Second or later SELECT statement in a UNION, dependent on outer query
UNION RESULT Result of a UNION.
SUBQUERY First SELECT in subquery
DEPENDENT SUBQUERY First SELECT in subquery, dependent on outer query
DERIVED Derived table
UNCACHEABLE SUBQUERY A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query
UNCACHEABLE UNION The second or later select in a UNION that belongs to an uncacheable subquery (see UNCACHEABLE SUBQUERY)
--sample: a simple query, containing no subquery or union
explain select * from emp;

--primary: if the query contains any complex subquery, the outermost query is marked as PRIMARY
explain select staname,ename supname from (select ename staname,mgr from emp) t join emp on t.mgr=emp.empno ;

--union: if the second select appears after a union, it is marked as UNION
explain select * from emp where deptno = 10 union select * from emp where sal >2000;

--dependent union: similar to union; here "dependent" means the result formed by union / union all is affected by the outer table
explain select * from emp e where e.empno  in ( select empno from emp where deptno = 10 union select empno from emp where sal >2000)

--union result: the select that fetches the result from the union table
explain select * from emp where deptno = 10 union select * from emp where sal >2000;

--subquery: a subquery is contained in the select or where list
explain select * from emp where sal > (select avg(sal) from emp) ;

--dependent subquery: the subquery is affected by the outer query
explain select * from emp e where e.deptno in (select distinct deptno from dept);

--DERIVED: a subquery appearing in the from clause, also called a derived table
explain select staname,ename supname from (select ename staname,mgr from emp) t join emp on t.mgr=emp.empno ;

--UNCACHEABLE SUBQUERY: the result of the subquery cannot be cached
 explain select * from emp where empno = (select empno from emp where deptno=@@sort_buffer_size);
 
--uncacheable union: the result of the union cannot be cached (this statement has not been verified)

table

Which table the row refers to: the table name or its alias; it may also be a temporary table or the combined result set of a union

1. Normally this column simply shows the table name or alias

2. A name of the form derivedN means the row uses the derived table produced by the query whose id is N

3. When there is a union result, the table name takes the form union n1,n2, where n1 and n2 are the ids of the selects participating in the union

type

type shows the access type, i.e. how MySQL accesses the data. The easiest one to think of is a full table scan: brute-force traversing the whole table to find the required data, which is very inefficient. There are many access types; in order of efficiency from best to worst they are:

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

In general, a query should reach at least the range level, and preferably ref

--all: full table scan; generally, if such a statement appears and the data volume is large, it needs to be optimized.
explain select * from emp;

--index: full index scan, more efficient than all. There are mainly two cases: the current query is covered by an index, i.e. the data we need can be taken from the index itself, or an index is used for sorting so the data does not have to be re-sorted
explain  select empno from emp;

--range: the index is used with a limited range, querying only within the specified range, which avoids the full index scan of index. Applicable operators: =, <>, >, >=, <, <=, IS NULL, BETWEEN, LIKE, or IN()
explain select * from emp where empno between 7000 and 7500;

--index_subquery: an index is used for the subquery, instead of scanning the whole table
explain select * from emp where emp.job in (select job from t_job);

--unique_subquery: similar to index_subquery, but a unique index is used
 explain select * from emp e where e.deptno in (select distinct deptno from dept);
 
--index_merge: several indexes are combined during the query (not reproduced here)
explain select * from rental where rental_date like '2005-05-26 07:12:2%' and inventory_id=3926 and customer_id=321\G

--ref_or_null: chosen by the query optimizer when a column needs both a join condition and a NULL check
explain select * from emp e where  e.mgr is null or e.mgr=7369;

--ref: a non-unique index is used to look up the data
 create index idx_3 on emp(deptno);
 explain select * from emp e,dept d where e.deptno =d.deptno;

--eq_ref: a unique index is used to look up the data
explain select * from emp,emp2 where emp.empno = emp2.empno;

--const: the table has at most one matching row
explain select * from emp where empno = 7369;
 
--system: the table has only one row (equivalent to a system table); a special case of const that rarely appears

possible_keys

​ Shows the indexes that might apply to this table, one or more of them. If an index exists on a column involved in the query, it is listed here, but it is not necessarily actually used by the query.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

key

​ The index actually used. If NULL, no index was used. If the query uses a covering index, that index overlaps with the columns selected by the query.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

key_len

The number of bytes used in the index. The index length used by the query can be calculated from key_len; the shorter the better, as long as no precision is lost.

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

ref

Shows which column of the index is used, or a constant where possible

explain select * from emp,dept where emp.deptno = dept.deptno and emp.deptno = 10;

rows

Based on table statistics and index usage, a rough estimate of how many rows must be read to find the required records. This parameter is important: it directly reflects how much data the sql scans; the fewer rows the better, as long as the goal is achieved

explain select * from emp;

extra

Contains additional information.

--using filesort: mysql cannot use an index for sorting and has to fall back to a sorting algorithm, which costs extra resources
explain select * from emp order by sal;

--using temporary: a temporary table is created to hold intermediate results and dropped after the query finishes
explain select ename,count(*) from emp where deptno = 10 group by ename;

--using index: the current query is covered by an index and reads data directly from the index without touching the table. If "using where" appears at the same time, the index is being used to look up index key values; otherwise the index is only being used to read data rather than to perform lookups
explain select deptno,count(*) from emp group by deptno limit 10;

--using where: a where clause is used to filter rows
explain select * from t_user where id = 1;

--using join buffer: a join buffer is used (not reproduced here)

--impossible where: the result of the where clause is always false
explain select * from emp where empno = 7469;

What are the basic characteristics of a transaction?

The four characteristics of a transaction: atomicity, consistency, isolation and durability.

  1. Atomicity
    An atomic transaction either executes completely or does not execute at all. Every task in the unit of work must execute correctly; if any task fails, the whole unit of work or transaction is aborted, and any changes already made to the data are undone. If all tasks succeed, the transaction is committed and the changes to the data become permanent.
  2. Consistency
    Consistency represents the integrity of the underlying data store. It must be guaranteed jointly by the transaction system and the application developer. The transaction system contributes by guaranteeing atomicity, isolation and durability; the application developer must ensure that the database has the appropriate constraints (primary keys, referential integrity, and so on) and that the business logic in the unit of work does not leave the data inconsistent with the real business situation it represents. For example, during a transfer, the amount deducted from one account must equal the amount deposited into the other. If an Alipay account has a balance of 100 and someone transfers you 100 but that transaction has not yet committed, the balance you read should still be 100, not 200; that is consistency.
  3. Isolation
    Isolation means that transactions must execute independently without interfering with other processes or transactions. In other words, the data accessed by a transaction or unit of work cannot be affected by other parts of the system until the transaction or unit of work completes.
  4. Durability
    Durability means that during the execution of a transaction, all changes made to the data must be saved to some kind of physical storage before the transaction ends successfully, which ensures that the modifications will not be lost in the event of any system failure.

What are the isolation levels of MySQL?

MySQL defines four isolation levels and the rules that go with them, which limit which changes are visible inside and outside a transaction. Lower isolation levels generally support higher concurrency and have lower overhead.
REPEATABLE READ (repeatable read)
The default isolation level of the MySQL database. It solves the problems of the READ UNCOMMITTED isolation level and guarantees that multiple reads of the same rows within one transaction "see the same" data even under concurrency. However, it introduces another thorny problem, "phantom reads". The InnoDB and Falcon storage engines solve phantom reads through the multi-version concurrency control mechanism.
READ COMMITTED (read committed)
The default isolation level of most database systems (but not MySQL's). It satisfies the simple definition of isolation: a transaction can only "see" changes made by committed transactions; changes a transaction makes between start and commit are invisible to others until committed. This level allows so-called "non-repeatable reads": running the same statement twice in one transaction may return different results.
READ UNCOMMITTED (read uncommitted)
At this isolation level, a transaction can "see" the results of uncommitted transactions. It can cause many problems unless the user really knows what they are doing and has a very good reason for choosing it. It is rarely used in practice, because its performance is not much better than the other levels, which have clear advantages. Reading uncommitted data is also called a "dirty read".
SERIALIZABLE (serializable)
The highest isolation level. It solves phantom reads by forcing transactions to be ordered so that they cannot conflict with each other. In short, SERIALIZABLE locks every row it reads. At this level a large number of timeouts and lock contention may occur; it is rarely used in practice, but it can be chosen when the application needs to sacrifice concurrency for data stability.

  1. Dirty read

A dirty read means that a transaction reads data from another transaction that has not yet committed.
When one transaction is modifying data, possibly several times, and before it commits another concurrent transaction reads that data, the data read is not the data that is finally persisted; that data is dirty data.

  2. Non-repeatable read

A non-repeatable read means that, for a given piece of data in the database, multiple queries within one transaction return different results, which means the data was committed and modified by another transaction while the first transaction was running.
The difference from a dirty read: a dirty read reads data from a transaction that has not finished executing, while a non-repeatable read means another transaction commits a modification to the data that the current transaction is reading during its execution.

  3. Phantom read

A phantom read is a phenomenon that occurs when transactions do not execute independently. For example, transaction T1 batch-modifies a column value from 1 to 2 for all rows in a table, and at the same time transaction T2 inserts a row with that column value equal to 1 and commits. If T1 then checks the data it just modified, it finds one row with the column value 1 that was apparently not modified, even though that row was in fact just inserted and committed by T2; that is a phantom read.
Both phantom reads and non-repeatable reads read data committed by another transaction (which distinguishes them from dirty reads). The difference is that a non-repeatable read concerns the same data item, while a phantom read concerns a batch of data as a whole (for example, the number of rows).

How to deal with slow queries in MySQL?

1. Turn on the slow query log to accurately locate which SQL statement has a problem

2. Analyze the sql statement to see whether extra data is being loaded: perhaps extra rows are queried and then discarded, or many columns that are not needed in the result are loaded; analyze and rewrite the statement accordingly.

3. Analyze the execution plan of the statement, and then obtain the use of the index, and then modify the statement or modify the index so that the statement can hit the index as much as possible

4. If the optimization of the statement is no longer possible, you can consider whether the amount of data in the table is too large, and if so, you can divide the table horizontally or vertically.
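A rough sketch of steps 1 and 3 driven from code (connection settings are assumptions; in production slow_query_log and long_query_time are usually set in my.cnf):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SlowQueryDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "root", "root");
             Statement st = conn.createStatement()) {
            st.execute("set global slow_query_log = 1");     // turn the slow query log on
            st.execute("set global long_query_time = 1");    // log statements slower than 1 second
            // analyze a suspect statement with EXPLAIN and check type/key/rows
            try (ResultSet rs = st.executeQuery("explain select * from emp where deptno = 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString("type") + " " + rs.getString("key")
                            + " " + rs.getString("rows"));
                }
            }
        }
    }
}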

ACID is guaranteed by what?

Atomicity is guaranteed by the undolog log, which records the log information that needs to be rolled back, and undoes the successfully executed sql when the transaction is rolled back

Consistency is guaranteed by the other three major features, and the program code must ensure business consistency

Isolation is guaranteed by MVCC

Persistence is guaranteed by redolog. When mysql modifies data, it will record a log data in redolog. Even if the data is not saved successfully, as long as the log is saved successfully, the data will still not be lost.

What is MVCC?

1. MVCC

​ MVCC stands for Multi-Version Concurrency Control. MVCC is a concurrency-control method; it is generally used to implement concurrent access to the database in a database management system, or to implement transactional memory in a programming language.

	The implementation of MVCC in MySQL InnoDB is mainly intended to improve concurrent database performance and to handle read-write conflicts in a better way, so that even when reads and writes conflict, reads can be non-blocking and lock-free.

2. Current read

​ Operations such as select ... lock in share mode (shared lock), select ... for update, and update, insert, delete (exclusive locks) are all current reads. Why are they called current reads? Because they read the latest version of the record, and while reading they must ensure that no other concurrent transaction modifies the current record, so they lock the record being read.

3. Snapshot read (improve the concurrent query capability of the database)

An unlocked select is a snapshot read, that is, a non-blocking read that takes no locks; the premise is that the isolation level is not serializable, since at the serializable level a snapshot read degenerates into a current read. Snapshot reads exist to improve concurrency: their implementation is based on multi-version concurrency control, i.e. MVCC. MVCC can be seen as a variant of row locking that in many cases avoids locking altogether and reduces overhead; and because it is based on multiple versions, a snapshot read may not return the latest version of the data but an earlier historical version.

4. Current read, snapshot read, MVCC relationship

MVCC multi-version concurrency control refers to maintaining multiple versions of a data, so that there is no conflict in read and write operations. Snapshot read is a non-blocking read function of MySQL to implement MVCC. The specific implementation of the MVCC module in MySQL is realized by three implicit fields, undo log, and read view.

What problem does MVCC solve?

There are three database concurrency scenarios, namely:

1. Reading: There is no problem, and no concurrency control is required

2. Read and write: There are thread safety issues, which may cause transaction isolation issues, and may encounter dirty reads, phantom reads, and non-repeatable reads

3. Write: There are thread safety issues, and there may be problems with lost updates

​ MVCC is a lock-free concurrency control used to resolve read-write conflicts: each transaction is assigned a monotonically increasing timestamp, a version is kept for each modification, each version is associated with the timestamp of the transaction that made it, and a read-only transaction sees a snapshot of the database taken before it started. MVCC therefore solves the following problems for the database:

1. When reading and writing the database concurrently, it is possible to avoid blocking the writing operation during the reading operation, and the writing operation does not need to block the reading operation, which improves the performance of concurrent reading and writing of the database

2. Solve the transaction isolation problems such as dirty reads, phantom reads, and non-repeatable reads, but cannot solve the problem of lost updates

What is the principle of MVCC implementation?

The realization principle of mvcc mainly depends on the three hidden fields in the record, undolog and read view.

​Hidden fields

In addition to our custom fields, each row of records also has DB_TRX_ID, DB_ROLL_PTR, DB_ROW_ID and other fields implicitly defined by the database

​ DB_TRX_ID

​ 6 bytes, recently modified transaction id, record the transaction id that created this record or last modified this record

​ DB_ROLL_PTR

​ 7 bytes, rollback pointer, pointing to the previous version of this record, used to cooperate with undolog, pointing to the previous old version

​ DB_ROW_ID

​ 6 bytes, hidden primary key, if the data table does not have a primary key, then innodb will automatically generate a 6-byte row_id

​ The record is shown in the figure:

[Figure: images/data case.png]

​ In the above figure, DB_ROW_ID is the unique implicit primary key generated by the database for this row record by default, DB_TRX_ID is the transaction ID currently operating on this record, and DB_ROLL_PTR is a rollback pointer, which is used to cooperate with the undo log and point to the previous old version

undo log

​ undolog is called a rollback log, which means a convenient rollback log generated during insert, delete, and update operations

​ When performing an insert operation, the generated undolog is only needed when the transaction is rolled back, and can be discarded immediately after the transaction is committed

​ When an update or delete is performed, the generated undolog is needed not only for transaction rollback but also for snapshot reads, so it cannot be deleted casually; only when no snapshot read or transaction rollback involves the log any more will the corresponding logs be cleaned up by the purge thread. (When data is updated or deleted, only the deleted_bit of the old record is set; the obsolete record is not really removed. To save disk space, innodb has a dedicated purge thread that clears records whose deleted_bit is true: if a record's deleted_bit is true and its DB_TRX_ID is visible relative to the purge thread's read view, the record can safely be cleared.)

Let's take a look at the record chain generated by the undolog

1. Assume that a transaction with transaction number 1 inserts a record into the table, then the state of the row data at this time is:

[Figure: images/1.png]

2. Suppose there is a second transaction number 2 to modify the name of the record and change it to lisi

​ When transaction 2 modifies the row record data, the database will add an exclusive lock to the row

​ Then copy the row of data to the undolog as an old record, that is, there is a copy of the current row in the undolog

​ After the copy is complete, modify the line name to lisi, and modify the transaction id of the hidden field to the id of the current transaction 2, and the rollback pointer points to the copy record copied to the undolog

​ After the transaction is committed, the lock is released

[Figure: images/2.png]

3. Assume that there is a third transaction number 3 that modifies the age of the record and changes it to 32

​ When transaction 3 modifies the row data, the database will add an exclusive lock to the row

​ Then copy the row of data to the undolog, as the old record, and find that the row record already has an undolog, then the latest old data is used as the header of the linked list, and inserted at the front of the undolog of the row record

​ Modify the age of the row to be 32 years old, and modify the transaction id of the hidden field to the id of the current transaction 3, and the rollback pointer points to the copy record of the undolog just copied

​ Transaction commit, release lock

[Figure: images/3.png]

From the series of diagrams above, you can see that modifications of the same record by different transactions (or by the same transaction) chain the record's undolog into a linear list of record versions, i.e. a linked list: the head of the undolog is the newest old record, and the tail of the chain is the earliest old record.
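A toy Java model of the version chain just described, only to illustrate the structure; field names mirror the hidden columns, but this is not InnoDB's real implementation:

public class RowVersion {
    long dbTrxId;          // id of the transaction that created this version
    String name;
    int age;
    RowVersion rollPtr;    // DB_ROLL_PTR: points to the previous (older) version in the undo log

    RowVersion(long dbTrxId, String name, int age, RowVersion rollPtr) {
        this.dbTrxId = dbTrxId;
        this.name = name;
        this.age = age;
        this.rollPtr = rollPtr;
    }

    public static void main(String[] args) {
        RowVersion v1 = new RowVersion(1, "zhangsan", 22, null);   // inserted by transaction 1
        RowVersion v2 = new RowVersion(2, "lisi", 22, v1);         // transaction 2 changes the name
        RowVersion v3 = new RowVersion(3, "lisi", 32, v2);         // transaction 3 changes the age
        // walking the roll pointers from the newest version reaches ever older versions
        for (RowVersion v = v3; v != null; v = v.rollPtr) {
            System.out.println("trx " + v.dbTrxId + ": " + v.name + ", " + v.age);
        }
    }
}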

Read View

​ If you understand the above process, then you need to further understand the concept of read view.

​ A Read View is the read view produced when a transaction performs a snapshot read. At the moment the transaction executes the snapshot read, a snapshot of the current state of the system is generated, recording and maintaining the ids of the transactions that are currently active (transaction ids keep increasing).

​ The main purpose of the Read View is visibility judgment: when a transaction performs a snapshot read on a record, it creates a Read View for that record and uses it as the condition for deciding which version of the data the current transaction is allowed to see; this may be the latest data, or it may be an earlier version stored in the undolog of that row.

​ The visibility algorithm followed by the Read View takes the DB_TRX_ID (the id of the transaction that last modified the data) from the newest version of the record and compares it with the ids of the other active transactions in the system. If the DB_TRX_ID does not satisfy visibility according to the Read View's attributes, the DB_ROLL_PTR rollback pointer is followed to take the DB_TRX_ID from the undolog and the comparison is repeated, traversing the DB_TRX_IDs in the linked list until one is found that satisfies the conditions; the old record where that DB_TRX_ID lives is then the newest version visible to the current transaction.

The visibility rules for Read View are as follows:

​ First of all, you need to know the three global attributes in Read View:

​ trx_list: A list of values, used to maintain the transaction ID (1,2,3) that the system is active at the time of Read View generation

​ up_limit_id: Record the ID with the smallest transaction ID in the trx_list list (1)

​ low_limit_id: The next transaction ID that has not been allocated by the system when Read View is generated, (4)

The specific comparison rules are as follows (a small Java sketch of the same logic follows this list):

​ 1. First check DB_TRX_ID < up_limit_id. If it is smaller, the current transaction can see the version written by DB_TRX_ID; if it is greater than or equal, go to the next check

2. Next check DB_TRX_ID >= low_limit_id. If it is greater than or equal, the version written by DB_TRX_ID appeared only after the Read View was generated, so it is definitely not visible to the current transaction; if it is smaller, go to the next check

​ 3. Check whether DB_TRX_ID is in the list of active transactions. If it is, that transaction was still active (not committed) when the Read View was generated, so its modifications are not visible to the current transaction; if it is not, that transaction had already committed before the Read View was generated, so its results are visible.
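A minimal Java sketch of these three rules, using illustrative field names rather than InnoDB's internal ones:

import java.util.Set;

public class ReadView {
    Set<Long> trxList;     // ids of transactions active when the Read View was created
    long upLimitId;        // smallest id in trxList
    long lowLimitId;       // next transaction id not yet assigned at creation time

    ReadView(Set<Long> trxList, long upLimitId, long lowLimitId) {
        this.trxList = trxList;
        this.upLimitId = upLimitId;
        this.lowLimitId = lowLimitId;
    }

    // returns true if the version written by dbTrxId is visible to the current transaction
    boolean isVisible(long dbTrxId) {
        if (dbTrxId < upLimitId) {
            return true;                     // committed before any active transaction started
        }
        if (dbTrxId >= lowLimitId) {
            return false;                    // created after this Read View was generated
        }
        return !trxList.contains(dbTrxId);   // visible only if that transaction had already committed
    }
}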

7. The overall processing flow of MVCC

Suppose there are four transactions executing at the same time, as shown in the figure below:

transaction 1      transaction 2      transaction 3      transaction 4
begins             begins             begins             begins
                                                         modifies and commits
in progress        snapshot read      in progress

From the table above: when transaction 2 performs a snapshot read on a row, the database generates a Read View for that row. Transactions 1 and 3 are still active, while transaction 4 committed an update just before transaction 2's snapshot read, so the Read View records the currently active transactions 1 and 3 and keeps them in a list. At the same time you can see that up_limit_id is 1 and low_limit_id is 5, as shown in the figure below:

[Figure: images/image-20210520143604440.png]

In the above example, only transaction 4 has modified the row record and submitted the transaction before transaction 2 reads the snapshot, so the undolog of the current data in the row is as follows:

[Figure: images/image-20210520143717928.png]

​ When transaction 2 snapshot-reads the row record, it takes the row's DB_TRX_ID and compares it with up_limit_id, low_limit_id and the active transaction list to decide which version of the row transaction 2 can see.

​ The specific process is as follows: first, the transaction id recorded on the row (4) is compared with up_limit_id in the Read View; it is not smaller, so that condition is not met. Next we check whether 4 is greater than or equal to low_limit_id; it is not, so that condition is not met either. Then we check whether transaction 4 is in the trx_list; it is not, so the visibility condition is satisfied, and the latest result committed by transaction 4 is visible to transaction 2's snapshot. Therefore the latest record read by transaction 2 is the version committed by transaction 4, which, from a global point of view, is also the newest version, as shown below:

[Figure: images/image-20210520143742317.png]

When the above content is understood, then everyone should be able to figure out the relationship between these core concepts. Next, we will talk about the difference in snapshot reading under different isolation levels.

8. What is the difference between InnoDB snapshot reading at RC and RR levels?

​ Because the timing of Read View generation is different, the results of snapshot reading at the RC and RR levels are different

1. At the RR level, the first snapshot read of a record in a transaction creates a snapshot, i.e. the Read View, recording the other transactions active in the system at that moment, and later snapshot reads reuse that same Read View. So as long as the current transaction performed its first snapshot read before the other transactions committed their updates, its later snapshot reads use the same Read View and those later modifications remain invisible

2. At the RR level, when the snapshot read generates the Read View, the Read View records a snapshot of all the other active transactions at that moment; the modifications made by those transactions are invisible to the current transaction, while modifications made by transactions created earlier than the Read View are visible

3. At the RC level, each snapshot read within a transaction generates a new Read View, which is why at the RC level we can see updates committed by other transactions within the same transaction.

​Summary : Under the RC isolation level, each snapshot read will generate and obtain the latest Read View, while under the RR isolation level, only the first snapshot read in the same transaction will create the Read View, and subsequent Snapshot reads are obtained from the same Read View.

What is mysql master-slave replication?

​ MySQL master-slave replication means that data can be replicated from a MySQL master server node to one or more slave nodes. MySQL uses asynchronous replication by default, so the slave does not have to keep a permanent connection to the master to update its own data; updates can be performed over a remote connection, and the slave can replicate all databases on the master, a specific database, or a specific table.

Why does mysql need master-slave synchronization?

1. In complex business systems there are situations in which a single SQL statement locks a table, making read service temporarily unavailable and badly affecting the running business. With master-slave replication the master handles writes and the slaves handle reads, so even if the master locks a table, the business can keep running normally by reading from the slaves.

2. Do hot backup of data

3. The expansion of the structure. The business volume is increasing, and the I/O access frequency is too high for a single machine to satisfy. At this time, multi-library storage is used to reduce the frequency of disk I/O access and improve the I/O performance of a single machine.

What is the principle of mysql replication?

​ (1) The master server records the data change in the binary binlog log, and when the data on the master changes, it writes the change into the binary log;

​ (2) The slave server will detect whether the master binary log has changed within a certain time interval, and if it changes, it will start an I/OThread to request the master binary event

​ (3) At the same time, the master node starts a dump thread for each I/O thread, which sends binary events to it; these are saved to the slave node's local relay log. The slave node then starts a SQL thread that reads the binary log from the relay log and replays it locally so that its data stays consistent with the master's. Finally, the I/O thread and the SQL thread go to sleep and wait for the next wake-up.

That is to say:

  • Two threads will be generated from the library, one I/O thread and one SQL thread;
  • The I/O thread will request the binlog of the main library, and write the obtained binlog to the local relay-log (relay log) file;
  • The main library will generate a log dump thread to pass binlog to the slave library I/O thread;
  • The SQL thread will read the logs in the relay log file and parse them into SQL statements to execute one by one;

Notice:

1. The master records operation statements into the binlog, then grants the slave permission to connect remotely (the master must enable the binlog function; usually, for data safety, the slave also enables binlog).
2. The slave starts two threads: an IO thread and a SQL thread. The IO thread reads the master's binlog content into the relay log; the SQL thread reads the binlog content from the relay log and applies it to the slave database, keeping the slave data consistent with the master data.
3. Mysql replication requires at least two Mysql services; they can be spread over different servers, or several services can be started on one server.
4. It is best to keep the Mysql versions of the master and slave the same (if that cannot be satisfied, make sure the master's version is lower than the slave's).
5. The time on the master and slave nodes needs to be synchronized.

[Figure: images/master-slave principle.png]

Specific steps:

1. The slave library connects to the master library by manually executing the change master to statement, provides all the conditions of the connected user (user, password, port, ip), and lets the slave library know the starting point of the binary log (file name, position number); start slave

2. Establish a connection between the IO thread of the slave library and the dump thread of the main library.

3. According to the file name and position number provided by the change master to statement, the IO thread initiates a binlog request to the master library.

4. The dump thread of the main library sends the local binlog to the IO thread of the slave library in the form of events according to the request of the slave library.

5. Receive binlog events from the library IO thread and store them in the local relay-log. The transmitted information will be recorded in master.info

6. The SQL thread of the slave applies the relay-log and records what has been applied in relay-log.info; by default, relay logs that have been applied are automatically purged.
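A hedged sketch of step 1 driven from a program (host names, credentials, and the binlog file/position are assumptions; in practice these statements are usually typed in the mysql client on the slave):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReplicationSetup {
    public static void main(String[] args) throws Exception {
        try (Connection slave = DriverManager.getConnection(
                "jdbc:mysql://slave-host:3306/", "root", "root");
             Statement st = slave.createStatement()) {
            st.execute("change master to master_host='master-host', master_port=3306,"
                    + " master_user='repl', master_password='repl_pwd',"
                    + " master_log_file='mysql-bin.000001', master_log_pos=154");
            st.execute("start slave");   // starts the IO thread and the SQL thread
            // "show slave status" can then be used to check Slave_IO_Running / Slave_SQL_Running
        }
    }
}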

Briefly describe the difference between Myisam and Innodb?

InnoDB storage engine: Mainly for OLTP (Online Transaction Processing, online transaction processing) applications, it is the first storage engine that fully supports ACID transactions (BDB's first storage engine that supports transactions has stopped development).
Features:

1. Supports row locks
2. Supports foreign keys
3. Supports the AUTO_INCREMENT column attribute
4. Supports transactions
5. Supports reading and writing in MVCC mode
6. Read efficiency is lower than MYISAM
7. Write efficiency is higher than MYISAM
8. Suitable for data that is modified frequently and for safety-sensitive scenarios
9. When clearing an entire table, Innodb deletes the rows one by one

MyISAM storage engine: It is the official storage engine provided by MySQL, mainly for OLAP (Online Analytical Processing, Online Analytical Processing) applications.

Features:

1. Independent of the operating system; when a MyISAM table is created, three files are created on the local disk. For example, creating a tb_demo table produces tb_demo.frm, tb_demo.MYD and tb_demo.MYI
2. Does not support transactions
3. Supports table locks and full-text indexes
4. A MyISAM table consists of MYD and MYI files: MYD stores the data and MYI stores the index. The MySQL database only caches the index file; caching of the data file is left to the operating system
5. Starting from MySQL 5.0, MyISAM supports 256TB of data in a single table by default
6. Select-intensive tables: the MYISAM storage engine is very fast at filtering large amounts of data, which is its most prominent advantage
7. Read efficiency is better than InnoDB
8. Write efficiency is lower than InnoDB
9. Suitable for query- and insert-heavy applications
10. When clearing an entire table, MYISAM re-creates the table

Briefly describe the types of indexes in mysql and their impact on database performance?

Ordinary index: Allows the indexed data column to contain duplicate values

Unique index: can guarantee the uniqueness of data records

Primary key index: It is a special unique index. Only one primary key index can be defined in a table. The primary key is used to uniquely identify a record. It is created using the keyword primary key

Joint index: the index can cover multiple data columns

Full-text index: By establishing an inverted index, the retrieval efficiency can be greatly improved, and the problem of judging whether a field is included is a key technology currently used by search engines
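Illustrative DDL for the index types above, executed over JDBC; the column choices and the emp_note table are assumptions, and the statements are meant to be read individually rather than run together:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class IndexTypesDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "root", "root");
             Statement st = conn.createStatement()) {
            st.execute("create index idx_ename on emp(ename)");                 // ordinary index
            st.execute("create unique index uk_ename on emp(ename)");           // unique index
            st.execute("alter table emp add primary key (empno)");              // primary key index
            st.execute("create index idx_dept_job on emp(deptno, job)");        // composite (joint) index
            st.execute("create fulltext index ft_remark on emp_note(remark)");  // full-text index
        }
    }
}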

Indexes can greatly improve the query speed of data

By using indexes, the optimizer can make better choices during query processing, improving the performance of the system

But it will reduce the speed of inserting, deleting, and updating tables, because when performing these write operations, the index file must also be operated

Indexes take up physical space: besides the space occupied by the data itself, each index occupies a certain amount of physical space. Building a clustered index requires even more space, and if there are many non-clustered indexes, then once the clustered index changes, all the non-clustered indexes have to change with it.

What is bytecode?

Because the JVM is customized for various operating systems and platforms, no matter what platform it is on, you can use the javac command to compile a .java file into a fixed-format bytecode (.class file) for use by the JVM. The reason why it is called bytecode is that **.class files are composed of hexadecimal values, and JVM reads them in bytes as a group of two hexadecimal values**
The format is as follows
[Figure: images/bytecode.png]

What is the structure of bytecode?

The JVM has requirements on the layout of bytecode: every bytecode file must consist of ten parts in a fixed order, as shown in the figure below
[Figure: images/bytecode2.png]

  1. Magic number

The first 4 bytes of every .class file are the magic number, a fixed value: 0xCAFEBABE. It is placed at the beginning of the file so that the JVM can judge from the file's start whether the file may be a .class file; only if it starts with this value are the later operations performed. This fixed value was chosen by James Gosling, the father of Java, and spells CafeBabe (coffee baby).

  2. Version number

The version number is the 4 bytes after the magic number: the first two bytes are the minor version (Minor Version) and the last two bytes are the major version (Major Version). In the example 0000 0032 above, the minor version 0000 is 0 in decimal and the major version 0032 is 50 in decimal; looking this up in the version mapping in the figure below, the corresponding java version is 1.6

[Figure: images/bytecodeversion.png]
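A small sketch that reads these two items from a compiled class file (the file path is an assumption):

import java.io.DataInputStream;
import java.io.FileInputStream;

public class ClassFileHeader {
    public static void main(String[] args) throws Exception {
        try (DataInputStream in = new DataInputStream(new FileInputStream("Demo.class"))) {
            int magic = in.readInt();             // 0xCAFEBABE for every valid .class file
            int minor = in.readUnsignedShort();   // minor version
            int major = in.readUnsignedShort();   // major version, e.g. 50 -> Java 6, 52 -> Java 8
            System.out.printf("magic=%08X minor=%d major=%d%n", magic, minor, major);
        }
    }
}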

  3. Constant pool

The bytes immediately after the major version number are the constant pool entry. The constant pool holds two kinds of constants: literals and symbolic references. Literals are constant values declared as final in the code; symbolic references are things like fully qualified names of classes and interfaces, field names and descriptors, and method names and descriptors. The constant pool as a whole has two parts: the constant pool counter and the constant pool data area
[Figure: images/changlangchi.png]

  4. Access flags

The two bytes after the constant pool describe whether this is a class or an interface, and whether it is modified by modifiers such as Public, Abstract and Final. The JVM specification defines 9 access flags (Access_Flag), and the JVM describes all of them with a bitwise OR; for example, if the class modifiers are Public Final, the value of the access flags is ACC_PUBLIC | ACC_FINAL, i.e. 0x0001 | 0x0010 = 0x0011
[Figure: images/access_flag.png]

  5. Current class index

The two bytes after the access flags describe the fully qualified name of the current class. The value stored in these two bytes is an index into the constant pool; following that index, the fully qualified name of the class can be found in the constant pool.

  6. Parent class index

The two bytes after the current class index describe the fully qualified name of the parent class; again, what is stored is an index into the constant pool.

  7. Interface index

The two bytes after the parent class index are the interface counter, which describes how many interfaces the class (or its parent class) implements; the following n bytes are the indexes of the string constants for all of the interface names.

  8. Field table

The field table describes the variables declared in the class or interface, including class-level (static) variables and instance variables, but not local variables declared inside methods. It is divided into two parts: the first part is two bytes describing the number of fields, and the second part is the detailed information of each field (fields_info).
[Image: images/field.png]

  9. Method table

After the field table comes the method table. It is also divided into two parts: the first two bytes give the number of methods, and the second part is the detailed information of each method. The method information is more complex, including the method access flags, method name, method descriptor and method attributes:
[Image: images/method.png]

  10. Additional attributes

The last part of the bytecode; it stores basic information about the attributes defined by the class or interface in this file.
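
To make the first two items above concrete, here is a minimal sketch (the file name Hello.class is just an assumption) that reads the magic number and version number from a compiled .class file:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class MagicNumberReader {
    public static void main(String[] args) throws IOException {
        // Path to some compiled class file; adjust to your own environment
        try (DataInputStream in = new DataInputStream(new FileInputStream("Hello.class"))) {
            int magic = in.readInt();            // first 4 bytes: 0xCAFEBABE
            int minor = in.readUnsignedShort();  // next 2 bytes: minor version
            int major = in.readUnsignedShort();  // next 2 bytes: major version (e.g. 52 = Java 8)
            System.out.printf("magic=0x%X, minor=%d, major=%d%n", magic, minor, major);
        }
    }
}
```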

What is the class initialization process?

First, the class loading process is divided into 5 phases: loading, verification, preparation, resolution and initialization
[Image: images/class-int.png]

We now mainly analyze the initialization process of the class:

  1. The initialization phase of the class is the process of actually executing the Java program code (bytecode) defined in the class and initializing class variables according to the programmer's intention. More directly, the initialization phase is the process of executing the class constructor <clinit>() method. The <clinit>() method is generated by the compiler by collecting the assignment actions of all class variables and the statements in static{} blocks; the compiler collects them in the order in which the statements appear in the source file.
  2. Regarding the order of class initialization: **(static variables, static initialization blocks: in the order they appear in the class) > (instance variables, instance initialization blocks: in the order they appear in the class) > constructor** (a small example follows after this list)
  3. For the detailed process of class initialization, refer to the Java Virtual Machine Specification, where the class initialization process is as follows:
    1. Each class C has an initialization lock LC, and the current thread acquires LC; this operation may cause the current thread to wait until LC is obtained.
    2. If C is being initialized by another thread, the current thread releases LC and blocks, waiting for C's initialization to finish; it then needs to retry this whole process. The thread's interrupt status is not affected while the initialization process executes
    3. If C is being initialized by this thread (i.e. a recursive initialization), release LC and return normally
    4. If C has been initialized, release LC and return normally
    5. If C is in an error state, indicating that it is no longer possible to complete initialization, free LC and raise the exception NoClassDefFoundError
    6. Otherwise, mark C as being initialized by the current thread and release LC; then initialize the final static fields of C whose values are compile-time constants
    7. If C is a class rather than an interface, and C's parent class SC (Super Class) and each interface SI_n (in the order of the implements clause) have not been initialized, then recursively perform the complete initialization process on SC (verifying and preparing SC first if necessary). If an exception is thrown while initializing SC or an SI_n, acquire LC, mark C as being in an error state, notify all waiting threads, release LC, and then throw the same exception.
    8. Obtain whether the assertion assertion mechanism is turned on from the classloader of C
    9. Next, execute class variable initialization and static code blocks, or interface field initializations, in textual order, treating them as individual code blocks.
    10. If the execution is normal, then acquire the LC, mark the C object as initialized, and notify all waiting threads, then release the LC, and exit the whole process normally
    11. Otherwise, if an exception E is thrown, initialization is aborted. If E is not an Error (or one of its subclasses), create a new ExceptionInInitializerError with E as its cause and use that as E; if an OutOfMemoryError makes it impossible to create the ExceptionInInitializerError, use the OutOfMemoryError as E.
    12. Acquires LC, marks C in error state, notifies all waiting threads, releases LC, and throws exception E.

It can be seen that the JLS does stipulate that the parent class is initialized first, and that static blocks and class-variable assignments are executed in textual order.
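
A minimal example of this initialization order (parent static > child static > parent instance block/constructor > child instance block/constructor):

```java
public class InitOrder {
    static class Parent {
        static { System.out.println("parent static block"); }
        { System.out.println("parent instance block"); }
        Parent() { System.out.println("parent constructor"); }
    }

    static class Child extends Parent {
        static { System.out.println("child static block"); }
        { System.out.println("child instance block"); }
        Child() { System.out.println("child constructor"); }
    }

    public static void main(String[] args) {
        new Child();
        // Output order:
        // parent static block
        // child static block
        // parent instance block
        // parent constructor
        // child instance block
        // child constructor
    }
}
```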

How is the JVM memory model allocated?

[Image: images/javammode.png]
Broadly, the runtime data areas are divided into thread-shared areas (the heap and the method area, implemented as the metaspace since JDK 8) and thread-private areas (the virtual machine stacks, the native method stacks and the program counter).

What are the principles of JVM performance tuning?

  1. Most Java applications do not need GC optimization on the server. There are many optimizations inside the virtual machine to ensure the stable operation of the application, so don't tune for the sake of tuning. Improper tuning may backfire
  2. Before the application goes online, consider setting the JVM parameters of the machine to the optimal (suitable)
  3. Before GC optimization, you need to confirm that there is no room for optimization in the project's architecture and code. We cannot expect an application with a flawed system architecture or endless code-level optimization to achieve a qualitative leap in performance through GC optimization
  4. GC optimization is a systematic and complex work, and there is no universal tuning strategy that can satisfy all performance indicators. GC optimization must be based on our in-depth understanding of various garbage collectors in order to achieve twice the result with half the effort
  5. When dealing with throughput and latency issues, the larger the memory the garbage collector can use, i.e. the larger the Java heap space, the better the garbage collection effect and the smoother the application runs. This is called the GC memory maximization principle
  6. Of the three attributes (throughput, latency, memory footprint), pick only two to optimize for in JVM tuning; this is known as the GC tuning "pick 2 of 3" principle

When is JVM tuning needed?

  • Heap memory (old generation) continues to rise and reaches the set maximum memory value
  • Frequent Full GC
  • GC pause (Stop World) time is too long (more than 1 second, the specific value depends on the application scenario)
  • Memory exceptions such as OutOfMemory occur in the application
  • The application has memory exceptions such as OutOfDirectMemoryError (failed to allocate 16777216 byte(s) of direct memory (used: 1056964615, max: 1073741824))
  • The application uses a local cache and takes up a lot of memory space
  • System throughput and response performance are low or declining
  • The CPU usage of the application is too high or the memory usage is too high

What indicators do you pay attention to when tuning the JVM?

  1. **Throughput:** user code execution time / (user code execution time + garbage collection time). It is one of the important indicators for evaluating a garbage collector's capability: the highest performance the collector can sustain for the application, without regard to the pause time or memory consumption caused by garbage collection. The higher the throughput, the better the algorithm.
  2. **Low Latency:** The shorter the STW, the better the response time. An important index to evaluate the garbage collector's ability. The metric is to shorten the pause time caused by garbage collection or completely eliminate the pause caused by garbage collection, so as to avoid jitter when the application is running. The shorter the pause time, the better the algorithm
  3. When designing (or using) a GC algorithm, we must determine our goals: a GC algorithm may only aim at one of two goals (i.e. only focus on maximum throughput or minimum pause time), or try to find a compromise between the two
  4. MinorGC collects as many garbage objects as possible. We call this the MinorGC principle, and compliance with this principle can reduce the frequency of FullGC in the application. FullGC is time-consuming and is the culprit for applications not meeting latency requirements or throughput
  5. Starting point and analysis point of heap size adjustment:
    1. Statistics Minor GC duration
    2. Count the number of Minor GC
    3. Statistics of the longest duration of Full GC
    4. Statistical worst case Full GC frequency
    5. Statistical GC duration and frequency is the main starting point for optimizing the size of the heap
    6. Based on the latency and throughput requirements of the business system, we can adjust the size of each area according to these analyses
  6. Generally speaking, the throughput-first garbage collector: -XX:+UseParallelGC -XX:+UseParallelOldGC, that is, conventional (PS/PO)
  7. Response Time Prioritized Garbage Collectors: CMS, G1

What are the common parameters of JVM?

  1. -Xms sets the memory the program occupies at startup (the initial heap size). Generally, a larger value makes the program start faster, but may also make the machine temporarily slower
  2. -Xmx sets the maximum memory the program can occupy while running (the maximum heap size). If the program needs more memory than this value, an OutOfMemoryError is thrown
  3. -Xss sets the stack size of each thread. The right value depends on your program: how much memory one thread needs, how many threads may run at the same time, and so on
  4. **-Xmn, -XX:NewSize/-XX:MaxNewSize, -XX:NewRatio**
    1. High priority: -XX:NewSize/-XX:MaxNewSize
    2. Medium priority: -Xmn (equivalent to -Xmn=-XX:NewSize=-XX:MaxNewSize=? by default)
    3. Low priority: -XX:NewRatio
  5. If you want to track class loading and class unloading in the log, you can use the startup parameters **-XX:+TraceClassLoading -XX:+TraceClassUnloading** (a command-line sketch follows below)
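
Purely as an illustration (the values below are assumptions and must be tuned for the actual application; the Trace flags apply to JDK 8), a start command combining some of these parameters might look like:

```
java -Xms4g -Xmx4g -Xmn2g -Xss512k \
     -XX:+TraceClassLoading -XX:+TraceClassUnloading \
     -jar app.jar
```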

What are the commonly used performance tuning tools for JVM?

  1. MAT

    1. Can point out locations where a memory leak may exist (its Leak Suspects report)
  2. jvisualvm

  3. jconsole

  4. Arthas

  5. show-busy-java-threads

    1. https://github.com/oldratlee/useful-scripts/blob/master/docs/java.md#-show-busy-java-threads

What is the general process for online troubleshooting?

  1. Excessive CPU usage troubleshooting process
    1. Use the top command to find out the pid of the process with the highest CPU, if the pid is 9876
    2. Then check the thread id with the highest occupation under the process [top -Hp 9876]
    3. Assuming that the thread ID with the highest occupancy rate is 6900, convert it to hexadecimal form (because java native thread is output in hexadecimal form) [printf '%x\n' 6900]
    4. Use jstack to print out the java thread call stack information [jstack 9876 | grep '0x1af4' -A 50 --color], so that the problem can be better located
  2. Excessive memory usage troubleshooting process
    1. Find the process id: [top -d 2 -c]
    2. Check the JVM heap memory allocation: jmap -heap pid
    3. View objects that occupy a lot of memory jmap -histo pid | head -n 100
    4. View the surviving objects that occupy a lot of memory jmap -histo:live pid | head -n 100

Under what circumstances will OOM be thrown?

  • 98% of JVM time is spent on memory recycling
  • The memory recovered each time is less than 2%

Satisfying both of these conditions triggers an OutOfMemoryError (GC overhead limit exceeded), which leaves the system a small window to do some work before going down, such as manually printing a heap dump; if the conditions are not met, the error is not
thrown.

What are the phenomena before the system OOM?

  • The time of each garbage collection gets longer and longer, for example from about 10ms to about 50ms, and FullGC time grows from about 0.5s to 4-5s
  • The number of FullGCs keeps increasing; at the most frequent, a FullGC occurs less than one minute apart
  • The memory of the old generation is getting bigger and bigger and after each FullGC, only a small amount of memory in the old generation is released

How to analyze the heap dump file?

You can export the dump file automatically (for example with the startup parameter -XX:+HeapDumpOnOutOfMemoryError) or manually with jmap, and then analyze it with a tool such as MAT.

How to do GC log analysis?

In order to facilitate the analysis of GC log information, you can specify the startup parameters [-Xloggc:app-gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps] to record detailed GC log information

  1. Use [jinfo pid] to view the relevant parameters of the current JVM heap
  2. Continue to use [jstat -gcutil 2315 1s 10] to view the current heap occupancy within 10s
  3. You can also use [jmap -heap pid] to view the current JVM heap situation
  4. We can continue to use [jmap -F -histo pid | head -n 20] to print the first 20 lines, i.e. the current top 20 large objects. Usually some abnormal large objects can be found here; if not, extend this to the top 50 large objects and analyse those
  5. Finally, use [jmap -F -dump:file=a.bin pid], if the dump file is very large, you can compress it [tar -czvf a.tar.gz a.bin]
  6. After that, the dump file is analyzed, using MAT to analyze memory leaks
  7. Reference case: https://www.lagou.com/lgeduarticle/142372.html

How to troubleshoot online deadlocks?

  1. jps looks for a potentially problematic process id
  2. Then execute [jstack -F process id ]
  3. If the environment allows remote connection to the JVM, you can use jconsole or jvisualvm to check whether there is a deadlock with a graphical interface

What are the optimization solutions for online YGC that takes too long?

  1. If there are more and more objects with a long life cycle (such as global variables or static variables, etc.), the time-consuming process of labeling and copying will increase
  2. It takes too long to mark the surviving objects: for example, the finalize method of the Object class is overridden, which makes marking Final References time-consuming; or String.intern is used improperly, which makes YGC scanning of the StringTable take too long. The time the GC spends processing References can be shown with the parameter -XX:+PrintReferenceGC
  3. Excessive accumulation of long-period objects: For example, improper use of local caches and the accumulation of too many surviving objects; or serious lock competition leads to thread blocking, and the life cycle of local variables becomes longer
  4. Case reference: https://my.oschina.net/lishangzhi/blog/4703942

What are the online frequent FullGC optimization solutions?

  1. Frequent online FullGC generally has the following characteristics:
    1. The CPU usage of multiple threads on the machine exceeds 100%; through the jstack command we can see that these threads are mainly garbage collection threads.
    2. Monitor the GC situation through the jstat command, you can see that the number of Full GC is very large, and the number is increasing
  2. Troubleshooting process:
    1. top finds the process
    2. Then [top -Hp process id], find the thread
    3. [printf "%x\n" thread id] , assuming the hexadecimal result is a
    4. jstack process id | grep '0xa' -A 50 --color
    5. If it is a normal user thread, check the thread's stack information to see where the user code is running, which consumes more CPU
    6. If the thread is VMThread, monitor the GC status of the current system with the jstat -gcutil command, and then export the current memory of the system with jmap -dump:format=b,file=<filename> <pid>. After exporting, load the dump into Eclipse's MAT tool for analysis to find out which objects in memory consume the most, and then deal with the related code; under normal circumstances the VM Thread turns out to be the garbage collection thread
    7. Then execute [jstat -gcutil process id] and look at the result; if the FGC count is high and still growing, it can be concluded that memory overflow is causing frequent FullGC and making the system slow
    8. Then you can Dump the memory log, and then use the MAT tool to analyze which objects occupy a large amount of memory, and then find the location where the object was created, and process it
  3. Reference case: https://mp.weixin.qq.com/s/g8KJhOtiBHWb6wNFrCcLVg

How to analyze online off-heap memory leaks? (Netty is especially common)

  1. The location of the JVM's off-heap memory leak has always been a difficult problem
  2. Analysis of off-heap memory leaks generally derives from the analysis of in-heap memory: we may find, while analysing an in-heap leak, that the JVM memory we calculate is actually larger than the whole JVM's Xmx, which means the extra part is off-heap memory
  3. If you use Netty off-heap memory, you can monitor the usage of off-heap memory by yourself without using third-party tools. We use "reflection" to get the off-heap memory
  4. Gradually narrow the scope until the bug is found. Once we confirm that the execution of a certain thread introduces the bug, we can single-step or bisect the execution; after locating a certain line of code, follow it and keep single-stepping or bisecting until the final buggy code is located. This method is tried and tested, and in the end you can always find the bug
  5. Being familiar with the IDE's debugger makes "catching bugs" lightning fast. The most common debugging techniques here are evaluating expressions in advance, walking the thread call stack, and watching a particular object so you can track its definition, assignment and so on
  6. In projects that use direct memory, it is best to configure -XX:MaxDirectMemorySize to set the maximum direct memory value that the system can actually reach. The default maximum direct memory size is equal to the value of -Xmx
  7. To troubleshoot off-heap leaks, it is recommended to specify the startup parameters: -XX:NativeMemoryTracking=summary -Dio.netty.leakDetection.targetRecords=100 -Dio.netty.leakDetection.level=PARANOID; the latter two parameters set the sampling depth and level of Netty's memory leak detection
  8. Reference case: https://tech.meituan.com/2018/10/18/netty-direct-memory-screening.html

What are the optimization solutions for online metaspace memory leaks?

  1. One thing to note is that JVMs from Java 8 onward have discarded the permanent generation and replaced it with the metaspace. The metaspace is not in the JVM heap; it uses native (off-heap) memory and is limited by the maximum physical memory. Best practice is to set -XX:MetaspaceSize=1024m -XX:MaxMetaspaceSize=1024m in the startup parameters (the specific value depends on the situation); to avoid dynamic expansion you can set the initial size directly to the maximum
  2. The metaspace mainly stores class metadata, and whether class metadata can be reclaimed is judged by whether the Classloader that loaded it can be reclaimed: as long as the Classloader cannot be reclaimed, the class metadata loaded through it will not be reclaimed either. So problems sometimes occur online because frameworks frequently use tools like ASM and javassist for bytecode enhancement and generating proxy classes; if the main thread keeps generating dynamic proxy classes, the metaspace fills up rapidly and cannot be reclaimed
  3. For specific cases, please refer to: https://zhuanlan.zhihu.com/p/200802910

What are the java class loaders?

Bootstrap class loader

The bootstrap class loader mainly loads the classes needed by the JVM itself. This class loader is implemented in C++, has no parent, and is part of the virtual machine itself. It is responsible for loading the core class libraries (such as **rt.jar**) or the jar packages under the path specified by the **-Xbootclasspath** parameter into memory. Note that the virtual machine loads jar packages by file name, such as rt.jar; if a file name is not recognized by the virtual machine, dropping the jar into the lib directory is useless. (For security reasons, the Bootstrap class loader only loads classes whose package names start with java, javax, sun, etc.)

Extension class loader

The extension class loader refers to the sun.misc.Launcher$ExtClassLoader class implemented by Sun, written in Java. Its parent class loader is null (the bootstrap loader), and it is a static inner class of Launcher. It is responsible for loading the class libraries in the **<JAVA_HOME>/lib/ext** directory or in the path specified by the system property **-Djava.ext.dirs**; developers can use the standard extension class loader directly

Application class loader

The application class loader refers to sun.misc.Launcher$AppClassLoader implemented by Sun. Its parent class loader is ExtClassLoader. It is responsible for loading the class libraries under the path specified by the system classpath (java -classpath or -Djava.class.path), i.e. the classpath we commonly use. Developers can use this system class loader directly; under normal circumstances it is the default class loader of the program.

Custom class loader

The application can customize the class loader, and the parent class loader is AppClassLoader

[Image: images/classloader.png]
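
A quick way to observe this hierarchy on JDK 8 (the bootstrap loader prints as null because it is implemented in C++):

```java
public class ClassLoaderDemo {
    public static void main(String[] args) {
        // Core classes (e.g. String) are loaded by the bootstrap class loader, shown as null
        System.out.println(String.class.getClassLoader());        // null
        // Application classes are loaded by the application (system) class loader
        ClassLoader app = ClassLoaderDemo.class.getClassLoader();
        System.out.println(app);                                   // sun.misc.Launcher$AppClassLoader on JDK 8
        // Its parent is the extension class loader, whose parent is null (bootstrap)
        System.out.println(app.getParent());                       // sun.misc.Launcher$ExtClassLoader on JDK 8
        System.out.println(app.getParent().getParent());           // null
    }
}
```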

What is the parent delegation mechanism?

[Image: images/classloader2.png]

Parental delegation mechanism
The parental delegation model was introduced after Java 1.2. Its working principle: when a class loader receives a class loading request, it does not try to load the class itself first but delegates the request to its parent class loader; if that parent has a parent of its own, it delegates further upwards, recursively, so the request eventually reaches the top-level bootstrap class loader. If the parent class loader can complete the loading task, it returns successfully; only if the parent cannot complete it does the child loader try to load the class itself. This is the parental delegation model.
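
A simplified sketch of this delegation logic (roughly the shape of java.lang.ClassLoader.loadClass; locking and bootstrap details omitted), written as an overriding subclass only so that the snippet compiles:

```java
public class DelegatingClassLoader extends ClassLoader {
    public DelegatingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // 1. Check whether this loader has already loaded the class
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                if (getParent() != null) {
                    // 2. Delegate to the parent class loader first
                    c = getParent().loadClass(name);
                }
                // (in the real JDK implementation, a null parent means the bootstrap loader is asked)
            } catch (ClassNotFoundException e) {
                // the parent could not find the class
            }
            if (c == null) {
                // 3. Only now does this loader try to load the class itself
                c = findClass(name);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}
```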

Benefits of Parental Delegation

  • Each class will only be loaded once, avoiding repeated loading
  • Each class will be loaded as much as possible (from the bootstrap class loader down, each loader may try to load it according to priority)
  • Effectively avoids the loading of malicious classes (for example a custom java.lang.Object class: under the parental delegation model, the system Object class will be loaded rather than the custom one)

In addition, you can talk more about how to destroy the parental delegation model

  1. The first "break" of the parental delegation model is overriding loadClass() in a custom loader, which the JDK does not recommend; generally you only override findClass(), which keeps the delegation mechanism intact. If you override loadClass(), the loading rules are defined by yourself and you can load classes however you like
  2. The second "break" of the parental delegation model is ServiceLoader and Thread.setContextClassLoader(), i.e. the thread context class loader (contextClassLoader). The parental delegation model solves the problem of unifying the basic classes among the class loaders (the more basic the class, the higher the loader that loads it); the reason the basic classes are "basic" is that they are always the APIs called by user code. But what if a basic class needs to call back into user code? That is where the thread context class loader comes in.
    1. SPI. This class loader can be set through the setContextClassLoader() method of java.lang.Thread. If it has not been set when the thread is created, it inherits one from its parent thread; if it has never been set anywhere in the application, it defaults to the application class loader. With the thread context class loader, the JNDI service can use it to load the SPI code it needs, i.e. the parent class loader asks the child class loader to complete the class loading. This effectively punches through the hierarchy of the parental delegation model to use class loaders in reverse, which violates the model, but it is unavoidable: basically all SPI loading in Java works this way, e.g. JNDI, JDBC, JCE, JAXB and JBI.
    2. The thread context class loader is AppClassLoader by default, so why not simply get the class loader via getSystemClassLoader() to load classpath classes? That is feasible, but it has a drawback: the code may break when deployed to different services, such as Java Web application servers or EJB containers, because the thread context class loader used by those services is not AppClassLoader but the server's own class loader, and the class loaders differ. So applications should use getSystemClassLoader() sparingly. In short, different services may use different default ClassLoaders, but the thread context class loader always obtains the same ClassLoader as the currently executing code, avoiding unnecessary problems
  3. The third "break" of the parental delegation model is driven by users' pursuit of program dynamism: "hot" code replacement, hot module deployment and so on. In short, the machine does not need to be restarted; as soon as something is deployed, it can be used.

How does GC judge that an object can be recycled?

  1. reference counting (obsolete algorithm)
    1. Each object has a reference attribute, which is incremented by one when a reference is added, decremented by one when the reference is released, and can be recycled when the count reaches 0.

But this approach has a fatal problem: it cannot handle circular references (a small sketch of the problem follows below).
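
A minimal illustration of the circular-reference problem: after the two variables are set to null, neither object is reachable any more, yet under pure reference counting each still has a count of 1 and would never be reclaimed.

```java
public class CircularReference {
    Object ref;

    public static void main(String[] args) {
        CircularReference a = new CircularReference();
        CircularReference b = new CircularReference();
        a.ref = b;   // a references b
        b.ref = a;   // b references a
        a = null;    // no external references remain,
        b = null;    // but each object still references the other
        // Under reference counting both counts stay at 1 and the objects leak;
        // the reachability analysis used by the HotSpot JVM reclaims them normally.
        System.gc();
    }
}
```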

  2. Reachability Analysis Algorithm (Root References)
    1. Starting from GcRoot to search downwards, the path traveled by the search is called a reference chain. When an object is not connected to GcRoot by any reference chain, it proves that the object is unavailable, and the virtual machine can decide to recycle.
    2. So what are the GcRoots?
      1. Objects referenced in the virtual machine stack
      2. The object referenced by the static property in the method area.
      3. Objects referenced by constants in the method area
      4. The object referenced in the local method stack (that is, the native method in general)
  3. In addition, different reference types are reclaimed differently (see the sketch after this list):
    1. Strong reference: an object created with the new keyword is strongly referenced; an object reachable through strong references is never reclaimed, even if an OutOfMemoryError would otherwise be thrown.
    2. Soft reference: If an object holds a soft reference, it will be recycled when the JVM heap space is insufficient. A soft reference to a class can be held via java.lang.ref.SoftReference.
    3. Weak reference: If an object holds a weak reference, it will be recycled as long as the weak reference object is found during GC. A weak reference to a class can be held via java.lang.ref.WeakReference.
    4. Phantom references: Almost nothing, ready to be recycled. Hold by PhantomReference.
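
A small sketch of the four reference types (whether the soft reference survives the System.gc() hint depends on heap pressure at the time):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();                                        // strong: never collected while reachable

        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);    // collected only when memory is tight
        WeakReference<Object> weak = new WeakReference<>(new Object());      // collected at the next GC

        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue); // get() always returns null

        System.gc();
        System.out.println("strong  = " + strong);
        System.out.println("soft    = " + soft.get());      // usually still present
        System.out.println("weak    = " + weak.get());      // usually null after a GC
        System.out.println("phantom = " + phantom.get());   // always null
    }
}
```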

How to reclaim memory objects, what are the recycling algorithms?

1. Mark-Sweep algorithm

The algorithm has two phases, marking and sweeping: first mark all the objects that need to be reclaimed, and after marking is complete, reclaim all marked objects in one pass.
[Image: images/before.png]

It has two main shortcomings:

  • Efficiency issues, the efficiency of the two processes of marking and clearing is not high.
  • Space problem, a large number of discontinuous memory fragments will be generated after the mark is cleared. Too much space fragmentation may cause that when the program needs to allocate large objects in the future, it will not be able to find enough continuous memory and have to trigger another garbage collection in advance. action.
  2. Copying algorithm

In order to solve the efficiency problem, a collection algorithm called copying (Copying) appeared, which divides the available memory into two pieces of equal size according to the capacity, and only uses one of them at a time. When the memory of this block is used up, copy the surviving object to another block, and then clean up the used memory space at one time. In this way, the entire half area is reclaimed every time, and there is no need to consider complex situations such as memory fragmentation when allocating memory. You only need to move the pointer on the top of the heap and allocate memory in order, which is simple to implement and efficient to operate.
[Image: images/copy.png]
The cost of the copying algorithm is that it shrinks the usable memory to half of the original, which reduces the actually available memory. Today's commercial virtual machines use this collection algorithm for the young generation. IBM research shows that 98% of the objects in the young generation "die young", so there is no need to divide the space 1:1. Instead, the memory is divided into a larger Eden space and two smaller Survivor spaces; each time, Eden and one of the Survivor spaces are used. During collection, the surviving objects in Eden and the used Survivor space are copied into the other Survivor space in one pass, and then Eden and the just-used Survivor space are cleaned. The HotSpot virtual machine's default Eden:Survivor ratio is 8:1, so the usable memory of the young generation is 90% (80% + 10%) of its total capacity, and only 10% is "wasted". Of course, the figure of 98% collectable objects only holds for common scenarios; there is no guarantee that no more than 10% of objects survive each collection. When the Survivor space is not big enough, other memory (here, the old generation) is relied on for the allocation guarantee (Handle Promotion).

  3. Mark-Compact algorithm

The copying algorithm performs more copy operations when the object survival rate is high, so its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need extra space for allocation guarantees to handle the extreme case where 100% of the objects in the used memory survive, so this algorithm generally cannot be chosen directly for the old generation. Based on the characteristics of the old generation, the mark-compact (Mark-Compact) algorithm was proposed: the marking process is the same as in mark-sweep, but instead of cleaning up reclaimable objects directly, all surviving objects are moved towards one end, and the memory beyond the end boundary is then cleaned up directly.
[Image: images/3-1621487892206.png]

  4. Generational Collection algorithm

Garbage collection in current commercial virtual machines adopts the Generational Collection algorithm. This algorithm has no new ideas; it simply divides memory into several blocks according to object lifetimes. Generally the Java heap is divided into the young generation and the old generation, so that the most appropriate collection algorithm can be used for each. In the young generation, a large number of objects die and only a few survive each collection, so the copying algorithm is used and collection only costs copying a small number of surviving objects. In the old generation, objects have a high survival rate and there is no extra space to provide an allocation guarantee, so the mark-sweep or mark-compact algorithm must be used.

What garbage collectors does jvm have, and how to choose them in practice?

[Image: images/gcollector.png]
The figure shows 7 different generational collectors; if two collectors are connected by a line they can be used together. The region a collector sits in indicates whether it is a young-generation or an old-generation collector.
Young-generation collectors (all using the copying algorithm): Serial, ParNew, Parallel Scavenge
Old-generation collectors: CMS (mark-sweep), Serial Old (mark-compact), Parallel Old (mark-compact)
Whole-heap collector: G1 (overall based on mark-compact, copying between two Regions)
At the same time, let’s explain a few terms first:
1. Parallel (Parallel) : multiple garbage collection threads work in parallel, and the user thread is in a waiting state at this time
2. Concurrent : user threads and garbage collection threads execute simultaneously
3. Throughput : running user code time/(running user code time + garbage collection time)
1. The Serial collector is the most basic collector and has the longest history of development.
**Features:** single-threaded, simple and efficient (compared with the single-threaded performance of other collectors). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so it naturally achieves the highest single-threaded collection efficiency by concentrating on garbage collection. When it collects, it must pause all other working threads until it finishes (Stop The World).
Application Scenario: Applicable to virtual machines in Client mode.
Schematic diagram of the operation of the Serial / Serial Old collector:
[Image: images/serial.png]
2. The ParNew collector is essentially a multi-threaded version of the Serial collector.
Except for the use of multithreading, the rest of the behavior is exactly the same as the Serial collector (parameter control, collection algorithm, Stop The World, object allocation rules, recycling strategies, etc.).
Features : Multi-threading, the number of collection threads enabled by the ParNew collector by default is the same as the number of CPUs. In an environment with a lot of CPUs, you can use the -XX:ParallelGCThreads parameter to limit the number of threads for garbage collection.
   Same as the Serial collector, there is a Stop The World problem.
Application scenarios: the ParNew collector is the preferred young-generation collector for many virtual machines running in Server mode, because, apart from the Serial collector, it is currently the only one that can work together with the CMS collector.
The operation diagram of the ParNew/Serial Old combination collector is as follows:
[Image: images/parnew.png]
3. The Parallel Scavenge collector is closely tied to throughput, so it is also called the throughput-first collector.
Features : The new generation collector is also a collector using the copy algorithm, and it is also a parallel multi-threaded collector (similar to the ParNew collector).
The goal of the collector is to achieve a manageable throughput. Another point worthy of attention is: GC adaptive adjustment strategy (the most important difference from the ParNew collector)
GC adaptive adjustment strategy: the Parallel Scavenge collector can set the -XX:+UseAdaptiveSizePolicy parameter. When this switch is on, there is no need to manually specify detail parameters such as the size of the young generation (-Xmn), the ratio of Eden to the Survivor areas (-XX:SurvivorRatio), or the size threshold for objects allocated directly in the old generation (-XX:PretenureSizeThreshold); the virtual machine collects performance monitoring information according to the running state of the system and dynamically sets these parameters to provide the best pause time and highest throughput. This adjustment mode is called GC's adaptive adjustment strategy.
The Parallel Scavenge collector controls throughput using two parameters:

  • -XX:MaxGCPauseMillis controls the maximum garbage collection pause time
  • -XX:GCTimeRatio directly sets the throughput target

4.Serial Old is the old version of the Serial collector.
Features : It is also a single-threaded collector, using a mark-sort algorithm.
Application scenario : It is mainly used in the virtual machine in Client mode. Can also be used in Server mode.
There are two main uses in Server mode (explain in detail in the follow-up...):

  1. Used in conjunction with the Parallel Scavenge collector in JDK1.5 and earlier versions.
  2. As a backup plan for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection.

Serial / Serial Old collector working process diagram (Serial collector icon is the same):
[Image: images/serial-old.png]
5.Parallel Old is the old version of the Parallel Scavenge collector.
Features : multi-threaded, using mark-sort algorithm.
Application Scenarios : Parallel Scavenge+Parallel Old collectors can be given priority in occasions that focus on high throughput and CPU resource sensitivity.
Parallel Scavenge/Parallel Old collector working process diagram:
6. The CMS collector is a collector whose goal is to obtain the shortest recovery pause time.
Features : Implemented based on the mark-sweep algorithm. Concurrent collection, low pause.
Application Scenario : Applicable to the scenarios where the response speed of the service is emphasized, the system pause time is expected to be the shortest, and a better experience is brought to the user. Such as web programs, b/s services.
The operation process of the CMS collector is divided into the following 4 steps:
Initial mark : mark the objects that GC Roots can directly reach. It's fast but still has the Stop The World problem.
Concurrent marking : the process of GC Roots Tracing to find surviving objects and user threads can execute concurrently.
Re-marking : In order to correct the mark record of the part of the object whose mark changes due to the continued operation of the user program during the concurrent mark. There is still the Stop The World problem.
Concurrent cleanup : Clear and recycle marked objects.
The memory recovery process of the CMS collector is executed concurrently with user threads.
The working process diagram of the CMS collector:
[Image: images/cms.png]
Shortcomings of the CMS collector:

  • Very sensitive to CPU resources.
  • Unable to handle floating garbage; a Concurrent Mode Failure may occur and cause another Full GC.
  • Because the mark-sweep algorithm is used, there are space fragmentation problems, which can cause large objects to fail to find enough space and force an early Full GC. [Image: images/cms2.png]

7. The G1 collector is a garbage collector for server-side applications.
The features are as follows:
Parallel and concurrent: G1 can make full use of the hardware advantages in a multi-CPU and multi-core environment, and use multiple CPUs to shorten the Stop-The-World pause time. Some collectors originally need to stop the Java thread to perform GC actions, and the G1 collector can still allow the Java program to continue running in a concurrent manner.
Generational collection: G1 can independently manage the entire Java heap, and use different methods to deal with newly created objects and old objects that have survived for a period of time and have survived multiple GCs to obtain better collection results.
Space integration: G1 will not generate space fragments during operation, and can provide regular available memory after collection.
Predictable pause: In addition to pursuing low pause, G1 can also build a predictable pause time model. Allows the user to explicitly specify that within a time period of M milliseconds, the time spent on garbage collection shall not exceed N milliseconds.
Schematic diagram of the operation of the G1 collector:
[Image: images/g1.png]

Regarding the choice of gc
Unless the application has very strict pause time requirements, please run the application first and allow the VM to select the collector (if there is no special requirement. Just use the default GC provided by the VM).
If necessary, adjust the heap size to improve performance. If performance still does not meet goals, use the following guidelines as a starting point for selecting a collector:

  • If the data set of the application is small (up to about 100 MB), select the serial collector with option -XX:+UseSerialGC.
  • If the application will run on a single processor and has no pause time requirements, select the serial collector with option -XX:+UseSerialGC.
  • If (a) peak application performance is the first priority, and (b) there are no pause time requirements or pauses of a second or more are acceptable, then let the VM choose the collector or use -XX:+UseParallelGC to choose the parallel collector .
  • Choose -XX:+UseG1GC if response time is more important than overall throughput and garbage collection pauses must be kept within about a second. (It is worth noting that CMS was deprecated in JDK 9 and removed in JDK 14.)
  • If jdk8 is used and the heap memory reaches 16G, it is recommended to use the G1 collector to control the time of each garbage collection.
  • If response time is a high priority, or the heap in use is very large, select the fully concurrent collector with -XX:+UseZGC. (It is worth noting that ZGC can be enabled in JDK 11 but is experimental there; in JDK 15 [released in September 2020] the experimental label was removed and it can be enabled directly, although the default GC of JDK 15 is still G1.)

These guidelines provide only a starting point for selecting a collector, since performance depends on the size of the heap, the amount of real-time data maintained by the application, and the number and speed of available processors.
If the recommended collectors do not achieve the desired performance, first try adjusting the heap and young-generation sizes to reach the goals; if performance is still insufficient, try another collector.
General principles: reduce Stop-The-World time by using concurrent collectors (such as CMS+ParNew or G1) to shorten pauses and speed up response time, and use parallel collectors to increase overall throughput on multiprocessor hardware.

Why does JVM8 increase the metaspace?

Reasons:
1. The string exists in the permanent generation, which is prone to performance problems and memory overflow.
2. It is difficult to determine the size of the class and method information, so it is difficult to specify the size of the permanent generation. If it is too small, permanent generation overflow will easily occur, and if it is too large, it will easily cause old generation overflow.
3. The permanent generation will bring unnecessary complexity to the GC, and the recycling efficiency is low.

What are the characteristics of metaspace in JVM8?

1. Each loader has a dedicated storage space.
2. A certain class will not be recycled separately.
3. The position of the object in the metaspace is fixed.
4. If a loader is found to be no longer alive, the entire associated space is reclaimed

How to solve the problem of frequent online gc?

  1. Check the monitoring to know when the problem occurred and the current frequency of FGC (compared with the normal situation to see if the frequency is normal)
  2. Find out whether there are programs online, basic component upgrades, etc. before this point in time.
  3. Understand the parameter settings of the JVM, including: the size settings of each area of ​​the heap space, which garbage collectors are used in the new generation and the old generation, and then analyze whether the JVM parameter settings are reasonable.
  4. Then rule out the possible causes one by one; among them, a full metaspace, memory leaks and explicit calls to the gc method in code are relatively easy to check.
  5. For the FGC caused by large objects or objects with a long life cycle, you can use the jmap -histo command and combine the dump heap memory file for further analysis, and you need to locate the suspicious object first.
  6. Locate the suspicious object to the specific code and analyze it again. At this time, it is necessary to combine the GC principle and JVM parameter settings to find out whether the suspicious object meets the conditions for entering the old age before drawing a conclusion.

What are the causes of memory overflow, and how to troubleshoot online problems?

  1. java.lang.OutOfMemoryError: ...java heap space... heap memory overflow; very likely a code problem
  2. java.lang.OutOfMemoryError: GC overhead limit exceeded is reported when the system is in a high-frequency GC state and the recovery is still poor. In this case there are usually many objects that cannot be released, possibly because of improper use of references or allocation of large objects. Note that a java heap space overflow may occur directly from insufficient memory without this error being reported first, i.e. without high-frequency GC.
  3. java.lang.OutOfMemoryError: PermGen space is a problem of JDK 1.7 and earlier. The cause is that the system has a great deal of code or references many third-party packages, or uses many constants in the code, or injects constants into the constant pool via intern(), or loads code dynamically, causing the permanent generation / constant pool to expand
  4. java.lang.OutOfMemoryError: Direct buffer memory means the direct memory is insufficient. Because JVM garbage collection does not reclaim direct memory directly, a possible cause is that ByteBuffer.allocateDirect is used directly or indirectly but the memory is never cleared/released
  5. java.lang.StackOverflowError - Xss set too small
  6. java.lang.OutOfMemoryError: unable to create new native thread Insufficient off-heap memory, unable to allocate memory area for thread
  7. java.lang.OutOfMemoryError: request {} bytes for {}. Out of swap space: the address space or swap space is insufficient

What is the Happens-Before rule?

  1. Program order rules: every operation in a thread happens-before any subsequent operation in that thread.
  2. Monitor rules: Unlocking a lock happens-before subsequent locking of the lock.
  3. Volatile rule: A write to a volatile variable happens-before any subsequent read of a volatile variable.
  4. Transitivity: If A happens-before B, B happens-before C, then A happens-before C.
  5. Thread start rules: the start() method of the Thread object, happens-before any subsequent operation of this thread.
  6. Thread termination rule: any operation in a thread happens-before other threads detect that the thread has terminated. Other threads can detect termination via the return of Thread.join() or via Thread.isAlive() returning false.
  7. Thread interrupt operation: the call to the thread interrupt() method, happens-before the code of the interrupted thread detects the occurrence of an interrupt event, you can use the Thread.interrupted() method to detect whether the thread has been interrupted.
  8. Object finalization rule: the completion of an object's initialization (the end of its constructor) happens-before the start of the object's finalize() method.
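
A small sketch of the volatile rule combined with program order and transitivity (the field names are illustrative): once the reader observes ready == true, it is guaranteed to see data == 42.

```java
public class HappensBeforeDemo {
    private static int data = 0;
    private static volatile boolean ready = false;

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            data = 42;        // (1) ordinary write
            ready = true;     // (2) volatile write: happens-before any later volatile read of 'ready'
        });

        Thread reader = new Thread(() -> {
            while (!ready) {  // (3) volatile read
                Thread.yield();
            }
            // (4) by program order + the volatile rule + transitivity, this prints 42
            System.out.println(data);
        });

        writer.start();
        reader.start();
    }
}
```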

Introduce the life cycle and status of threads?

[Image: images/life.jpg]
1. New
After a thread is created with the new keyword, it is in the newly created (initial) state. At this point it is just like any other Java object: the Java virtual machine has only allocated memory for it and initialized its member variables; the thread object does not yet show any dynamic characteristics of a thread, and the program does not execute the thread's execution body.
2. Ready
When the thread object calls the Thread.start() method, the thread is in the ready state. The Java virtual machine creates a method call stack and a program counter for it. The thread in this state does not start running, it just means that the thread can run. It can be seen from the source code of start() that it is added to the thread list after start, and then added to the VM in the native layer. As for when the thread starts running, it depends on the scheduling of the thread scheduler in the JVM (if OS scheduling is selected, will enter the running state).
3. Running
A thread in the ready state enters the running state once the OS thread scheduler selects it, and it then executes its run() method.
4. Blocking
The blocking state is that the thread gives up the right to use the CPU for some reason and temporarily stops running. Until the thread enters the ready state, it has the opportunity to go to the running state. There are roughly three types of blocking:

  • 1. Waiting for blocking : The running thread executes the wait() method, and the JVM will put the thread into the waiting pool. (wait will release the lock held)
  • 2. Synchronous blocking : When the running thread acquires the synchronization lock of the object, if the synchronization lock is occupied by other threads, the JVM will put the thread into the lock pool.
  • 3. Other blocking : When a running thread executes the sleep() or join() method, or sends an I/O request, the JVM will put the thread in a blocked state. When the sleep () state times out, the join () waits for the thread to terminate or time out, or when the I/O processing is completed, the thread is transferred to the ready state again. (Note that sleep will not release the held lock).
  • Thread sleep: The Thread.sleep(long millis) method makes the thread go to the blocked state. The millis parameter sets the time to sleep, in milliseconds. When the sleep is over, it turns to the ready (Runnable) state. sleep() has good platform portability.
  • Thread waiting: The wait() method in the Object class causes the current thread to wait until other threads call the notify() method or notifyAll() wake-up method of this object. These two wake-up methods are also methods in the Object class, and their behavior is equivalent to calling wait(0). After waking up the thread, it becomes ready (Runnable) state.
  • Thread yield: The Thread.yield() method suspends the currently executing thread object and gives the execution opportunity to a thread with the same or higher priority.
  • Thread join: join() method, wait for other threads to terminate. If the join() method of another thread is called in the current thread, the current thread will turn into a blocked state until the other process finishes running, and then the current thread will turn from blocked to ready state.
  • Thread I/O: the thread performs some I/O operation and enters a blocked state while waiting for the related resource; for example, if it reads from System.in but no keyboard input has arrived, it blocks.
  • Thread wakeup: The notify() method in the Object class wakes up a single thread waiting on the monitor of this object. If all threads are waiting on this object, one of them will be chosen to wake up, the choice is arbitrary and happens at implementation time. A similar method also has a notifyAll(), which wakes up all threads waiting on this object monitor.

5. Death
The thread will end in one of the following three ways, and will be in a dead state after the end:

  • The run() method is executed and the thread ends normally.
  • The thread throws an uncaught Exception or Error.
  • Call the thread's stop() method directly to end the thread - this method is prone to deadlock and is generally not recommended

How to use sleep, wait, join and yield of threads?

sleep : Let the thread sleep, during which the cpu will be released. In the synchronization code block, the lock will not be released.
wait (the corresponding lock must be obtained before calling): makes the thread enter the waiting state and releases the lock held by the current thread; the thread is woken up when notify or notifyAll is called on the same object, and then competes for the lock again.
join: coordination between threads. Usage scenario: thread A must wait for thread B to finish before it can proceed; in thread A's code you can then call threadB.join(); (see the sketch below).
yield : Returns the currently running thread to a runnable state to allow other threads of the same priority to get a chance to run. Therefore, the purpose of using yield() is to allow appropriate rotation between threads with the same priority. However, in practice, there is no guarantee that yield() will achieve the purpose of yielding, because the yielding thread may be selected again by the thread scheduler.
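
A small sketch of join for such coordination (thread names are illustrative): the main thread plays the role of thread A and waits for thread B to finish.

```java
public class JoinDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread threadB = new Thread(() -> System.out.println("B: doing some work"));

        threadB.start();
        threadB.join();   // the main (A) thread blocks here until B has finished
        System.out.println("A: continues only after B is done");
    }
}
```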

What are the ways to create threads?

1) Inherit the Thread class to create threads
2) Implement the Runnable interface to create threads
3) Use Callable and Future to create threads
4) Use a thread pool such as the Executor framework (a combined sketch of all four follows)
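
A compact sketch of the four approaches listed above (the task bodies are placeholders):

```java
import java.util.concurrent.*;

public class CreateThreadDemo {
    public static void main(String[] args) throws Exception {
        // 1) Subclass Thread
        new Thread() { public void run() { System.out.println("extends Thread"); } }.start();

        // 2) Implement Runnable
        new Thread(() -> System.out.println("implements Runnable")).start();

        // 3) Callable + FutureTask: the task can return a result and throw checked exceptions
        FutureTask<Integer> futureTask = new FutureTask<>(() -> 42);
        new Thread(futureTask).start();
        System.out.println("Callable result: " + futureTask.get());

        // 4) Executor framework (thread pool)
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<String> f = pool.submit(() -> "from pool");
        System.out.println(f.get());
        pool.shutdown();
    }
}
```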

What is a daemon thread?

In Java there are two kinds of threads: User Threads and Daemon Threads.
A daemon thread acts as a caretaker for all non-daemon threads in the JVM:
As long as any non-daemon thread in the current JVM instance has not finished, all daemon threads keep working; only when the last non-daemon thread ends do the daemon threads stop, together with the JVM. A daemon exists to provide supporting services for the other threads; the most typical daemon thread is the GC (garbage collector), which is a very competent caretaker.
There is almost no difference between user and daemon threads; the only difference is how the JVM exits: if all user threads have exited and only daemon threads remain, the JVM exits as well, because with nothing left to serve, the daemons have no work to do and there is no reason to keep the program running.
Notes (see the sketch after this list):
(1) thread.setDaemon(true) must be called before thread.start(), otherwise an IllegalThreadStateException is thrown; a thread can only be marked as a daemon before it starts running.
(2) New threads spawned inside a daemon thread are also daemons.
(3) Do not assume that any I/O or computation task can be handed to a daemon, because the JVM may exit while the daemon is half-way through, leaving data in an inconsistent state.
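
A minimal sketch of marking a thread as a daemon (the background flushing task is illustrative):

```java
public class DaemonDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread flusher = new Thread(() -> {
            while (true) {
                System.out.println("background flush...");
                try { Thread.sleep(200); } catch (InterruptedException e) { return; }
            }
        });
        flusher.setDaemon(true); // must be set BEFORE start(), otherwise IllegalThreadStateException
        flusher.start();

        Thread.sleep(600);       // main (the last user thread) does some work, then exits
        // When main ends, the JVM exits and the daemon thread is terminated with it.
    }
}
```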

What is the principle of ThreadLocal, and what are its usage scenarios?

The Thread class has two fields, threadLocals and inheritableThreadLocals, both of type ThreadLocal.ThreadLocalMap (an inner class of ThreadLocal). Looking at ThreadLocalMap, it is essentially similar to a HashMap. By default both fields are null in every thread:

ThreadLocal.ThreadLocalMap threadLocals = null;
ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;

They are only created the first time the thread calls ThreadLocal's set or get method.

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T) e.value;
            return result;
        }
    }
    return setInitialValue();
}

ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

In addition, a thread's local variables are not stored in the ThreadLocal instance but in the threadLocals field of the calling thread. In other words, a ThreadLocal-typed local variable lives in the space of a specific thread; ThreadLocal itself is only a carrier for accessing it. The set method adds the value to the calling thread's threadLocals, and when the thread later calls get it takes the value back out of its own threadLocals. If the calling thread never terminates, the local variable stays in its threadLocals forever, so when a local variable is no longer needed you should call the remove method to delete it from threadLocals and prevent a memory leak.

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        m.remove(this);
}

What memory leaks does ThreadLocal have and how to avoid them?

Each Thread has a ThreadLocal.ThreadLocalMap map. The key of the map is the ThreadLocal instance, which is a weak reference. We know that weak references are beneficial to GC recovery. When the key of ThreadLocal == null, the GC will reclaim this part of the space, but the value may not be reclaimed, because it still has a strong reference relationship with the Current Thread, as follows

(figure: images/threadlocal.png) Because of this strong reference chain, the value cannot be recycled. If the thread object itself is never destroyed, the strong reference persists and a memory leak occurs; so as long as the thread object can be collected by GC in time, there is no leak. With a thread pool the situation is worse, because pooled threads live for a long time. How do we avoid the problem? As mentioned above, set() and getEntry() in ThreadLocalMap clear the value when they find an entry whose key == null; and of course we can also explicitly call ThreadLocal's remove() method. A brief summary of ThreadLocal:

  • ThreadLocal is not used to solve the problem of shared variables, nor does it exist to coordinate thread synchronization, but is a mechanism introduced to facilitate each thread to process its own state. This is crucial.
  • Each Thread has a member variable of type ThreadLocal.ThreadLocalMap, which is used to store the actual copy of the ThreadLocal variable.
  • ThreadLocal does not save a copy of the object for the thread, it only acts as an index. Its main idea is that each thread isolates an instance of a class, and the scope of this instance is limited to the inside of the thread.

Why use a thread pool?

A thread pool lets us reduce the cost of repeatedly creating and destroying threads by reusing each thread for many tasks, and it lets us adjust the number of worker threads according to system conditions so that memory is not exhausted. A construction sketch follows.
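
A minimal sketch of building a pool with explicit parameters (the sizes and queue capacity here are illustrative, not recommendations):

```java
import java.util.concurrent.*;

public class PoolDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                               // corePoolSize
                4,                               // maximumPoolSize
                60, TimeUnit.SECONDS,            // keepAliveTime for threads above the core size
                new ArrayBlockingQueue<>(100),   // bounded work queue
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.CallerRunsPolicy()); // rejection policy

        for (int i = 0; i < 10; i++) {
            final int n = i;
            pool.execute(() -> System.out.println(Thread.currentThread().getName() + " runs task " + n));
        }
        pool.shutdown();
    }
}
```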

What is the principle of thread pool thread reuse?

Think about this question first: is a worker thread destroyed after its task finishes?
The answer: it is controlled by allowCoreThreadTimeOut (together with the pool parameters), as the source below shows.

// java/util/concurrent/ThreadPoolExecutor.java:1127
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            // ... run the task ...
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}
First, every thread in the pool is wrapped as a java.util.concurrent.ThreadPoolExecutor.Worker. The worker runs its first task and then keeps pulling tasks inside the while loop; as long as it gets a task it keeps executing, and when it cannot get one it breaks out of the loop (at that point the worker can no longer run tasks) and calls processWorkerExit. That method does the cleanup: it removes the current worker from the pool, checks whether the exit was abnormal, and if the exit was normal it looks at the current pool state (RUNNING, SHUTDOWN), the current worker count and the pending task count to decide whether a new thread must be added to finish the remaining tasks (so that no task is left without a thread to run it).
So when does the worker leave the while loop? When it cannot get a task, i.e. getTask() == null. Let's look at getTask:

private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // (rs == SHUTDOWN && workQueue.isEmpty()) || rs >= STOP
        // If the pool is SHUTDOWN and the task queue is empty, no worker is needed any more
        // and the pool is about to close; if the pool is STOP/TIDYING/TERMINATED, it no longer
        // processes tasks at all, so no thread is needed either.
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            // remove this worker from the pool
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);

        // allowCoreThreadTimeOut: when true, idle core threads can also be reclaimed
        // (default false; it should be set before the pool starts being used).
        // timed is true when this worker is allowed to time out while waiting for a task,
        // i.e. allowCoreThreadTimeOut is set or the worker count exceeds corePoolSize.
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        // (wc > maximumPoolSize || (timed && timedOut)): the worker count exceeds the maximum,
        //     or timeouts are allowed and this worker already timed out once;
        // (wc > 1 || workQueue.isEmpty()): at least one other worker remains,
        //     or there is no task left in the queue.
        // Together these decide whether this worker still needs to wait for a task:
        // if not, decrement the worker count and return null so the worker leaves its loop.
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }

        try {
            // Either poll with a timeout (the worker may be reclaimed) or block until a task arrives.
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
    
// In summary: if allowCoreThreadTimeOut is true and the first timed poll for a task fails, getTask returns null regardless of whether the thread is a core thread; runWorker then leaves its while loop and the worker thread is destroyed.

A conclusion can be drawn from the above: with reasonably configured thread pool parameters, a thread that has finished a task is not destroyed; it takes the next task out of the task queue and keeps executing.

How to prevent deadlock?

  1. First, the four necessary conditions for deadlock:
    1. Mutual exclusion: only one thread can hold a resource at a time.
    2. No preemption: a resource already held by a thread cannot be taken away by other threads until it is released.
    3. Hold and wait: a thread keeps the resources it already holds while waiting for others.
    4. Circular wait: several threads wait for each other to release resources.
  2. To prevent deadlock, break at least one of these four conditions:
    1. Mutual exclusion is an inherent property of how the resource is used and cannot be changed, so it is not discussed.
    2. Break the no-preemption condition
      1. When a process cannot obtain all the resources it needs, it enters a waiting state; the resources it already holds are implicitly released and returned to the system resource list so other processes can use them. The waiting process is only restarted once it can regain both its original resources and the newly requested ones.
  3. Break the hold-and-wait condition
    1. Static allocation: each process requests all the resources it needs before it starts executing.
    2. Dynamic allocation: a process does not hold on to system resources while requesting the further resources it needs.
  4. Break the circular-wait condition (a lock-ordering sketch follows this list)
    1. The basic idea of ordered resource allocation is to number all resources in the system and give larger numbers to the scarcer ones. Resources must be requested in increasing order of their numbers: a process that already holds resources with smaller numbers may only request resources with larger numbers, so a waiting cycle can never form.
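
A minimal sketch of breaking the circular-wait condition by always acquiring locks in one fixed global order (LOCK_A/LOCK_B and the two operations are illustrative):

```java
public class LockOrderingDemo {
    private static final Object LOCK_A = new Object(); // resource #1 (smaller number)
    private static final Object LOCK_B = new Object(); // resource #2 (larger number)

    // Both operations acquire LOCK_A before LOCK_B, so no circular wait can form.
    static void operation1() {
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                System.out.println("operation1 holds A then B");
            }
        }
    }

    static void operation2() {
        synchronized (LOCK_A) {          // same order as operation1, never B then A
            synchronized (LOCK_B) {
                System.out.println("operation2 holds A then B");
            }
        }
    }

    public static void main(String[] args) {
        new Thread(LockOrderingDemo::operation1).start();
        new Thread(LockOrderingDemo::operation2).start();
    }
}
```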

Describe the thread-safety liveness problems?

Thread-safety liveness problems can be divided into deadlock, livelock and starvation.

  1. Livelock means that although a thread is not blocked, it still cannot make progress: the thread keeps repeating the same operation and keeps failing and retrying.
    1. The asynchronous message queues we use in development can cause livelock: if the consumer does not ack a message correctly and execution fails, the message is put back at the head of the queue, taken out again, and fails again, over and over. Besides acking correctly, this is often solved by putting the failed message into a delay queue and retrying only after a certain delay.
    2. The general cure for livelock is simple: wait a random amount of time before retrying, and retry in time rounds.
  2. Starvation means a thread cannot continue executing because it cannot get the resources it needs.
    1. There are two kinds of starvation:
      1. Some thread loops forever or waits forever inside the critical section, so the remaining threads can never get the lock and enter the critical section; for those threads this is starvation.
      2. Thread priorities are assigned unreasonably, so some threads never get CPU time and never run.
    2. There are several ways to deal with starvation:
      1. Ensure sufficient resources; in many scenarios the scarcity of a resource simply cannot be fixed.
      2. Allocate resources fairly, e.g. use fair locks in concurrent code (a FIFO policy): threads wait in order and the thread at the head of the waiting queue gets the resource first.
      3. Avoid letting a lock-holding thread run for a long time; in many scenarios it is hard to shorten the time a lock is held.
  3. Deadlock: when threads compete for the same locks, a thread that fails to grab a lock waits for the holder to release it and then tries again. If two or more threads each hold a lock the other wants, they wait for each other to release first and enter a cycle of mutual waiting; this is deadlock.

What are the thread-safe race conditions?

  1. When the same program accesses the same resource from multiple threads and the result is sensitive to the order of access, we say there is a race condition, and the affected code region is a critical section. Like most concurrency bugs, race conditions do not always cause trouble; they need an unlucky execution timing.
  2. The most common race conditions are:
    1. Check-then-act: the action depends on the result of the check, the check result depends on the timing of several threads, and that timing is usually not fixed and cannot be predicted, so the outcome varies. One possible fix: while one thread modifies or reads the state, prevent other threads from accessing it, i.e. use a locking mechanism to guarantee atomicity.
    2. Lazy initialization (the typical example being a singleton).

How many threads are appropriate for the program?

  1. For CPU-intensive programs, a complete request involves little I/O; the CPU still has a lot of computing to do, i.e. CPU work dominates and the thread waiting time is close to 0.
    1. Single-core CPU: a single-core CPU handling a CPU-intensive program does not really benefit from multithreading.
    2. Multi-core: with a multi-core CPU we can use all the cores and apply concurrent programming to improve efficiency. The optimal thread count for a CPU-intensive program is in theory the number of (logical) CPU cores; in practice it is usually set to the number of (logical) cores + 1 (an empirical value). If a CPU-intensive thread happens to be suspended at some point because of a page fault or some other reason, the one "extra" thread makes sure CPU cycles are not wasted in the meantime.
  2. I/O-intensive programs are the opposite: after the CPU work of a request is done there is still a lot of I/O to do, i.e. I/O dominates and the waiting time is long. The larger the proportion of time a thread spends waiting, the more threads are needed; the larger the proportion of CPU time, the fewer threads are needed.
    1. The optimal thread count for an I/O-intensive program is: optimal threads = CPU cores * (1 / CPU utilization) = CPU cores * (1 + I/O time / CPU time).
    2. If almost all of the time is I/O, the CPU time approaches 0 and the formula tends to 2N (N = number of CPU cores); 2N + 1 is also used, with the extra 1 as a backup.
    3. In general, 2N + 1 is good enough. (A small sketch of the formula follows this list.)
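
A tiny sketch that turns the formula above into code (the measured I/O and CPU times are illustrative inputs):

```java
public class ThreadCountDemo {
    // optimal threads = cores * (1 + ioTime / cpuTime) for I/O-intensive work
    static int ioIntensiveThreads(int cores, double ioTimeMs, double cpuTimeMs) {
        return (int) Math.round(cores * (1 + ioTimeMs / cpuTimeMs));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-intensive suggestion: " + (cores + 1));
        // e.g. 90 ms waiting on I/O for every 10 ms of CPU work
        System.out.println("I/O-intensive suggestion: " + ioIntensiveThreads(cores, 90, 10));
    }
}
```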

What is the difference between synchronized and lock?

| Difference | synchronized | Lock |
| --- | --- | --- |
| Level | A Java keyword, implemented at the JVM level | An interface at the API (JDK) level |
| Lock acquisition | If thread A holds the lock and is blocked, a waiting thread B waits forever | Several ways to acquire; a thread does not have to wait forever (e.g. tryLock tells you whether the lock can be taken) |
| Lock release | 1. Released when the thread that acquired it finishes the synchronized block; 2. released by the JVM if the thread throws an exception | Must be released explicitly in finally, otherwise it is easy to cause deadlock |
| Lock type | Reentrant, not interruptible, non-fair | Reentrant, interruptible/queryable, can be fair or non-fair |
| Performance | Fine for a small amount of synchronization | Better for a large amount of synchronization |
| Supported scenarios | Exclusive lock only | Fair and non-fair locks, etc. |
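
A minimal sketch of the Lock side of the comparison: tryLock with a timeout and release in finally (the shared counter is illustrative):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    private static final ReentrantLock LOCK = new ReentrantLock(); // pass true for a fair lock
    private static int counter = 0;

    static void increment() throws InterruptedException {
        // unlike synchronized, the caller does not have to wait forever
        if (LOCK.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                counter++;
            } finally {
                LOCK.unlock(); // always release in finally
            }
        } else {
            System.out.println(Thread.currentThread().getName() + " gave up waiting");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        increment();
        System.out.println("counter = " + counter);
    }
}
```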

Have you ever encountered an ABA problem? Tell me in detail?

  1. Two threads, say thread 1 and thread 2, both want to update the same variable from A to B.
  2. Thread 1 gets the CPU time slice first, while thread 2 is blocked for some reason and waits. Thread 1 performs a compare-and-swap (CAS) and successfully updates the value from A to B.
  3. Right after that, a thread 3 comes in wanting to change the value from B back to A; its CAS succeeds and the value becomes A again.
  4. Thread 2 finally gets the CPU, performs its CAS, sees the expected value A and updates it to B. But thread 2 never noticed that the value has gone through A -> B -> A in between; this is the classic ABA problem. (A sketch of detecting it with a stamped reference follows.)
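
A minimal sketch showing how java.util.concurrent.atomic.AtomicStampedReference attaches a version stamp so an A -> B -> A history is detected:

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int initialStamp = ref.getStamp();

        // another thread does A -> B -> A, bumping the stamp each time
        ref.compareAndSet("A", "B", ref.getStamp(), ref.getStamp() + 1);
        ref.compareAndSet("B", "A", ref.getStamp(), ref.getStamp() + 1);

        // the slow thread still expects the original stamp, so this CAS fails
        boolean swapped = ref.compareAndSet("A", "C", initialStamp, initialStamp + 1);
        System.out.println("swapped = " + swapped + ", value = " + ref.getReference());
    }
}
```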

How does volatile implement visibility and prevent instruction reordering?

  • Visibility:
    A volatile variable is flushed to main memory immediately after it is modified and is re-read from main memory before every use; visibility is ultimately implemented with memory barriers.
    A Store Memory Barrier forces the processor to write the values currently sitting in its store buffer back to main memory; a Load Memory Barrier forces the processor to process its invalidate queue. Together they avoid the problems caused by the store buffer and the invalidate queue not being applied in real time.
  • Preventing instruction reordering (a double-checked-locking sketch follows this list):
    volatile forbids reordering by inserting memory barriers.
    The JMM barrier strategy is:
    • insert a StoreStore barrier before every volatile write;
    • insert a StoreLoad barrier after every volatile write;
    • insert a LoadLoad barrier after every volatile read;
    • insert a LoadStore barrier after every volatile read.
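
A classic illustration of why the write barrier matters, assuming the usual double-checked-locking singleton pattern: without volatile, the reference could be published before the constructor finishes because of reordering.

```java
public class Singleton {
    // volatile forbids reordering of "allocate -> construct -> publish reference"
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                 // first check, no lock
            synchronized (Singleton.class) {
                if (instance == null) {         // second check, under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```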

What is the underlying principle of ConcurrentHashMap?

1.7
Data structure:
Internally there is a Segment array, and each slot of that array is itself a HashEntry array; the elements live in the HashEntry arrays. Each lock is taken on a Segment object, i.e. on a whole HashEntry array, which is why this is called segmented locking.
(figure: images/1.7ConcurrentHashMap.png)
1.8
Data structure:
The same as HashMap: array + linked list + red-black tree.
(figure: images/ConCurrentHashMap.png)
The underlying idea is to lock the head node of a bucket (a linked list or a red-black tree). Compared with HashTable's method-level lock this is much finer grained: locking the head node at one index of the array (table) only affects the data at that index and does not block operations on other indexes, which improves read/write throughput.
putVal flow:

  1. Check whether the key or value to store is null; if so, throw an exception.
  2. Compute the hash of the key, then enter an infinite loop (the loop guarantees the insert eventually succeeds and terminates itself when the right condition is met); if the table is null or its length is 0, initialize the table.
  3. Use the hash to find the node at the corresponding table index; if that node is null, create a new node from the arguments, install it with CAS, and leave the loop.
  4. If the node's hash is MOVED (-1), the table is being resized, so help transfer that node.
  5. Otherwise lock the node in the table, i.e. the head of the bucket. If its hash is >= 0 the bucket is a linked list: traverse it looking for an element whose hash and key equal the key being put and replace its value; if the end of the list is reached without a match, append a new node to the tail and leave the loop.
  6. If the node's hash is < 0 and the node is a TreeBin, use the red-black tree insertion path instead.
  7. Finally, check whether the bucket has reached the treeify threshold; if so, convert the linked list into a red-black tree.

What are the common distributed ID generation schemes?

UUID, database auto-increment primary keys, Redis auto-increment, and the snowflake algorithm.

| Scheme | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| UUID | UUID (universally unique identifier) gives every element in a distributed system a unique identity without a central coordinator. | 1. No pressure on a global node, so key generation is fast; 2. generated keys are globally unique; 3. easy to merge data across servers. | 1. A UUID takes 16 bytes, so it uses more space; 2. keys are not increasing or ordered, so write I/O is very random and index efficiency drops. |
| Database auto-increment | A MySQL primary key with auto increment. | 1. INT/BIGINT take little space; 2. keys grow monotonically, so write I/O is sequential; 3. numeric lookups are faster than strings. | 1. Limited concurrency, bounded by database performance; 2. sharding requires extra, complicated rework; 3. auto-increment leaks data and data volume. |
| Redis auto-increment | A Redis counter, incremented atomically. | In-memory, good concurrency. | 1. Possible data loss; 2. auto-increment leaks data volume. |
| Snowflake | The well-known snowflake algorithm, a classic distributed-ID solution. | 1. No dependency on external components; 2. good performance. | Clock rollback. |

Which parts make up an ID generated by the snowflake algorithm?

  1. Sign bit: 1 bit.
  2. Timestamp: 41 bits, enough for a span of about 69 years.
  3. Machine ID: 10 bits.
  4. Sequence number: 12 bits, allowing 4095 IDs per millisecond. (A bit-composition sketch follows.)

(figure: images/image-20210521124236027.png)
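
A minimal sketch of composing the 64-bit ID from the parts above (the epoch and worker id are illustrative, and the clock-rollback handling a real generator needs is omitted):

```java
public class SnowflakeSketch {
    private static final long EPOCH = 1609459200000L; // assumed custom epoch: 2021-01-01
    private final long workerId;                      // 10 bits: 0..1023
    private long lastTimestamp = -1L;
    private long sequence = 0L;                       // 12 bits: 0..4095

    public SnowflakeSketch(long workerId) { this.workerId = workerId; }

    public synchronized long nextId() {
        long ts = System.currentTimeMillis();
        if (ts == lastTimestamp) {
            sequence = (sequence + 1) & 0xFFF;        // wrap within 12 bits
            if (sequence == 0) {                      // sequence exhausted, wait for next millisecond
                while ((ts = System.currentTimeMillis()) <= lastTimestamp) { }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = ts;
        // 41 bits timestamp | 10 bits worker id | 12 bits sequence (sign bit stays 0)
        return ((ts - EPOCH) << 22) | (workerId << 12) | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch gen = new SnowflakeSketch(1);
        System.out.println(gen.nextId());
        System.out.println(gen.nextId());
    }
}
```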

分布式锁在项目中有哪些应用场景?

使用分布式锁的场景一般需要满足以下场景:

  1. 系统是一个分布式系统,集群集群,java的锁已经锁不住了。
  2. 操作共享资源,比如库里唯一的用户数据。
  3. 同步访问,即多个进程同时操作共享资源。

分布锁有哪些解决方案?

  1. Reids的分布式锁,很多大公司会基于Reidis做扩展开发。setnx key value ex 10s,Redisson。

    watch dog.

  2. 基于Zookeeper。临时节点,顺序节点。

  3. 基于数据库,比如Mysql。主键或唯一索引的唯一性。

Redis做分布式锁用什么命令?

SETNX
格式:setnx key value 将 key 的值设为 value ,当且仅当 key 不存在。
若给定的 key 已经存在,则 SETNX 不做任何动作,操作失败。

SETNX 是『SET if Not eXists』(如果不存在,则 SET)的简写。

加锁:set key value nx ex 10s

释放锁:delete key

Redis做分布式锁死锁有哪些情况,如何解决?

情况1:加锁,没有释放锁。需要加释放锁的操作。比如delete key。

情况2:加锁后,程序还没有执行释放锁,程序挂了。需要用的key的过期机制。

Redis如何做分布式锁?

假设有两个服务A、B都希望获得锁,执行过程大致如下:

Step1: 服务A为了获得锁,向Redis发起如下命令: SET productId:lock 0xx9p03001 NX EX 30000 其中,"productId"由自己定义,可以是与本次业务有关的id,"0xx9p03001"是一串随机值,必须保证全局唯一,“NX"指的是当且仅当key(也就是案例中的"productId:lock”)在Redis中不存在时,返回执行成功,否则执行失败。"EX 30000"指的是在30秒后,key将被自动删除。执行命令后返回成功,表明服务成功的获得了锁。

Step2: 服务B为了获得锁,向Redis发起同样的命令: SET productId:lock 0000111 NX EX 30000
由于Redis内已经存在同名key,且并未过期,因此命令执行失败,服务B未能获得锁。服务B进入循环请求状态,比如每隔1秒钟(自行设置)向Redis发送请求,直到执行成功并获得锁。

Step3: 服务A的业务代码执行时长超过了30秒,导致key超时,因此Redis自动删除了key。此时服务B再次发送命令执行成功,假设本次请求中设置的value值为0000222。此时需要在服务A中对key进行续期,watch dog。

Step4: 服务A执行完毕,为了释放锁,服务A会主动向Redis发起删除key的请求。注意: 在删除key之前,一定要判断服务A持有的value与Redis内存储的value是否一致。比如当前场景下,Redis中的锁早就不是服务A持有的那一把了,而是由服务2创建,如果贸然使用服务A持有的key来删除锁,则会误将服务2的锁释放掉。此外,由于删除锁时涉及到一系列判断逻辑,因此一般使用lua脚本,具体如下:

if redis.call("get", KEYS[1])==ARGV[1] then
	return redis.call("del", KEYS[1])
else
	return 0
end
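
A minimal Java sketch of the acquire/release flow described above, assuming the Jedis 3.x client (host, key name and expiry are illustrative); the release reuses the Lua script so that check-and-delete is atomic:

```java
import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLockSketch {
    private static final String RELEASE_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String key = "productId:lock";
            String token = UUID.randomUUID().toString(); // must be globally unique

            // SET key token NX EX 30 -> acquire only if the key does not exist, auto-expire in 30s
            String ok = jedis.set(key, token, SetParams.setParams().nx().ex(30));
            if ("OK".equals(ok)) {
                try {
                    System.out.println("lock acquired, doing business work...");
                } finally {
                    // only delete the lock if we still own it (value matches our token)
                    jedis.eval(RELEASE_SCRIPT, Collections.singletonList(key),
                               Collections.singletonList(token));
                }
            } else {
                System.out.println("lock is held by someone else, retry later");
            }
        }
    }
}
```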

基于 ZooKeeper 的分布式锁实现原理是什么?

顺序节点特性:

使用 ZooKeeper 的顺序节点特性,假如我们在/lock/目录下创建3个节点,ZK集群会按照发起创建的顺序来创建节点,节点分别为/lock/0000000001、/lock/0000000002、/lock/0000000003,最后一位数是依次递增的,节点名由zk来完成。

临时节点特性:

ZK中还有一种名为临时节点的节点,临时节点由某个客户端创建,当客户端与ZK集群断开连接,则该节点自动被删除。EPHEMERAL_SEQUENTIAL为临时顺序节点。

根据ZK中节点是否存在,可以作为分布式锁的锁状态,以此来实现一个分布式锁,下面是分布式锁的基本逻辑:

  1. 客户端1调用create()方法创建名为“/业务ID/lock-”的临时顺序节点。
  2. 客户端1调用getChildren(“业务ID”)方法来获取所有已经创建的子节点。
  3. 客户端获取到所有子节点path之后,如果发现自己在步骤1中创建的节点是所有节点中序号最小的,就是看自己创建的序列号是否排第一,如果是第一,那么就认为这个客户端1获得了锁,在它前面没有别的客户端拿到锁。
  4. 如果创建的节点不是所有节点中需要最小的,那么则监视比自己创建节点的序列号小的最大的节点,进入等待。直到下次监视的子节点变更的时候,再进行子节点的获取,判断是否获取锁。

ZooKeeper和Reids做分布式锁的区别?

Reids:

  1. Redis只保证最终一致性,副本间的数据复制是异步进行(Set是写,Get是读,Reids集群一般是读写分离架构,存在主从同步延迟情况),主从切换之后可能有部分数据没有复制过去可能会 「丢失锁」 情况,故强一致性要求的业务不推荐使用Reids,推荐使用zk。
  2. Redis集群各方法的响应时间均为最低。随着并发量和业务数量的提升其响应时间会有明显上升(公网集群影响因素偏大),但是极限qps可以达到最大且基本无异常

ZooKeeper:

  1. 使用ZooKeeper集群,锁原理是使用ZooKeeper的临时顺序节点,临时顺序节点的生命周期在Client与集群的Session结束时结束。因此如果某个Client节点存在网络问题,与ZooKeeper集群断开连接,Session超时同样会导致锁被错误的释放(导致被其他线程错误地持有),因此ZooKeeper也无法保证完全一致。
  2. ZK具有较好的稳定性;响应时间抖动很小,没有出现异常。但是随着并发量和业务数量的提升其响应时间和qps会明显下降。

总结:

  1. Zookeeper每次进行锁操作前都要创建若干节点,完成后要释放节点,会浪费很多时间;
  2. 而Redis只是简单的数据操作,没有这个问题。

MySQL如何做分布式锁?

在Mysql中创建一张表,设置一个 主键或者UNIQUE KEY 这个 KEY 就是要锁的 KEY(商品ID),所以同一个 KEY 在mysql表里只能插入一次了,这样对锁的竞争就交给了数据库,处理同一个 KEY 数据库保证了只有一个节点能插入成功,其他节点都会插入失败。

DB分布式锁的实现:通过主键id 或者 唯一索性 的唯一性进行加锁,说白了就是加锁的形式是向一张表中插入一条数据,该条数据的id就是一把分布式锁,例如当一次请求插入了一条id为1的数据,其他想要进行插入数据的并发请求必须等第一次请求执行完成后删除这条id为1的数据才能继续插入,实现了分布式锁的功能。

这样 lock 和 unlock 的思路就很简单了,伪代码:

def lock :
    exec sql: insert into locked_table (xxx) values (xxx)
    if result == true :
        return true
    else :
        return false

def unlock :
    exec sql: delete from lockedOrder where order_id='order_id'

What is the counter (fixed window) rate-limiting algorithm?

​ The counter algorithm accumulates the number of accesses within a fixed time window and triggers the rate-limiting policy once the configured threshold is reached; when the next window starts, the counter is reset to zero. The algorithm is very easy to implement both on a single machine and in a distributed setup: with Redis you can simply combine the atomic INCR command with a key expiration time.

(figure: images/4-6 计数器算法-1621753094321.jpg)

​ In the figure, the threshold for one minute is 100. Between 0:00 and 1:00 there are 60 requests; at 1:00 the counter is reset and counting starts from 0, so between 1:00 and 2:00 we can again handle at most 100 requests, and anything beyond 100 is rejected.

​ The algorithm has a boundary problem: suppose that between 0:00 and 1:00 all 60 requests arrive at 0:50, and between 1:00 and 2:00 another 60 requests arrive at 1:10. Neither one-minute window exceeds 100 requests, yet within the 20 seconds from 0:50 to 1:10 there are 120 requests, far more than the intended limit of 100 requests per minute. (A sketch follows.)
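
A minimal single-JVM sketch of the fixed-window counter (the limit and window length are illustrative; a distributed version would use Redis INCR plus EXPIRE instead):

```java
public class FixedWindowLimiter {
    private final int limit;          // max requests per window
    private final long windowMillis;  // window length
    private int count = 0;
    private long windowStart = System.currentTimeMillis();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) { // a new window starts: reset the counter
            windowStart = now;
            count = 0;
        }
        return ++count <= limit;
    }

    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(100, 60_000);
        System.out.println("allowed = " + limiter.tryAcquire());
    }
}
```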

滑动时间窗口算法是什么?

​ 为了解决计数器算法的临界值的问题,发明了滑动窗口算法。在TCP网络通信协议中,就采用滑动时间窗口算法来解决网络拥堵问题。

​ 滑动时间窗口是将计数器算法中的实际周期切分成多个小的时间窗口,分别在每个小的时间窗口中记录访问次数,然后根据时间将窗口往前滑动并删除过期的小时间窗口。最终只需要统计滑动窗口范围内的小时间窗口的总的请求数即可。

(figure: images/4-7 滑动窗口算法-1621753118270.jpg)

​ 在上图中,假设我们设置一分钟的请求阈值是100,我们将一分钟拆分成4个小时间窗口,这样,每个小的时间窗口只能处理25个请求,我们用虚线方框表示滑动时间窗口,当前窗口的大小是2,也就是在窗口内最多能处理50个请求。随着时间的推移,滑动窗口也随着时间往前移动,比如上图开始时,窗口是0:00到0:30的这个范围,过了15秒后,窗口是0:15到0:45的这个范围,窗口中的请求重新清零,这样就很好的解决了计数器算法的临界值问题。

​ 在滑动时间窗口算法中,我们的小窗口划分的越多,滑动窗口的滚动就越平滑,限流的统计就会越精确。

漏桶限流算法是什么?

​ 漏桶算法的原理就像它的名字一样,我们维持一个漏斗,它有恒定的流出速度,不管水流流入的速度有多快,漏斗出水的速度始终保持不变,类似于消息中间件,不管消息的生产者请求量有多大,消息的处理能力取决于消费者。

​ 漏桶的容量=漏桶的流出速度*可接受的等待时长。在这个容量范围内的请求可以排队等待系统的处理,超过这个容量的请求,才会被抛弃。

​ 在漏桶限流算法中,存在下面几种情况:

  1. 当请求速度大于漏桶的流出速度时,也就是请求量大于当前服务所能处理的最大极限值时,触发限流策略。

  2. 请求速度小于或等于漏桶的流出速度时,也就是服务的处理能力大于或等于请求量时,正常执行。

    漏桶算法有一个缺点:当系统在短时间内有突发的大流量时,漏桶算法处理不了。

What is the token bucket rate-limiting algorithm?

​ The token bucket algorithm adds a container of fixed size, the token bucket. The system puts tokens into the bucket at a constant rate. When a client request arrives it must first take a token from the bucket; only with a token is it allowed to access the system, and the bucket then holds one token fewer. When the bucket is full, newly generated tokens are discarded.

​ With the token bucket algorithm there are the following cases (a sketch follows this list):

  1. Requests arrive faster than tokens are generated: the tokens in the bucket are used up, and subsequent requests that cannot get a token are throttled.

  2. Requests arrive at the same rate as tokens are generated: the system is in a steady state.

  3. Requests arrive more slowly than tokens are generated: traffic is well below the system's capacity and requests are handled normally.

    Because the bucket can hold tokens, the algorithm can absorb a short burst of heavy traffic; this is one difference between the token bucket and the leaky bucket.
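
A minimal single-JVM sketch of a token bucket that refills lazily on each acquire (capacity and rate are illustrative; Guava's RateLimiter is a production-ready alternative):

```java
public class TokenBucketLimiter {
    private final long capacity;        // max tokens the bucket can hold
    private final double refillPerMs;   // tokens added per millisecond
    private double tokens;
    private long lastRefill = System.currentTimeMillis();

    public TokenBucketLimiter(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = refillPerSecond / 1000.0;
        this.tokens = capacity;          // start full, so short bursts can be absorbed
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs); // lazy refill
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;                    // no token available: throttle this request
    }

    public static void main(String[] args) {
        TokenBucketLimiter limiter = new TokenBucketLimiter(100, 50); // burst of 100, 50 req/s steady
        System.out.println("allowed = " + limiter.tryAcquire());
    }
}
```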

你设计微服务时遵循什么原则?

  1. 单一职责原则:让每个服务能独立,有界限的工作,每个服务只关注自己的业务。做到高内聚。
  2. 服务自治原则:每个服务要能做到独立开发、独立测试、独立构建、独立部署,独立运行。与其他服务进行解耦。
  3. 轻量级通信原则:让每个服务之间的调用是轻量级,并且能够跨平台、跨语言。比如采用RESTful风格,利用消息队列进行通信等。
  4. 粒度进化原则:对每个服务的粒度把控,其实没有统一的标准,这个得结合我们解决的具体业务问题。不要过度设计。服务的粒度随着业务和用户的发展而发展。

​ 总结一句话,软件是为业务服务的,好的系统不是设计出来的,而是进化出来的。

CAP定理是什么?

​ CAP定理,又叫布鲁尔定理。指的是:在一个分布式系统中,最多只能同时满足一致性(Consistency)、可用性(Availability)和分区容错性(Partition tolerance)这三项中的两项。

  • C:一致性(Consistency),数据在多个副本中保持一致,可以理解成两个用户访问两个系统A和B,当A系统数据有变化时,及时同步给B系统,让两个用户看到的数据是一致的。

  • A:可用性(Availability),系统对外提供服务必须一直处于可用状态,在任何故障下,客户端都能在合理时间内获得服务端非错误的响应。

  • P:分区容错性(Partition tolerance),在分布式系统中遇到任何网络分区故障,系统仍然能对外提供服务。网络分区,可以这样理解,在分布式系统中,不同的节点分布在不同的子网络中,有可能子网络中只有一个节点,在所有网络正常的情况下,由于某些原因导致这些子节点之间的网络出现故障,导致整个节点环境被切分成了不同的独立区域,这就是网络分区。

    我们来详细分析一下CAP,为什么只能满足两个。看下图所示:

    (figure: images/10-4 CAP演示-1617721637028.jpg)

    ​ 用户1和用户2分别访问系统A和系统B,系统A和系统B通过网络进行同步数据。理想情况是:用户1访问系统A对数据进行修改,将data1改成了data2,同时用户2访问系统B,拿到的是data2数据。

    ​ 但是实际中,由于分布式系统具有八大谬论:

    • 网络相当可靠

    • 延迟为零

    • 传输带宽是无限的

    • 网络相当安全

    • 拓扑结构不会改变

    • 必须要有一名管理员

    • 传输成本为零

    • 网络同质化

    我们知道,只要有网络调用,网络总是不可靠的。我们来一一分析。

    1. 当网络发生故障时,系统A和系统B没法进行数据同步,也就是我们不满足P,同时两个系统依然可以访问,那么此时其实相当于是单机系统,就不是分布式系统了,所以既然我们是分布式系统,P必须满足。
    2. 当P满足时,如果用户1通过系统A对数据进行了修改将data1改成了data2,也要让用户2通过系统B正确的拿到data2,那么此时是满足C,就必须等待网络将系统A和系统B的数据同步好,并且在同步期间,任何人不能访问系统B(让系统不可用),否则数据就不是一致的。此时满足的是CP。
    3. 当P满足时,如果用户1通过系统A对数据进行了修改将data1改成了data2,也要让系统B能继续提供服务,那么此时,只能接受系统A没有将data2同步给系统B(牺牲了一致性)。此时满足的就是AP。

​ 我们在前面学过的注册中心Eureka就是满足 的AP,它并不保证C。而Zookeeper是保证CP,它不保证A。在生产中,A和C的选择,没有正确的答案,是取决于自己的业务的。比如12306,是满足CP,因为买票必须满足数据的一致性,不然一个座位多卖了,对铁路运输都是不可以接受的。

BASE理论是什么?

由于CAP中一致性C和可用性A无法兼得,eBay的架构师,提出了BASE理论,它是通过牺牲数据的强一致性,来获得可用性。它由于如下3种特征:

  • Basically Available(基本可用):分布式系统在出现不可预知故障的时候,允许损失部分可用性,保证核心功能的可用。

  • Soft state(软状态):软状态也称为弱状态,和硬状态相对,是指允许系统中的数据存在中间状态,并认为该中间状态的存在不会影响系统的整体可用性,即允许系统在不同节点的数据副本之间进行数据同步的过程存在延时。、

  • Eventually consistent(最终一致性):最终一致性强调的是系统中所有的数据副本,在经过一段时间的同步后,最终能够达到一个一致的状态。因此,最终一致性的本质是需要系统保证最终数据能够达到一致,而不需要实时保证系统数据的强一致性。

​ BASE理论并没有要求数据的强一致性,而是允许数据在一定的时间段内是不一致的,但在最终某个状态会达到一致。在生产环境中,很多公司,会采用BASE理论来实现数据的一致,因为产品的可用性相比强一致性来说,更加重要。比如在电商平台中,当用户对一个订单发起支付时,往往会调用第三方支付平台,比如支付宝支付或者微信支付,调用第三方成功后,第三方并不能及时通知我方系统,在第三方没有通知我方系统的这段时间内,我们给用户的订单状态显示支付中,等到第三方回调之后,我们再将状态改成已支付。虽然订单状态在短期内存在不一致,但是用户却获得了更好的产品体验。

2PC提交协议是什么?

二阶段提交(Two-phaseCommit)是指,在计算机网络以及数据库领域内,为了使基于分布式系统架构下的所有节点在进行事务提交时保持一致性而设计的一种算法(Algorithm)。通常,二阶段提交也被称为是一种协议(Protocol))。在分布式系统中,每个节点虽然可以知晓自己的操作时成功或者失败,却无法知道其他节点的操作的成功或失败。当一个事务跨越多个节点时,为了保持事务的ACID特性,需要引入一个作为协调者的组件来统一掌控所有节点(称作参与者)的操作结果并最终指示这些节点是否要把操作结果进行真正的提交(比如将更新后的数据写入磁盘等等)。因此,二阶段提交的算法思路可以概括为:参与者将操作成败通知协调者,再由协调者根据所有参与者的反馈情报决定各参与者是否要提交操作还是中止操作。

所谓的两个阶段是指:第一阶段:**准备阶段(投票阶段)**和第二阶段:提交阶段(执行阶段)

准备阶段

事务协调者(事务管理器)给每个参与者(资源管理器)发送Prepare消息,每个参与者要么直接返回失败(如权限验证失败),要么在本地执行事务,写本地的redo和undo日志,但不提交,到达一种“万事俱备,只欠东风”的状态。

可以进一步将准备阶段分为以下三个步骤:

1)协调者节点向所有参与者节点询问是否可以执行提交操作(vote),并开始等待各参与者节点的响应。

2)参与者节点执行询问发起为止的所有事务操作,并将Undo信息和Redo信息写入日志。(注意:若成功这里其实每个参与者已经执行了事务操作)

3)各参与者节点响应协调者节点发起的询问。如果参与者节点的事务操作实际执行成功,则它返回一个”同意”消息;如果参与者节点的事务操作实际执行失败,则它返回一个”中止”消息。

提交阶段

如果协调者收到了参与者的失败消息或者超时,直接给每个参与者发送回滚(Rollback)消息;否则,发送提交(Commit)消息;参与者根据协调者的指令执行提交或者回滚操作,释放所有事务处理过程中使用的锁资源。(注意:必须在最后阶段释放锁资源)

接下来分两种情况分别讨论提交阶段的过程。

当协调者节点从所有参与者节点获得的相应消息都为”同意”时:

(figure: images/success.png)

1)协调者节点向所有参与者节点发出”正式提交(commit)”的请求。

2)参与者节点正式完成操作,并释放在整个事务期间内占用的资源。

3)参与者节点向协调者节点发送”完成”消息。

4)协调者节点受到所有参与者节点反馈的”完成”消息后,完成事务。

如果任一参与者节点在第一阶段返回的响应消息为”中止”,或者 协调者节点在第一阶段的询问超时之前无法获取所有参与者节点的响应消息时:

(figure: images/fail.png)

1)协调者节点向所有参与者节点发出”回滚操作(rollback)”的请求。

2)参与者节点利用之前写入的Undo信息执行回滚,并释放在整个事务期间内占用的资源。

3)参与者节点向协调者节点发送”回滚完成”消息。

4)协调者节点受到所有参与者节点反馈的”回滚完成”消息后,取消事务。

不管最后结果如何,第二阶段都会结束当前事务。

2PC提交协议有什么缺点?

  1. 同步阻塞问题。执行过程中,所有参与节点都是事务阻塞型的。当参与者占有公共资源时,其他第三方节点访问公共资源不得不处于阻塞状态。

  2. 单点故障。由于协调者的重要性,一旦协调者发生故障。参与者会一直阻塞下去。尤其在第二阶段,协调者发生故障,那么所有的参与者还都处于锁定事务资源的状态中,而无法继续完成事务操作。(如果是协调者挂掉,可以重新选举一个协调者,但是无法解决因为协调者宕机导致的参与者处于阻塞状态的问题)

  3. 数据不一致。在二阶段提交的阶段二中,当协调者向参与者发送commit请求之后,发生了局部网络异常或者在发送commit请求过程中协调者发生了故障,这回导致只有一部分参与者接受到了commit请求。而在这部分参与者接到commit请求之后就会执行commit操作。但是其他部分未接到commit请求的机器则无法执行事务提交。于是整个分布式系统便出现了数据部一致性的现象。

  4. 二阶段无法解决的问题:协调者再发出commit消息之后宕机,而唯一接收到这条消息的参与者同时也宕机了。那么即使协调者通过选举协议产生了新的协调者,这条事务的状态也是不确定的,没人知道事务是否被已经提交。

3PC提交协议是什么?

CanCommit阶段

3PC的CanCommit阶段其实和2PC的准备阶段很像。协调者向参与者发送commit请求,参与者如果可以提交就返回Yes响应,否则返回No响应。

1.事务询问 协调者向参与者发送CanCommit请求。询问是否可以执行事务提交操作。然后开始等待参与者的响应。

2.响应反馈 参与者接到CanCommit请求之后,正常情况下,如果其自身认为可以顺利执行事务,则返回Yes响应,并进入预备状态。否则反馈No

PreCommit阶段

协调者根据参与者的反应情况来决定是否可以进行事务的PreCommit操作。根据响应情况,有以下两种可能。

假如协调者从所有的参与者获得的反馈都是Yes响应,那么就会执行事务的预执行。

1.发送预提交请求 协调者向参与者发送PreCommit请求,并进入Prepared阶段。

2.事务预提交 参与者接收到PreCommit请求后,会执行事务操作,并将undo和redo信息记录到事务日志中。

3.响应反馈 如果参与者成功的执行了事务操作,则返回ACK响应,同时开始等待最终指令。

假如有任何一个参与者向协调者发送了No响应,或者等待超时之后,协调者都没有接到参与者的响应,那么就执行事务的中断。

1.发送中断请求 协调者向所有参与者发送abort请求。

2.中断事务 参与者收到来自协调者的abort请求之后(或超时之后,仍未收到协调者的请求),执行事务的中断。

pre阶段参与者没收到请求,rollback。

doCommit阶段

该阶段进行真正的事务提交,也可以分为以下两种情况。

执行提交

1.发送提交请求 协调接收到参与者发送的ACK响应,那么他将从预提交状态进入到提交状态。并向所有参与者发送doCommit请求。

2.事务提交 参与者接收到doCommit请求之后,执行正式的事务提交。并在完成事务提交之后释放所有事务资源。

3.响应反馈 事务提交完之后,向协调者发送Ack响应。

4.完成事务 协调者接收到所有参与者的ack响应之后,完成事务。

中断事务 协调者没有接收到参与者发送的ACK响应(可能是接受者发送的不是ACK响应,也可能响应超时),那么就会执行中断事务。

1.发送中断请求 协调者向所有参与者发送abort请求

2.事务回滚 参与者接收到abort请求之后,利用其在阶段二记录的undo信息来执行事务的回滚操作,并在完成回滚之后释放所有的事务资源。

3.反馈结果 参与者完成事务回滚之后,向协调者发送ACK消息

4.中断事务 协调者接收到参与者反馈的ACK消息之后,执行事务的中断。

2PC和3PC的区别是什么?

1.3pc比2pc多了一个can commit阶段,减少了不必要的资源浪费。因为2pc在第一阶段会占用资源,而3pc在这个阶段不占用资源,只是校验一下sql,如果不能执行,就直接返回,减少了资源占用。

2.引入超时机制。同时在协调者和参与者中都引入超时机制。

2pc:只有协调者有超时机制,超时后,发送回滚指令。

3pc:协调者和参与者都有超时机制。

协调者超时: can commit,pre commit中,如果收不到参与者的反馈,则协调者向参与者发送中断指令。
参与者超时: pre commit阶段,参与者进行中断; do commit阶段,参与者进行提交。

  • TCC解决方案是什么?

    ​ TCC(Try-Confirm-Cancel)是一种常用的分布式事务解决方案,它将一个事务拆分成三个步骤:

    • T(Try):业务检查阶段,这阶段主要进行业务校验和检查或者资源预留;也可能是直接进行业务操作。

    • C(Confirm):业务确认阶段,这阶段对Try阶段校验过的业务或者预留的资源进行确认。

    • C(Cancel):业务回滚阶段,这阶段和上面的C(Confirm)是互斥的,用于释放Try阶段预留的资源或者业务。

    (figures: TCC Try/Confirm/Cancel illustrations, images/image-20210521230854476.png ~ images/image-20210521230919795.png)

TCC空回滚是解决什么问题的?

​ 在没有调用TCC资源Try方法的情况下,调用了二阶段的Cancel方法。比如当Try请求由于网络延迟或故障等原因,没有执行,结果返回了异常,那么此时Cancel就不能正常执行,因为Try没有对数据进行修改,如果Cancel进行了对数据的修改,那就会导致数据不一致。
​ 解决思路是关键就是要识别出这个空回滚。思路很简单就是需要知道Try阶段是否执行,如果执行了,那就是正常回滚;如果没执行,那就是空回滚。建议TM在发起全局事务时生成全局事务记录,全局事务ID贯穿整个分布式事务调用链条。再额外增加一张分支事务记录表,其中有全局事务ID和分支事务ID,第一阶段Try方法里会插入一条记录,表示Try阶段执行了。Cancel接口里读取该记录,如果该记录存在,则正常回滚;如果该记录不存在,则是空回滚。

如何解决TCC幂等问题?

为了保证TCC二阶段提交重试机制不会引发数据不一致,要求TCC的二阶段Confirm和Cancel接口保证幂等,这样不会重复使用或者释放资源。如果幂等控制没有做好,很有可能导致数据不一致等严重问题。
解决思路在上述 分支事务记录中增加执行状态,每次执行前都查询该状态。

分布式锁。

如何解决TCC中悬挂问题?

悬挂就是对于一个分布式事务,其二阶段Cancel接口比Try接口先执行。
出现原因是在调用分支事务Try时,由于网络发生拥堵,造成了超时,TM就会通知RM回滚该分布式事务,可能回滚完成后,Try请求才到达参与者真正执行,而一个Try方法预留的业务资源,只有该分布式事务才能使用,该分布式事务第一阶段预留的业务资源就再也没有人能够处理了,对于这种情况,我们就称为悬挂,即业务资源预留后无法继续处理。
解决思路是如果二阶段执行完成,那一阶段就不能再继续执行。在执行一阶段事务时判断在该全局事务下,判断分支事务记录表中是否已经有二阶段事务记录,如果有则不执行Try。

可靠消息服务方案是什么?

​ 可靠消息最终一致性方案指的是:当事务的发起方(事务参与者,消息发送者)执行完本地事务后,同时发出一条消息,事务参与方(事务参与者,消息的消费者)一定能够接受消息并可以成功处理自己的事务。

​ 这里面强调两点:

  1. 可靠消息:发起方一定得把消息传递到消费者。
  2. 最终一致性:最终发起方的业务处理和消费方的业务处理得完成,达成最终一致。

(figure: images/image-20210522125830646.png)

最大努力通知方案的关键是什么?

  1. 有一定的消息重复通知机制。因为接收通知方(上图中的我方支付系统)可能没有接收到通知,此时要有一定的机制对消息重复通知。
  2. 消息校对机制。如果尽最大努力也没有通知到接收方,或者接收方消费消息后要再次消费,此时可由接收方主动向通知方查询消息信息来满足需求。

什么是分布式系统中的幂等?

幂等(idempotent、idempotence)是一个数学与计算机学概念,常见于抽象代数中。

在编程中,一个幂等操作的特点是其任意多次执行所产生的影响均与一次执行的影响相同。幂等函数,或幂等方法,是指可以使用相同参数重复执行,并能获得相同结果的函数。这些函数不会影响系统状态,也不用担心重复执行会对系统造成改变。

例如,“getUsername()和 setTrue()”函数就是一个幂等函数. 更复杂的操作幂等保证是利用唯一交易号(流水号)实现. 我的理解:幂等就是一个操作,不论执行多少次,产生的效果和返回的结果都是一样的。

操作:查询,set固定值。逻辑删除。set 固定值。

流程:分布式系统中,网络调用,重试机制。

幂等有哪些技术解决方案?

1.查询操作

查询一次和查询多次,在数据不变的情况下,查询结果是一样的。select 是天然的幂等操作;

2.删除操作

删除操作也是幂等的,删除一次和多次删除都是把数据删除。(注意可能返回结果不一样,删除的数据不存在,返回 0,删除的数据多条,返回结果多个。

3.唯一索引

防止新增脏数据。比如:支付宝的资金账户,支付宝也有用户账户,每个用户只能有一个资金账户,怎么防止给用户创建多个资金账户,那么给资金账户表中的用户 ID 加唯一索引,所以一个用户新增成功一个资金账户记录。要点:唯一索引或唯一组合索引来防止新增数据存在脏数据(当表存在唯一索引,并发时新增报错时,再查询一次就可以了,数据应该已经存在了,返回结果即可。

4.token 机制

防止页面重复提交。

**业务要求:**页面的数据只能被点击提交一次;

**发生原因:**由于重复点击或者网络重发,或者 nginx 重发等情况会导致数据被重复提交;

**解决办法:**集群环境采用 token 加 redis(redis 单线程的,处理需要排队);单 JVM 环境:采用 token 加 redis 或 token 加 jvm 锁。

处理流程:

  1. 数据提交前要向服务的申请 token,token 放到 redis 或 jvm 内存,token 有效时间;
  2. 提交后后台校验 token,同时删除 token,生成新的 token 返回。

**token 特点:**要申请,一次有效性,可以限流。

注意:redis 要用删除操作来判断 token,删除成功代表 token 校验通过。

  1. traceId

    操作时唯一的。

对外提供的API如何保证幂等?

举例说明: 银联提供的付款接口:需要接入商户提交付款请求时附带:source 来源,seq 序列号。

source+seq 在数据库里面做唯一索引,防止多次付款(并发时,只能处理一个请求) 。重点:对外提供接口为了支持幂等调用,接口有两个字段必须传,一个是来源 source,一个是来源方序列号 seq,这个两个字段在提供方系统里面做联合唯一索引,这样当第三方调用时,先在本方系统里面查询一下,是否已经处理过,返回相应处理结果;没有处理过,进行相应处理,返回结果。

注意,为了幂等友好,一定要先查询一下,是否处理过该笔业务,不查询直接插入业务系统,会报错,但实际已经处理。

双写一致性问题如何解决?

先做一个说明,从理论上来说,给缓存设置过期时间,是保证最终一致性的解决方案。这种方案下,我们可以对存入缓存的数据设置过期时间,所有的写操作以数据库为准,对缓存操作只是尽最大努力更新即可。也就是说如果数据库写成功,缓存更新失败,那么只要到达过期时间,则后面的读请求自然会从数据库中读取新值然后回填缓存。因此,接下来讨论的思路不依赖于给缓存设置过期时间这个方案。
在这里,我们讨论三种更新策略:

  1. 先更新缓存,再更新数据库。(不可取)
  2. 先更新数据库,再更新缓存。(不可取)
  3. 先删除缓存,再更新数据库。(不可取)
  4. 先更新数据库,再删除缓存。(可取,有问题待解决)

大前提:

先读缓存,如果缓存没有,才从数据库读取。

(1)先更新数据库,再更新缓存

这套方案,大家是普遍反对的。为什么呢?有如下两点原因。
原因一(线程安全角度)
同时有请求A和请求B进行更新操作,那么会出现
(1)线程A更新了数据库
(2)线程B更新了数据库
(3)线程B更新了缓存
(4)线程A更新了缓存
这就出现请求A更新缓存应该比请求B更新缓存早才对,但是因为网络等原因,B却比A更早更新了缓存。这就导致了脏数据,因此不考虑。
原因二(业务场景角度)
有如下两点:
(1)如果你是一个写数据库场景比较多,而读数据场景比较少的业务需求,采用这种方案就会导致,数据压根还没读到,缓存就被频繁的更新,浪费性能。
(2)如果你写入数据库的值,并不是直接写入缓存的,而是要经过一系列复杂的计算再写入缓存。那么,每次写入数据库后,都再次计算写入缓存的值,无疑是浪费性能的。显然,删除缓存更为适合。

接下来讨论的就是争议最大的,先删缓存,再更新数据库。还是先更新数据库,再删缓存的问题。

(2)先删缓存,再更新数据库

该方案会导致不一致的原因是。同时有一个请求A进行更新操作,另一个请求B进行查询操作。那么会出现如下情形:
(1)请求A进行写操作,删除缓存
(2)请求B查询发现缓存不存在
(3)请求B去数据库查询得到旧值
(4)请求B将旧值写入缓存
(5)请求A将新值写入数据库
上述情况就会导致不一致的情形出现。而且,如果不采用给缓存设置过期时间策略,该数据永远都是脏数据。
那么,如何解决呢?采用延时双删策略

(1)先淘汰缓存
(2)再写数据库(这两步和原来一样)
(3)休眠1秒,再次淘汰缓存
这么做,可以将1秒内所造成的缓存脏数据,再次删除。
那么,这个1秒怎么确定的,具体该休眠多久呢?
针对上面的情形,读者应该自行评估自己的项目的读数据业务逻辑的耗时。然后写数据的休眠时间则在读数据业务逻辑的耗时基础上,加几百ms即可。这么做的目的,就是确保读请求结束,写请求可以删除读请求造成的缓存脏数据。
如果你用了mysql的读写分离架构怎么办?
ok,在这种情况下,造成数据不一致的原因如下,还是两个请求,一个请求A进行更新操作,另一个请求B进行查询操作。
(1)请求A进行写操作,删除缓存
(2)请求A将数据写入数据库了,
(3)请求B查询缓存发现,缓存没有值
(4)请求B去从库查询,这时,还没有完成主从同步,因此查询到的是旧值
(5)请求B将旧值写入缓存
(6)数据库完成主从同步,从库变为新值
上述情形,就是数据不一致的原因。还是使用双删延时策略。只是,睡眠时间修改为在主从同步的延时时间基础上,加几百ms。
采用这种同步淘汰策略,吞吐量降低怎么办?
ok,那就将第二次删除作为异步的。自己起一个线程,异步删除。这样,写的请求就不用沉睡一段时间后了,再返回。这么做,加大吞吐量。
第二次删除,如果删除失败怎么办?
这是个非常好的问题,因为第二次删除失败,就会出现如下情形。还是有两个请求,一个请求A进行更新操作,另一个请求B进行查询操作,为了方便,假设是单库:
(1)请求A进行写操作,删除缓存
(2)请求B查询发现缓存不存在
(3)请求B去数据库查询得到旧值
(4)请求B将旧值写入缓存
(5)请求A将新值写入数据库
(6)请求A试图去删除,请求B写入对的缓存值,结果失败了。
ok,这也就是说。如果第二次删除缓存失败,会再次出现缓存和数据库不一致的问题。
如何解决呢?

(3)先更新数据库,再删缓存

首先,先说一下。老外提出了一个缓存更新套路,名为《Cache-Aside pattern》。其中就指出

  • 失效:应用程序先从cache取数据,没有得到,则从数据库中取数据,成功后,放到缓存中。
  • 命中:应用程序从cache中取数据,取到后返回。
  • 更新:先把数据存到数据库中,成功后,再让缓存失效。

另外,知名社交网站facebook也在论文《Scaling Memcache at Facebook》中提出,他们用的也是先更新数据库,再删缓存的策略。
这种情况不存在并发问题么?
不是的。假设这会有两个请求,一个请求A做查询操作,一个请求B做更新操作,那么会有如下情形产生
(1)缓存刚好失效
(2)请求A查询数据库,得一个旧值
(3)请求B将新值写入数据库
(4)请求B删除缓存
(5)请求A将查到的旧值写入缓存
ok,如果发生上述情况,确实是会发生脏数据。
然而,发生这种情况的概率又有多少呢?
发生上述情况有一个先天性条件,就是步骤(3)的写数据库操作比步骤(2)的读数据库操作耗时更短,才有可能使得步骤(4)先于步骤(5)。可是,大家想想,数据库的读操作的速度远快于写操作的(不然做读写分离干嘛,做读写分离的意义就是因为读操作比较快,耗资源少),因此步骤(3)耗时比步骤(2)更短,这一情形很难出现。
假设,有人非要抬杠,有强迫症,一定要解决怎么办?
如何解决上述并发问题?
首先,给缓存设有效时间是一种方案。其次,采用策略(2)里给出的异步延时删除策略,保证读请求完成以后,再进行删除操作。
还有其他造成不一致的原因么?
有的,这也是缓存更新策略(2)和缓存更新策略(3)都存在的一个问题,如果删缓存失败了怎么办,那不是会有不一致的情况出现么。比如一个写数据请求,然后写入数据库了,删缓存失败了,这会就出现不一致的情况了。这也是缓存更新策略(2)里留下的最后一个疑问。
如何解决?
提供一个保障的重试机制即可,这里给出两套方案。
方案一
如下图所示
(figure: images/o_update1.png)
流程如下所示
(1)更新数据库数据;
(2)缓存因为种种问题删除失败
(3)将需要删除的key发送至消息队列
(4)自己消费消息,获得需要删除的key
(5)继续重试删除操作,直到成功
然而,该方案有一个缺点,对业务线代码造成大量的侵入。于是有了方案二,在方案二中,启动一个订阅程序去订阅数据库的binlog,获得需要操作的数据。在应用程序中,另起一段程序,获得这个订阅程序传来的信息,进行删除缓存操作。
方案二
(figure: images/o_update2.png)
流程如下图所示:
(1)更新数据库数据
(2)数据库会将操作信息写入binlog日志当中
(3)订阅程序提取出所需要的数据以及key
(4)另起一段非业务代码,获得该信息
(5)尝试删除缓存操作,发现删除失败
(6)将这些信息发送至消息队列
(7)重新从消息队列中获得该数据,重试操作。

**备注说明:**上述的订阅binlog程序在mysql中有现成的中间件叫canal,可以完成订阅binlog日志的功能。至于oracle中,博主目前不知道有没有现成中间件可以使用。另外,重试机制,博主是采用的是消息队列的方式。如果对一致性要求不是很高,直接在程序中另起一个线程,每隔一段时间去重试即可,这些大家可以灵活自由发挥,只是提供一个思路。

分布式微服务项目你是如何设计的?

我一般设计成两层:业务层和能力层(中台),业务层接受用户请求,然后通过调用能力层来完成业务逻辑。

(figure: images/image-20210522172654370.png)

认证 (Authentication) 和授权 (Authorization)的区别是什么?

Authentication(认证) 是验证您的身份的凭据(例如用户名/用户ID和密码),通过这个凭据,系统得以知道你就是你,也就是说系统存在你这个用户。所以,Authentication 被称为身份/用户验证。
Authorization(授权) 发生在 Authentication(认证) 之后。授权,它主要掌管我们访问系统的权限。比如有些特定资源只能具有特定权限的人才能访问比如admin,有些对系统资源操作比如删除、添加、更新只能特定人才具有。
这两个一般在我们的系统中被结合在一起使用,目的就是为了保护我们系统的安全性。

Cookie 和 Session 有什么区别?如何使用Session进行身份验证?

Session 的主要作用就是通过服务端记录用户的状态。 典型的场景是购物车,当你要添加商品到购物车的时候,系统不知道是哪个用户操作的,因为 HTTP 协议是无状态的。服务端给特定的用户创建特定的 Session 之后就可以标识这个用户并且跟踪这个用户了。

Cookie 数据保存在客户端(浏览器端),Session 数据保存在服务器端。相对来说 Session 安全性更高。如果使用 Cookie 的一些敏感信息不要写入 Cookie 中,最好能将 Cookie 信息加密然后使用到的时候再去服务器端解密。

那么,如何使用Session进行身份验证?

很多时候我们都是通过 SessionID 来实现特定的用户,SessionID 一般会选择存放在 Redis 中。举个例子:用户成功登陆系统,然后返回给客户端具有 SessionID 的 Cookie,当用户向后端发起请求的时候会把 SessionID 带上,这样后端就知道你的身份状态了。关于这种认证方式更详细的过程如下:

(figure: images/image-20210520130119426.png)

用户向服务器发送用户名和密码用于登陆系统。
服务器验证通过后,服务器为用户创建一个 Session,并将 Session信息存储 起来。
服务器向用户返回一个 SessionID,写入用户的 Cookie。
当用户保持登录状态时,Cookie 将与每个后续请求一起被发送出去。
服务器可以将存储在 Cookie 上的 Session ID 与存储在内存中或者数据库中的 Session 信息进行比较,以验证用户的身份,返回给用户客户端响应信息的时候会附带用户当前的状态。
使用 Session 的时候需要注意下面几个点:

依赖Session的关键业务一定要确保客户端开启了Cookie。
注意Session的过期时间

为什么Cookie 无法防止CSRF攻击,而token可以?

**CSRF(Cross Site Request Forgery)**一般被翻译为 跨站请求伪造 。那么什么是 跨站请求伪造 呢?说简单用你的身份去发送一些对你不友好的请求。举个简单的例子:

小壮登录了某网上银行,他来到了网上银行的帖子区,看到一个帖子下面有一个链接写着“科学理财,年盈利率过万”,小壮好奇的点开了这个链接,结果发现自己的账户少了10000元。这是这么回事呢?原来黑客在链接中藏了一个请求,这个请求直接利用小壮的身份给银行发送了一个转账请求,也就是通过你的 Cookie 向银行发出请求。

<a src=http://www.mybank.com/Transfer?bankId=11&money=10000>科学理财,年盈利率过万</>
进行Session 认证的时候,我们一般使用 Cookie 来存储 SessionId,当我们登陆后后端生成一个SessionId放在Cookie中返回给客户端,服务端通过Redis或者其他存储工具记录保存着这个Sessionid,客户端登录以后每次请求都会带上这个SessionId,服务端通过这个SessionId来标示你这个人。如果别人通过 cookie拿到了 SessionId 后就可以代替你的身份访问系统了。

Session 认证中 Cookie 中的 SessionId是由浏览器发送到服务端的,借助这个特性,攻击者就可以通过让用户误点攻击链接,达到攻击效果。

但是,我们使用 token 的话就不会存在这个问题,在我们登录成功获得 token 之后,一般会选择存放在 local storage 中。然后我们在前端通过某些方式会给每个发到后端的请求加上这个 token,这样就不会出现 CSRF 漏洞的问题。因为,即使有个你点击了非法链接发送了请求到服务端,这个非法请求是不会携带 token 的,所以这个请求将是非法的。

什么是 Token?什么是 JWT?如何基于Token进行身份验证?

我们知道 Session 信息需要保存一份在服务器端。这种方式会带来一些麻烦,比如需要我们保证保存 Session 信息服务器的可用性、不适合移动端(依赖Cookie)等等。

有没有一种不需要自己存放 Session 信息就能实现身份验证的方式呢?使用 Token 即可!JWT (JSON Web Token) 就是这种方式的实现,通过这种方式服务器端就不需要保存 Session 数据了,只用在客户端保存服务端返回给客户的 Token 就可以了,扩展性得到提升。

JWT 本质上就一段签名的 JSON 格式的数据。由于它是带有签名的,因此接收者便可以验证它的真实性。

下面是 RFC 7519 对 JWT 做的较为正式的定义。

JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted. ——JSON Web Token (JWT)

JWT 由 3 部分构成:

Header :描述 JWT 的元数据。定义了生成签名的算法以及 Token 的类型。
Payload(负载):用来存放实际需要传递的数据
Signature(签名):服务器通过Payload、Header和一个密钥(secret)使用 Header 里面指定的签名算法(默认是 HMAC SHA256)生成。
在基于 Token 进行身份验证的的应用程序中,服务器通过Payload、Header和一个密钥(secret)创建令牌(Token)并将 Token 发送给客户端,客户端将 Token 保存在 Cookie 或者 localStorage 里面,以后客户端发出的所有请求都会携带这个令牌。你可以把它放在 Cookie 里面自动发送,但是这样不能跨域,所以更好的做法是放在 HTTP Header 的 Authorization字段中:Authorization: Bearer Token。

(figure: images/image-20210520130410868.png)

用户向服务器发送用户名和密码用于登陆系统。
身份验证服务响应并返回了签名的 JWT,上面包含了用户是谁的内容。
用户以后每次向后端发请求都在Header中带上 JWT。
服务端检查 JWT 并从中获取用户相关信息。

分布式架构下,Session 共享有什么方案?

  1. 不要有session:但是确实在某些场景下,是可以没有session的,其实在很多接口类系统当中,都提倡【API无状态服务】;也就是每一次的接口访问,都不依赖于session、不依赖于前一次的接口访问;
  2. 存入cookie中:将session存储到cookie中,但是缺点也很明显,例如每次请求都得带着session,数据存储在客户端本地,是有风险的;
  3. session同步:对个服务器之间同步session,这样可以保证每个服务器上都有全部的session信息,不过当服务器数量比较多的时候,同步是会有延迟甚至同步失败;
  4. 使用Nginx(或其他复杂均衡软硬件)中的ip绑定策略,同一个ip只能在指定的同一个机器访问,但是这样做风险也比较大,而且也是去了负载均衡的意义;
  5. 我们现在的系统会把session放到Redis中存储,虽然架构上变得复杂,并且需要多访问一次Redis,但是这种方案带来的好处也是很大的:实现session共享,可以水平扩展(增加Redis服务器),服务器重启session不丢失(不过也要注意session在Redis中的刷新/失效机制),不仅可以跨服务器session共享,甚至可以跨平台(例如网页端和APP端)。

springcloud核心组件有哪些?

服务注册与发现——Netflix Eureka、Nacos、Zookeeper

客户端负载均衡——Netflix Ribbon、SpringCloud LoadBalancer

服务熔断器——Netflix Hystrix、Alibaba Sentinel、Resilience4J

服务网关——Netflix Zuul、SpringCloud Gateway

服务接口调用——Netflix Feign、 Resttemplate、Openfeign

链路追踪——Netflix Sleuth、Skywalking、Pinpoint

聚合Hystrix监控数据——Netflix Turbine

监控中心---- SpringBoot Admin

配置中心——Spring Cloud Config 、Apollo、nacos

微服务架构原理是什么?

主要是面向SOA理念,更细小粒度服务的拆分,将功能分解到各个服务当中,从而降低系统的耦合性,并提供更加灵活的服务支持。

注册中心的原理是什么?

服务启动后向Eureka注册,Eureka Server会将注册信息向其他Eureka Server进行同步,当服务消费者要调用服务提供者,则向服务注册中心获取服务提供者地址,然后会将服务提供者地址缓存在本地,下次再调用时,则直接从本地缓存中取,完成一次调用

配置中心的原理是什么?

在服务运行之前,将所需的配置信息从配置仓库拉取到本地服务,达到统一化配置管理的目的

配置中心是如何实现自动刷新的?

  1. 配置中心Server端承担起配置刷新的职责

  2. 提交配置触发post请求给server端的bus/refresh接口

  3. server端接收到请求并发送给Spring Cloud Bus总线

  4. Spring Cloud bus接到消息并通知给其它连接到总线的客户端

  5. 其它客户端接收到通知,请求Server端获取最新配置

  6. 全部客户端均获取到最新的配置

配置中心是如何保证数据安全的?

1.保证容器文件访问的安全性,即保证所有的网络资源请求都需要登录

2.将配置中心里所有配置文件中的密码进行加密,保证其密文性

3.开发环境禁止拉取生产环境的配置文件

用zookeeper和eureka做注册中心有什么区别?

Zookeeper保证的是CP(一致性,容错性), 而Eureka则是AP(可用性,容错性)。

Spring Cloud和Dubbo有哪些区别?

  1. dubbo 是二进制传输,对象直接转成二进制,使用RPC通信。

SpringCloud是http 传输,同时使用http协议一般会使用JSON报文,json再转二进制,消耗会更大。

  1. Dubbo只是实现了服务治理,而Spring Cloud下面有几十个子项目分别覆盖了微服务架构下的方方面面,服务治理只是其中的一个方面,一定程度来说,Dubbo只是Spring Cloud Netflix中的一个子集。

Ribbon负载均衡原理是什么?

  1. Ribbon通过ILoadBalancer接口对外提供统一的选择服务器(Server)的功能,此接口会根据不同的负载均衡策略(IRule)选择合适的Server返回给使用者。

  2. IRule是负载均衡策略的抽象,ILoadBalancer通过调用IRule的choose()方法返回Server

  3. IPing用来检测Server是否可用,ILoadBalancer的实现类维护一个Timer每隔10s检测一次Server的可用状态

  4. IClientConfig主要定义了用于初始化各种客户端和负载均衡器的配置信息,器实现类为DefaultClientConfigImpl

微服务熔断降级机制是什么?

微服务框架是许多服务互相调用的,要是不做任何保护的话,某一个服务挂了,就会引起连锁反应,导致别的服务也挂。Hystrix 是隔离、熔断以及降级的一个框架。如果调用某服务报错(或者挂了),就对该服务熔断,在 5 分钟内请求此服务直接就返回一个默认值,不需要每次都卡几秒,这个过程,就是所谓的熔断。但是熔断了之后就会少调用一个服务,此时需要做下标记,标记本来需要做什么业务,但是因为服务挂了,暂时没有做,等该服务恢复了,就可以手工处理这些业务。这个过程,就是所谓的降级。

什么是Hystrix?实现原理是什么?

Hystrix是一个延迟和容错库,旨在隔离对远程系统、服务和第三方库的访问点,停止级联故障,并在 不可避免发生故障的复杂分布式系统中实现快速恢复。主要靠Spring的AOP实现

实现原理

正常情况下,断路器关闭,服务消费者正常请求微服务

一段事件内,失败率达到一定阈值,断路器将断开,此时不再请求服务提供者,而是只是快速失败的方法(断路方法)

断路器打开一段时间,自动进入“半开”状态,此时,断路器可允许一个请求方法服务提供者,如果请求调用成功,则关闭断路器,否则继续保持断路器打开状态。

断路器hystrix是保证了局部发生的错误,不会扩展到整个系统,从而保证系统的即使出现局部问题也不会造成系统雪崩

注册中心挂了,或者服务挂了,应该如何处理?

注册中心挂了,可以读取本地持久化里的配置

服务挂了 应该配有服务监控中心 感知到服务下线后可以通过配置的邮件通知相关人员排查问题。

说说你对RPC、RMI如何理解?

RPC 远程过程调用协议,通过网络从远程计算机上请求调用某种服务。

RMI:远程方法调用 能够让在客户端Java虚拟机上的对象像调用本地对象一样调用服务端java 虚拟机中的对象上的方法。

redis持久化机制:RDB和AOF

Redis 持久化

Redis 提供了不同级别的持久化方式:

  • RDB持久化方式能够在指定的时间间隔能对你的数据进行快照存储.

  • AOF持久化方式记录每次对服务器写的操作,当服务器重启的时候会重新执行这些命令来恢复原始的数据,AOF命令以redis协议追加保存每次写的操作到文件末尾.Redis还能对AOF文件进行后台重写,使得AOF文件的体积不至于过大.

  • 如果你只希望你的数据在服务器运行的时候存在,你也可以不使用任何持久化方式.

  • 你也可以同时开启两种持久化方式, 在这种情况下, 当redis重启的时候会优先载入AOF文件来恢复原始的数据,因为在通常情况下AOF文件保存的数据集要比RDB文件保存的数据集要完整.

  • 最重要的事情是了解RDB和AOF持久化方式的不同,让我们以RDB持久化方式开始:

    RDB的优点

  • RDB是一个非常紧凑的文件,它保存了某个时间点得数据集,非常适用于数据集的备份,比如你可以在每个小时报保存一下过去24小时内的数据,同时每天保存过去30天的数据,这样即使出了问题你也可以根据需求恢复到不同版本的数据集.

  • RDB是一个紧凑的单一文件,很方便传送到另一个远端数据中心或者亚马逊的S3(可能加密),非常适用于灾难恢复.

  • RDB在保存RDB文件时父进程唯一需要做的就是fork出一个子进程,接下来的工作全部由子进程来做,父进程不需要再做其他IO操作,所以RDB持久化方式可以最大化redis的性能.

  • 与AOF相比,在恢复大的数据集的时候,RDB方式会更快一些.

    RDB的缺点

  • 如果你希望在redis意外停止工作(例如电源中断)的情况下丢失的数据最少的话,那么RDB不适合你.虽然你可以配置不同的save时间点(例如每隔5分钟并且对数据集有100个写的操作),是Redis要完整的保存整个数据集是一个比较繁重的工作,你通常会每隔5分钟或者更久做一次完整的保存,万一在Redis意外宕机,你可能会丢失几分钟的数据.

  • RDB 需要经常fork子进程来保存数据集到硬盘上,当数据集比较大的时候,fork的过程是非常耗时的,可能会导致Redis在一些毫秒级内不能响应客户端的请求.如果数据集巨大并且CPU性能不是很好的情况下,这种情况会持续1秒,AOF也需要fork,但是你可以调节重写日志文件的频率来提高数据集的耐久度.

    AOF 优点

  • 使用AOF 会让你的Redis更加耐久: 你可以使用不同的fsync策略:无fsync,每秒fsync,每次写的时候fsync.使用默认的每秒fsync策略,Redis的性能依然很好(fsync是由后台线程进行处理的,主线程会尽力处理客户端请求),一旦出现故障,你最多丢失1秒的数据.

  • AOF文件是一个只进行追加的日志文件,所以不需要写入seek,即使由于某些原因(磁盘空间已满,写的过程中宕机等等)未执行完整的写入命令,你也也可使用redis-check-aof工具修复这些问题.

  • Redis 可以在 AOF 文件体积变得过大时,自动地在后台对 AOF 进行重写: 重写后的新 AOF 文件包含了恢复当前数据集所需的最小命令集合。 整个重写操作是绝对安全的,因为 Redis 在创建新 AOF 文件的过程中,会继续将命令追加到现有的 AOF 文件里面,即使重写过程中发生停机,现有的 AOF 文件也不会丢失。 而一旦新 AOF 文件创建完毕,Redis 就会从旧 AOF 文件切换到新 AOF 文件,并开始对新 AOF 文件进行追加操作。

  • AOF 文件有序地保存了对数据库执行的所有写入操作, 这些写入操作以 Redis 协议的格式保存, 因此 AOF 文件的内容非常容易被人读懂, 对文件进行分析(parse)也很轻松。 导出(export) AOF 文件也非常简单: 举个例子, 如果你不小心执行了 FLUSHALL 命令, 但只要 AOF 文件未被重写, 那么只要停止服务器, 移除 AOF 文件末尾的 FLUSHALL 命令, 并重启 Redis , 就可以将数据集恢复到 FLUSHALL 执行之前的状态。

    AOF 缺点

  • 对于相同的数据集来说,AOF 文件的体积通常要大于 RDB 文件的体积。

  • 根据所使用的 fsync 策略,AOF 的速度可能会慢于 RDB 。 在一般情况下, 每秒 fsync 的性能依然非常高, 而关闭 fsync 可以让 AOF 的速度和 RDB 一样快, 即使在高负荷之下也是如此。 不过在处理巨大的写入载入时,RDB 可以提供更有保证的最大延迟时间(latency)。

    4.X版本的整合策略

    在AOF重写策略上做了优化

    在重写AOF文件时,4.x版本以前是把内存数据集的操作指令落地,而新版本是把内存的数据集以rdb的形式落地

    这样重写后的AOF依然追加的是日志,但是,在恢复的时候是先rdb再增量的日志,性能更优秀

扩展知识

异步线程知识点

计算机组成原理

fork

copy on write

系统IO

pagecache

fsync


redis的过期键有哪些删除策略

过期精度

在 Redis 2.4 及以前版本,过期期时间可能不是十分准确,有0-1秒的误差。

从 Redis 2.6 起,过期时间误差缩小到0-1毫秒。

过期和持久

Keys的过期时间使用Unix时间戳存储(从Redis 2.6开始以毫秒为单位)。这意味着即使Redis实例不可用,时间也是一直在流逝的。

要想过期的工作处理好,计算机必须采用稳定的时间。 如果你将RDB文件在两台时钟不同步的电脑间同步,有趣的事会发生(所有的 keys装载时就会过期)。

即使正在运行的实例也会检查计算机的时钟,例如如果你设置了一个key的有效期是1000秒,然后设置你的计算机时间为未来2000秒,这时key会立即失效,而不是等1000秒之后。

Redis如何淘汰过期的keys

Redis keys过期有两种方式:被动和主动方式。

当一些客户端尝试访问它时,key会被发现并主动的过期。

当然,这样是不够的,因为有些过期的keys,永远不会访问他们。 无论如何,这些keys应该过期,所以定时随机测试设置keys的过期时间。所有这些过期的keys将会从密钥空间删除。

具体就是Redis每秒10次做的事情:

  1. 测试随机的20个keys进行相关过期检测。
  2. 删除所有已经过期的keys。
  3. 如果有多于25%的keys过期,重复步奏1.

这是一个平凡的概率算法,基本上的假设是,我们的样本是这个密钥控件,并且我们不断重复过期检测,直到过期的keys的百分百低于25%,这意味着,在任何给定的时刻,最多会清除1/4的过期keys。

在复制AOF文件时如何处理过期

为了获得正确的行为而不牺牲一致性,当一个key过期,DEL将会随着AOF文字一起合成到所有附加的slaves。在master实例中,这种方法是集中的,并且不存在一致性错误的机会。

然而,当slaves连接到master时,不会独立过期keys(会等到master执行DEL命令),他们任然会在数据集里面存在,所以当slave当选为master时淘汰keys会独立执行,然后成为master。

扩展

绝对时间点过期

相对时间点过期

时钟轮算法


redis线程模型有哪些,单线程为什么快

IO模型维度的特征

IO模型使用了多路复用器,在linux系统中使用的是EPOLL

类似netty的BOSS,WORKER使用一个EventLoopGroup(threads=1)

单线程的Reactor模型,每次循环取socket中的命令然后逐一操作,可以保证socket中的指令是按顺序的,不保证不同的socket也就是客户端的命令的顺序性

命令操作在单线程中顺序操作,没有多线程的困扰不需要锁的复杂度,在操作数据上相对来说是原子性质的

架构设计模型

自身的内存存储数据,读写操作不设计磁盘IO

redis除了提供了Value具备类型还为每种类型实现了一些操作命令

实现了计算向数据移动,而非数据想计算移动,这样在IO的成本上有一定的优势

且在数据结构类型上,丰富了一些统计类属性,读写操作中,写操作会O(1)负载度更新length类属性,使得读操作也是O(1)的


缓存雪崩、缓存穿透、缓存击穿在实际中如何处理

缓存穿透

缓存穿透是指查询一个一定不存在的数据,由于缓存是不命中时被动写的,并且出于容错考虑,如果从存储层查不到数据则不写入缓存,这将导致这个不存在的数据每次请求都要到存储层去查询,失去了缓存的意义。在流量大时,可能DB就挂掉了,要是有人利用不存在的key频繁攻击我们的应用,这就是漏洞。

解决方案

有很多种方法可以有效地解决缓存穿透问题,最常见的则是采用布隆过滤器,将所有可能存在的数据哈希到一个足够大的bitmap中,一个一定不存在的数据会被 这个bitmap拦截掉,从而避免了对底层存储系统的查询压力。另外也有一个更为简单粗暴的方法(我们采用的就是这种),如果一个查询返回的数据为空(不管是数 据不存在,还是系统故障),我们仍然把这个空结果进行缓存,但它的过期时间会很短,最长不超过五分钟。

缓存击穿

对于一些设置了过期时间的key,如果这些key可能会在某些时间点被超高并发地访问,是一种非常“热点”的数据。这个时候,需要考虑一个问题:缓存被“击穿”的问题,这个和缓存雪崩的区别在于这里针对某一key缓存,前者则是很多key。

缓存在某个时间点过期的时候,恰好在这个时间点对这个Key有大量的并发请求过来,这些请求发现缓存过期一般都会从后端DB加载数据并回设到缓存,这个时候大并发的请求可能会瞬间把后端DB压垮。

解决方案

缓存失效时的雪崩效应对底层系统的冲击非常可怕。大多数系统设计者考虑用加锁或者队列的方式保证缓存的单线 程(进程)写,从而避免失效时大量的并发请求落到底层存储系统上。这里分享一个简单方案就时讲缓存失效时间分散开,比如我们可以在原有的失效时间基础上增加一个随机值,比如1-5分钟随机,这样每一个缓存的过期时间的重复率就会降低,就很难引发集体失效的事件。

缓存雪崩

缓存雪崩是指在我们设置缓存时采用了相同的过期时间,导致缓存在某一时刻同时失效,请求全部转发到DB,DB瞬时压力过重雪崩。

解决方案

1.使用互斥锁(mutex key)
业界比较常用的做法,是使用mutex。简单地来说,就是在缓存失效的时候(判断拿出来的值为空),不是立即去load db,而是先使用缓存工具的某些带成功操作返回值的操作(比如Redis的SETNX或者Memcache的ADD)去set一个mutex key,当操作返回成功时,再进行load db的操作并回设缓存;否则,就重试整个get缓存的方法。
SETNX,是「SET if Not eXists」的缩写,也就是只有不存在的时候才设置,可以利用它来实现锁的效果。在redis2.6.1之前版本未实现setnx的过期时间

2."提前"使用互斥锁(mutex key):
在value内部设置1个超时值(timeout1), timeout1比实际的memcache timeout(timeout2)小。当从cache读取到timeout1发现它已经过期时候,马上延长timeout1并重新设置到cache。然后再从数据库加载数据并设置到cache中。

3.“永远不过期”:
这里的“永远不过期”包含两层意思:

(1) 从redis上看,确实没有设置过期时间,这就保证了,不会出现热点key过期问题,也就是“物理”不过期。

(2) 从功能上看,如果不过期,那不就成静态的了吗?所以我们把过期时间存在key对应的value里,如果发现要过期了,通过一个后台的异步线程进行缓存的构建,也就是“逻辑”过期

从实战看,这种方法对于性能非常友好,唯一不足的就是构建缓存时候,其余线程(非构建缓存的线程)可能访问的是老数据,但是对于一般的互联网功能来说这个还是可以忍受。

总结

穿透:缓存不存在,数据库不存在,高并发,少量key

击穿:缓存不存在,数据库存在,高并发,少量key

雪崩:缓存不存在,数据库存在,高并发,大量key

语义有些许差异,但是,都可以使用限流的互斥锁,保障数据库的稳定


redis事务是怎么实现的

MULTI 、 EXEC 、 DISCARD 和 WATCH 是 Redis 事务相关的命令。事务可以一次执行多个命令, 并且带有以下两个重要的保证:

事务是一个单独的隔离操作:事务中的所有命令都会序列化、按顺序地执行。事务在执行的过程中,不会被其他客户端发送来的命令请求所打断。

事务是一个原子操作:事务中的命令要么全部被执行,要么全部都不执行。

EXEC 命令负责触发并执行事务中的所有命令:

如果客户端在使用 MULTI 开启了一个事务之后,却因为断线而没有成功执行 EXEC ,那么事务中的所有命令都不会被执行。
另一方面,如果客户端成功在开启事务之后执行 EXEC ,那么事务中的所有命令都会被执行。

当使用 AOF 方式做持久化的时候, Redis 会使用单个 write(2) 命令将事务写入到磁盘中。

然而,如果 Redis 服务器因为某些原因被管理员杀死,或者遇上某种硬件故障,那么可能只有部分事务命令会被成功写入到磁盘中。

如果 Redis 在重新启动时发现 AOF 文件出了这样的问题,那么它会退出,并汇报一个错误。

使用redis-check-aof程序可以修复这一问题:它会移除 AOF 文件中不完整事务的信息,确保服务器可以顺利启动。

从 2.2 版本开始,Redis 还可以通过乐观锁(optimistic lock)实现 CAS (check-and-set)操作,具体信息请参考文档的后半部分。

事务中的错误

使用事务时可能会遇上以下两种错误:

事务在执行 EXEC 之前,入队的命令可能会出错。比如说,命令可能会产生语法错误(参数数量错误,参数名错误,等等),或者其他更严重的错误,比如内存不足(如果服务器使用 maxmemory 设置了最大内存限制的话)。
命令可能在 EXEC 调用之后失败。举个例子,事务中的命令可能处理了错误类型的键,比如将列表命令用在了字符串键上面,诸如此类。

对于发生在 EXEC 执行之前的错误,客户端以前的做法是检查命令入队所得的返回值:如果命令入队时返回 QUEUED ,那么入队成功;否则,就是入队失败。如果有命令在入队时失败,那么大部分客户端都会停止并取消这个事务。

不过,从 Redis 2.6.5 开始,服务器会对命令入队失败的情况进行记录,并在客户端调用 EXEC 命令时,拒绝执行并自动放弃这个事务。

在 Redis 2.6.5 以前, Redis 只执行事务中那些入队成功的命令,而忽略那些入队失败的命令。 而新的处理方式则使得在流水线(pipeline)中包含事务变得简单,因为发送事务和读取事务的回复都只需要和服务器进行一次通讯。

至于那些在 EXEC 命令执行之后所产生的错误, 并没有对它们进行特别处理: 即使事务中有某个/某些命令在执行时产生了错误, 事务中的其他命令仍然会继续执行。

为什么 Redis 不支持回滚(roll back)

如果你有使用关系式数据库的经验, 那么 “Redis 在事务失败时不进行回滚,而是继续执行余下的命令”这种做法可能会让你觉得有点奇怪。

以下是这种做法的优点:

  • Redis 命令只会因为错误的语法而失败(并且这些问题不能在入队时发现),或是命令用在了错误类型的键上面:这也就是说,从实用性的角度来说,失败的命令是由编程错误造成的,而这些错误应该在开发的过程中被发现,而不应该出现在生产环境中。
  • 因为不需要对回滚进行支持,所以 Redis 的内部可以保持简单且快速。

有种观点认为 Redis 处理事务的做法会产生 bug , 然而需要注意的是, 在通常情况下, 回滚并不能解决编程错误带来的问题。 举个例子, 如果你本来想通过 INCR 命令将键的值加上 1 , 却不小心加上了 2 , 又或者对错误类型的键执行了 INCR , 回滚是没有办法处理这些情况的。


redis集群方案有哪些

常见集群分类

主从复制集群

分片集群

redis有那些:

主从复制集群,手动切换

带有哨兵的HA的主从复制集群

客户端实现路由索引的分片集群

使用中间件代理层的分片集群

redis自身实现的cluster分片集群


redis主从复制的原理是什么

主从复制机制

当一个 master 实例和一个 slave 实例连接正常时, master 会发送一连串的命令流来保持对 slave 的更新,以便于将自身数据集的改变复制给 slave , :包括客户端的写入、key 的过期或被逐出等等。

当 master 和 slave 之间的连接断开之后,因为网络问题、或者是主从意识到连接超时, slave 重新连接上 master 并会尝试进行部分重同步:这意味着它会尝试只获取在断开连接期间内丢失的命令流。

当无法进行部分重同步时, slave 会请求进行全量重同步。这会涉及到一个更复杂的过程,例如 master 需要创建所有数据的快照,将之发送给 slave ,之后在数据集更改时持续发送命令流到 slave 。

主从复制的关注点

Redis 使用异步复制,slave 和 master 之间异步地确认处理的数据量

一个 master 可以拥有多个 slave

slave 可以接受其他 slave 的连接。除了多个 slave 可以连接到同一个 master 之外, slave 之间也可以像层叠状的结构(cascading-like structure)连接到其他 slave 。自 Redis 4.0 起,所有的 sub-slave 将会从 master 收到完全一样的复制流。

Redis 复制在 master 侧是非阻塞的。这意味着 master 在一个或多个 slave 进行初次同步或者是部分重同步时,可以继续处理查询请求。

复制在 slave 侧大部分也是非阻塞的。当 slave 进行初次同步时,它可以使用旧数据集处理查询请求,假设你在 redis.conf 中配置了让 Redis 这样做的话。否则,你可以配置如果复制流断开, Redis slave 会返回一个 error 给客户端。但是,在初次同步之后,旧数据集必须被删除,同时加载新的数据集。 slave 在这个短暂的时间窗口内(如果数据集很大,会持续较长时间),会阻塞到来的连接请求。自 Redis 4.0 开始,可以配置 Redis 使删除旧数据集的操作在另一个不同的线程中进行,但是,加载新数据集的操作依然需要在主线程中进行并且会阻塞 slave 。

复制既可以被用在可伸缩性,以便只读查询可以有多个 slave 进行(例如 O(N) 复杂度的慢操作可以被下放到 slave ),或者仅用于数据安全。

可以使用复制来避免 master 将全部数据集写入磁盘造成的开销:一种典型的技术是配置你的 master Redis.conf 以避免对磁盘进行持久化,然后连接一个 slave ,其配置为不定期保存或是启用 AOF。但是,这个设置必须小心处理,因为重新启动的 master 程序将从一个空数据集开始:如果一个 slave 试图与它同步,那么这个 slave 也会被清空。
任何时候数据安全性都是很重要的,所以如果 master 使用复制功能的同时未配置持久化,那么自动重启进程这项应该被禁用。

Redis 复制功能是如何工作的

每一个 Redis master 都有一个 replication ID :这是一个较大的伪随机字符串,标记了一个给定的数据集。每个 master 也持有一个偏移量,master 将自己产生的复制流发送给 slave 时,发送多少个字节的数据,自身的偏移量就会增加多少,目的是当有新的操作修改自己的数据集时,它可以以此更新 slave 的状态。复制偏移量即使在没有一个 slave 连接到 master 时,也会自增,所以基本上每一对给定的

Replication ID, offset

都会标识一个 master 数据集的确切版本。

当 slave 连接到 master 时,它们使用 PSYNC 命令来发送它们记录的旧的 master replication ID 和它们至今为止处理的偏移量。通过这种方式, master 能够仅发送 slave 所需的增量部分。但是如果 master 的缓冲区中没有足够的命令积压缓冲记录,或者如果 slave 引用了不再知道的历史记录(replication ID),则会转而进行一个全量重同步:在这种情况下, slave 会得到一个完整的数据集副本,从头开始。

下面是一个全量同步的工作细节:

master 开启一个后台保存进程,以便于生产一个 RDB 文件。同时它开始缓冲所有从客户端接收到的新的写入命令。当后台保存完成时, master 将数据集文件传输给 slave, slave将之保存在磁盘上,然后加载文件到内存。再然后 master 会发送所有缓冲的命令发给 slave。这个过程以指令流的形式完成并且和 Redis 协议本身的格式相同。

你可以用 telnet 自己进行尝试。在服务器正在做一些工作的同时连接到 Redis 端口并发出 SYNC 命令。你将会看到一个批量传输,并且之后每一个 master 接收到的命令都将在 telnet 回话中被重新发出。事实上 SYNC 是一个旧协议,在新的 Redis 实例中已经不再被使用,但是其仍然向后兼容:但它不允许部分重同步,所以现在 PSYNC 被用来替代 SYNC。

之前说过,当主从之间的连接因为一些原因崩溃之后, slave 能够自动重连。如果 master 收到了多个 slave 要求同步的请求,它会执行一个单独的后台保存,以便于为多个 slave 服务。

无需磁盘参与的复制

正常情况下,一个全量重同步要求在磁盘上创建一个 RDB 文件,然后将它从磁盘加载进内存,然后 slave以此进行数据同步。

如果磁盘性能很低的话,这对 master 是一个压力很大的操作。Redis 2.8.18 是第一个支持无磁盘复制的版本。在此设置中,子进程直接发送 RDB 文件给 slave,无需使用磁盘作为中间储存介质。


redis缓存如何回收

回收策略

noeviction:返回错误当内存限制达到并且客户端尝试执行会让更多内存被使用的命令(大部分的写入指令,但DEL和几个例外)
allkeys-lru: 尝试回收最少使用的键(LRU),使得新添加的数据有空间存放。
volatile-lru: 尝试回收最少使用的键(LRU),但仅限于在过期集合的键,使得新添加的数据有空间存放。
allkeys-random: 回收随机的键使得新添加的数据有空间存放。
volatile-random: 回收随机的键使得新添加的数据有空间存放,但仅限于在过期集合的键。
volatile-ttl: 回收在过期集合的键,并且优先回收存活时间(TTL)较短的键,使得新添加的数据有空间存放。
volatile-lfu:从所有配置了过期时间的键中驱逐使用频率最少的键
allkeys-lfu:从所有键中驱逐使用频率最少的键

如果没有键满足回收的前提条件的话,策略volatile-lru, volatile-random以及volatile-ttl就和noeviction 差不多了。

选择正确的回收策略是非常重要的,这取决于你的应用的访问模式,不过你可以在运行时进行相关的策略调整,并且监控缓存命中率和没命中的次数,通过RedisINFO命令输出以便调优。

一般的经验规则:

  • 使用allkeys-lru策略:当你希望你的请求符合一个幂定律分布,也就是说,你希望部分的子集元素将比其它其它元素被访问的更多。如果你不确定选择什么,这是个很好的选择。.
  • 使用allkeys-random:如果你是循环访问,所有的键被连续的扫描,或者你希望请求分布正常(所有元素被访问的概率都差不多)。
  • 使用volatile-ttl:如果你想要通过创建缓存对象时设置TTL值,来决定哪些对象应该被过期。

allkeys-lruvolatile-random策略对于当你想要单一的实例实现缓存及持久化一些键时很有用。不过一般运行两个实例是解决这个问题的更好方法。

为了键设置过期时间也是需要消耗内存的,所以使用allkeys-lru这种策略更加高效,因为没有必要为键取设置过期时间当内存有压力时。

回收进程如何工作

理解回收进程如何工作是非常重要的:

  • 一个客户端运行了新的命令,添加了新的数据。
  • Redi检查内存使用情况,如果大于maxmemory的限制, 则根据设定好的策略进行回收。
  • 一个新的命令被执行,等等。
  • 所以我们不断地穿越内存限制的边界,通过不断达到边界然后不断地回收回到边界以下。

如果一个命令的结果导致大量内存被使用(例如很大的集合的交集保存到一个新的键),不用多久内存限制就会被这个内存使用量超越。

RabbitMQ的架构设计是什么样的

是AMQP的实现,相关概念语义

Broker:它提供一种传输服务,它的角色就是维护一条从生产者到消费者的路线,保证数据能按照指定的方式进行传输

Exchange:消息交换机,它指定消息按什么规则,路由到哪个队列。

Queue:消息的载体,每个消息都会被投到一个或多个队列。

Binding:绑定,它的作用就是把exchange和queue按照路由规则绑定起来.

Routing Key:路由关键字,exchange根据这个关键字进行消息投递。

vhost:虚拟主机,一个broker里可以有多个vhost,用作不同用户的权限分离。

Producer:消息生产者,就是投递消息的程序.

Consumer:消息消费者,就是接受消息的程序.

Channel:消息通道,在客户端的每个连接里,可建立多个channel.

核心概念

在mq领域中,producer将msg发送到queue,然后consumer通过消费queue完成P.C解耦

kafka是由producer决定msg发送到那个queue

rabbitmq是由Exchange决定msg应该怎么样发送到目标queue,这就是binding及对应的策略

Exchange

Direct Exchange:直接匹配,通过Exchange名称+RountingKey来发送与接收消息.
Fanout Exchange:广播订阅,向所有的消费者发布消息,但是只有消费者将队列绑定到该路由器才能收到消息,忽略Routing Key.
Topic Exchange:主题匹配订阅,这里的主题指的是RoutingKey,RoutingKey可以采用通配符,如:*或#,RoutingKey命名采用.来分隔多个词,只有消息这将队列绑定到该路由器且指定RoutingKey符合匹配规则时才能收到消息;
Headers Exchange:消息头订阅,消息发布前,为消息定义一个或多个键值对的消息头,然后消费者接收消息同时需要定义类似的键值对请求头:(如:x-mactch=all或者x_match=any),只有请求头与消息头匹配,才能接收消息,忽略RoutingKey.
默认的exchange:如果用空字符串去声明一个exchange,那么系统就会使用”amq.direct”这个exchange,我们创建一个queue时,默认的都会有一个和新建queue同名的routingKey绑定到这个默认的exchange上去

复杂与精简

在众多的MQ中间件中,首先学习Rabbitmq的时候,就理解他是一个单机的mq组件,为了系统的解耦,可以自己在业务层面做AKF

其在内卷能力做的非常出色,这得益于AMQP,也就是消息的传递形式、复杂度有exchange和queue的binding实现,这,对于P.C有很大的帮助


RabbitMQ如何确保消息发送和消息接收

消息发送确认

1 ConfirmCallback方法

ConfirmCallback 是一个回调接口,消息发送到 Broker 后触发回调,确认消息是否到达 Broker 服务器,也就是只确认是否正确到达 Exchange 中。

2 ReturnCallback方法

通过实现 ReturnCallback 接口,启动消息失败返回,此接口是在交换器路由不到队列时触发回调,该方法可以不使用,因为交换器和队列是在代码里绑定的,如果消息成功投递到 Broker 后几乎不存在绑定队列失败,除非你代码写错了。

消息接收确认

RabbitMQ 消息确认机制(ACK)默认是自动确认的,自动确认会在消息发送给消费者后立即确认,但存在丢失消息的可能,如果消费端消费逻辑抛出异常,假如你用回滚了也只是保证了数据的一致性,但是消息还是丢了,也就是消费端没有处理成功这条消息,那么就相当于丢失了消息。

消息确认模式有:

AcknowledgeMode.NONE:自动确认。
AcknowledgeMode.AUTO:根据情况确认。
AcknowledgeMode.MANUAL:手动确认。
消费者收到消息后,手动调用 Basic.Ack 或 Basic.Nack 或 Basic.Reject 后,RabbitMQ 收到这些消息后,才认为本次投递完成。

Basic.Ack 命令:用于确认当前消息。
Basic.Nack 命令:用于否定当前消息(注意:这是AMQP 0-9-1的RabbitMQ扩展) 。
Basic.Reject 命令:用于拒绝当前消息。
Nack,Reject后都有能力要求是否requeue消息或者进入死信队列
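
A minimal Java sketch of manual acknowledgement, assuming the RabbitMQ Java client 5.x (queue name and requeue policy are illustrative):

```java
import com.rabbitmq.client.*;

public class ManualAckConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            String queue = "demo.queue";
            channel.queueDeclare(queue, true, false, false, null);

            DeliverCallback onMessage = (consumerTag, delivery) -> {
                long tag = delivery.getEnvelope().getDeliveryTag();
                try {
                    System.out.println("got: " + new String(delivery.getBody(), "UTF-8"));
                    channel.basicAck(tag, false);        // confirm this single message
                } catch (Exception e) {
                    channel.basicNack(tag, false, true); // reject and requeue (or route to a DLX)
                }
            };
            // autoAck = false -> messages stay unacked until we call basicAck/basicNack
            channel.basicConsume(queue, false, onMessage, consumerTag -> { });

            Thread.sleep(10_000); // keep the consumer alive briefly for the demo
        }
    }
}
```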


RabbitMQ事务消息原理是什么

事务V.S确认

确认是对一件事的确认

事务是对批量的确认

增删改查中,事务是对于增删改的保证

发送方事务

开启事务,发送多条数据,事务提交或回滚是原子的,要么都提交,要么都回滚

消费方事务

消费方是读取行为,那么事务体现在哪里呢

rabbitmq的消费行为会触发queue中msg的是否删除、是否重新放回队列等行为,类增删改

所以,消费方的ack是要手动提交的,且最终确定以事务的提交和回滚决定


What are RabbitMQ dead letter queues and delayed queues

Dead letter queue

DLX (Dead Letter Exchange), the dead letter exchange.

When a message in a queue is rejected or expires, it becomes a dead letter. Dead letters can be republished to another exchange, which is the DLX; queues bound to the DLX are called dead letter queues.
Causes of dead letters:

  • The message is rejected
  • The message expires
  • The queue exceeds its maximum length

Expired messages:

RabbitMQ offers two ways to set a message's expiration time. The first is set on the queue, so all messages in that queue share the same expiration time; the second is set on the message itself, so every message can have a different expiration time. If both are used, the smaller of the two values wins. When a message reaches its expiration time without having been consumed, it becomes a dead letter.

Queue-level setting: use the x-message-ttl argument when declaring the queue, in milliseconds.

Per-message setting: set the expiration property of the message, in milliseconds.

Delayed queue

A delayed queue stores delayed messages.

A delayed message is one that, after being published, is not delivered to the consumer immediately but only after a specified time. For example:

In an order system, an order has 30 seconds to be paid; after the order times out it is delivered to a consumer that handles timed-out orders.

RabbitMQ has no built-in delayed queue, but one can be implemented with dead letter queues.

Using dead letter queues, you can bind several message queues to an ordinary exchange, say three queues with expirations of 5, 10 and 30 minutes, then configure a DLX for each of those queues and attach a dead letter queue to each DLX.

When a message expires, it is moved to the corresponding dead letter queue and then delivered to the designated consumer.
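
A hedged sketch of the delay-via-DLX pattern with the Java client (all names and the 30-second TTL are illustrative): messages wait in a consumer-less delay queue until the TTL fires, then are dead-lettered to the queue that real consumers read.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class DelayViaDlxDemo {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Dead letter exchange and the queue the timeout handler consumes from.
            channel.exchangeDeclare("order.dlx", "direct", true);
            channel.queueDeclare("order.timeout.queue", true, false, false, null);
            channel.queueBind("order.timeout.queue", "order.dlx", "order.timeout");

            // "Delay" queue: nobody consumes it; expired messages are re-routed to order.dlx.
            Map<String, Object> queueArgs = new HashMap<>();
            queueArgs.put("x-message-ttl", 30_000);                  // 30 s payment window
            queueArgs.put("x-dead-letter-exchange", "order.dlx");
            queueArgs.put("x-dead-letter-routing-key", "order.timeout");
            channel.queueDeclare("order.delay.queue", true, false, false, queueArgs);

            channel.basicPublish("", "order.delay.queue", null,
                    "order-1001".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```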


Briefly describe the Kafka architecture

Core concepts

1 Broker
A Kafka cluster consists of one or more servers; a server node is called a broker.

Brokers store topic data. If a topic has N partitions and the cluster has N brokers, each broker stores one partition of that topic.

If a topic has N partitions and the cluster has (N+M) brokers, then N brokers each store one partition of the topic, and the remaining M brokers store no partition data for that topic.

If a topic has N partitions and the cluster has fewer than N brokers, a broker stores one or more partitions of that topic. In production this should be avoided, since it easily leads to an unbalanced Kafka cluster.

2 Topic
Every message published to a Kafka cluster has a category called a Topic. (Physically, messages of different Topics are stored separately; logically, although a Topic's messages may live on one or more brokers, users only need to specify the Topic to produce or consume data and need not care where the data is stored.)

Similar to a table name in a database.

3 Partition
The data of a topic is split into one or more partitions; every topic has at least one partition. Each partition's data is stored in multiple segment files. Data within a partition is ordered, but ordering is lost across partitions, so if a topic has multiple partitions the consumption order cannot be guaranteed. When messages must be consumed in strict order, the number of partitions has to be set to 1.

4 Producer
The producer is the publisher of data: it publishes messages to a Kafka topic. When the broker receives a message from a producer, it appends the message to the segment file currently being appended to. A message is stored in one partition, and the producer may also specify which partition the data is stored in.

5 Consumer
Consumers read data from brokers. A consumer can consume data from multiple topics.

6 Consumer Group
Every Consumer belongs to a specific Consumer Group (a group name can be specified for each consumer; if none is given it belongs to the default group). This is Kafka's mechanism for both broadcasting a topic's messages (to all consumers) and unicasting them (to a single consumer). A topic can have multiple CGs, and the topic's messages go to every CG. To broadcast, give every consumer its own independent CG; to unicast, put all consumers in the same CG. CGs also let you group consumers freely without having to send the messages to several different topics.

7 Leader
Every partition has multiple replicas, exactly one of which is the Leader; the Leader is the replica currently responsible for reads and writes of that partition.

8 Follower
A Follower follows the Leader: all write requests are routed through the Leader, data changes are broadcast to all Followers, and the Followers keep their data in sync with the Leader. If the Leader fails, a new Leader is elected from the Followers. When a Follower crashes, gets stuck or syncs too slowly, the Leader removes it from the "in sync replicas" (ISR) list and a new Follower is created.

9 Offset
Kafka's storage files are named after the offset (offset.kafka), which makes lookups easy. For example, to find position 2049 you only need to locate the 2048.kafka file; the first offset file is simply 00000000000.kafka.

Kafka is distributed by design and satisfies the X, Y and Z axes of the AKF scale cube; its scalability, reliability and performance go without saying.

Moreover, Kafka has its own distinctive features, such as the dynamic ISR set, which is another consistency mechanism besides strong consistency and majority (quorum) consistency.
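
A minimal producer sketch (the topic "orders", the keys and the broker address are assumptions) showing both ways, described above, of deciding the partition:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class SimpleProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Let Kafka choose the partition from the key's hash...
            producer.send(new ProducerRecord<>("orders", "order-1001", "created"));
            // ...or pin the record to partition 0 explicitly.
            producer.send(new ProducerRecord<>("orders", 0, "order-1002", "created"));
        }
    }
}
```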



What are the scenarios in which Kafka loses messages

Messages lost by the producer while producing

Messages lost by the broker after a failure

Messages lost by the consumer while consuming

The ACK mechanism

acks has three possible values: 1, 0 and -1.

acks=0: messages lost by the producer while producing

In short, the producer sends once and does not send again, regardless of whether the send succeeded.

acks=1: messages lost by the broker after a failure

In short, the producer considers the message delivered as soon as it is notified that one partition replica has successfully written it. Note that this replica must be the leader replica: only when the leader replica has written successfully does the producer consider the message sent.

Note that the default value of acks is 1. This default is a compromise between throughput and reliability. In production you can adjust it to the actual situation; for example, if you chase high throughput, you give up some reliability.

acks=-1: no data loss on the producer or storage side

In short, the producer only considers the message delivered after it is notified that all in-sync replicas of the partition have successfully written it.
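
A hedged sketch of a producer configured for the reliable end of this trade-off (topic, key and broker address are assumptions): acks=all plus a blocking get() so failures surface to the caller.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ReliableProducerDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // same as acks=-1: wait for all in-sync replicas
        props.put(ProducerConfig.RETRIES_CONFIG, 3);  // retry transient broker failures

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // get() blocks until the broker acknowledges, so a lost message becomes an exception.
            producer.send(new ProducerRecord<>("orders", "order-1001", "paid")).get();
        }
    }
}
```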

The offset mechanism

The three consumption semantics of a Kafka consumer:

at-most-once: at most once, data may be lost

at-least-once: at least once, data may be consumed more than once

exactly-once: exactly once
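
A hedged sketch of at-least-once consumption (group id, topic and broker address are assumptions): auto-commit is disabled and the offset is committed only after the records are processed, so a crash causes redelivery instead of loss.

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class AtLeastOnceConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-workers");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit offsets manually

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // process(record): if processing throws, the offset below is never committed
                    // and the record is redelivered, i.e. at-least-once (duplicates possible).
                }
                consumer.commitSync();
            }
        }
    }
}
```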


Does Kafka pull or push, and what are the pros and cons

The question Kafka originally had to settle was whether consumers should pull messages from brokers or brokers should push messages to consumers, i.e. pull vs. push.

Kafka follows the traditional design shared by most messaging systems: producers push messages to the broker, and consumers pull messages from the broker.

Some messaging systems, such as Scribe and Apache Flume, use a push model and push messages to downstream consumers.

This has both advantages and drawbacks: the broker decides the push rate, which does not cope well with consumers that consume at different speeds.

Messaging systems all aim to let the consumer consume at the highest possible rate; unfortunately, in a push model, when the broker pushes far faster than the consumer can consume, the consumer is likely to collapse.

In the end Kafka chose the traditional pull model.

Another advantage of the pull model is that the consumer can decide for itself whether to fetch data from the broker in batches.

A push model must decide, without knowing the downstream consumer's capacity and strategy, whether to push each message immediately or to buffer messages and push them in batches.

If a lower push rate is chosen to avoid overwhelming the consumer, it may end up pushing only a few messages at a time, which is wasteful.

In the pull model, the consumer can set these policies according to its own consumption capacity.

A drawback of pull is that if the broker has no messages available, the consumer keeps polling in a loop until new messages arrive.

To avoid this, Kafka has parameters that let the consumer block until new messages arrive (and, optionally, until the amount of data reaches a given size so it can be fetched in a batch).
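
The blocking behaviour mentioned above corresponds to the consumer's fetch settings; a minimal sketch (the 64 KB and 500 ms figures are arbitrary examples):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;

import java.util.Properties;

public class FetchTuningDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // The broker holds the fetch until at least 64 KB of data is available...
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 64 * 1024);
        // ...or until 500 ms have passed, whichever comes first, instead of busy polling.
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
        // props would then be passed to new KafkaConsumer<>(props) as usual.
    }
}
```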


What is ZooKeeper's role in Kafka

ZooKeeper is a distributed coordination service; note that it is not a database.

Kafka uses ZooKeeper's distributed coordination solutions: distributed locks, distributed configuration and unified naming.

The controller of a Kafka broker cluster is elected by racing to create an ephemeral znode in ZooKeeper.

Auto-incremented, globally unique values such as broker IDs can also be generated via the version of a znode.

Broker state data is also stored in ZooKeeper, but again, ZooKeeper is not a database, so what is stored there is metadata.

And, across old and new versions, the consumer offsets that used to be kept in ZooKeeper have been migrated out of it.


How is Kafka's high performance guaranteed

First, the biggest performance bottleneck is still I/O; that is a gap that cannot be crossed.

The broker already does its best by using sequential disk reads and writes when persisting data.

A further optimisation is zero-copy, i.e. passing data from the on-disk log to the consumer client directly; because Kafka is an MQ and does not transform the msg payload, this is possible.

Then, like most distributed systems, there is always a trade-off to make, a struggle between speed and availability/reliability.

The acks levels 0, 1 and -1 are exactly that balance between performance and reliability.


What is Kafka's rebalance mechanism

Consumer partition assignment strategies

Range assignment (the default)

RoundRobin assignment

Sticky assignment
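
A hedged sketch of selecting one of these strategies on the consumer (the assignor classes are standard Kafka classes; group id and broker address are assumptions):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.StickyAssignor;

import java.util.Properties;

public class AssignmentStrategyDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-workers");
        // RangeAssignor is the default; the sticky strategy minimises partition movement
        // when the group rebalances.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                StickyAssignor.class.getName());
    }
}
```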

When a rebalance is triggered

There are three conditions that trigger a rebalance:

  • The number of group members changes, for example a new consumer instance joins the consumer group or one leaves it.
  • The number of subscribed Topics changes.
  • The number of partitions of a subscribed Topic changes.

The Coordinator protocol

How a consumer finds the coordinator

How consumers agree on an assignment strategy

How the chosen assignment strategy affects the rebalance when one is required


What are ZooKeeper's data model and node types

ZooKeeper data model

ZooKeeper's data model is structurally very similar to a standard file system: it has a hierarchical namespace organised as a tree, and every node in the ZooKeeper tree is called a Znode.

Like a file system's directory tree, every node in the ZooKeeper tree can have child nodes, but there are also differences:

	A Znode has the characteristics of both a file and a directory. Like a file it maintains data, metadata, an ACL and timestamps; like a directory it can be part of a path and can have child Znodes. Users can create, delete, update and query a Znode (where permissions allow).
	
	Operations on a Znode are atomic: a read returns all the data associated with the node, and a write replaces all of the node's data. In addition, every node has its own ACL (access control list), which defines user permissions, i.e. which operations a given user may perform on the node.
	
	Znode data size is limited. Although ZooKeeper can store some data, it was not designed as a conventional database or big-data store; rather it manages coordination data such as configuration, status information and rendezvous locations in distributed applications. Such data is typically small, measured in KB. Both ZooKeeper servers and clients strictly check and limit each Znode's data to at most 1 MB, and in normal use it should be far smaller than that.
	
	A Znode is referenced by a path, like a file path in Unix. Paths must be absolute, so they must begin with a slash, and they must be unique: every path has exactly one representation, and these paths cannot be changed. In ZooKeeper, paths are Unicode strings with a few restrictions; the string "/zookeeper" is reserved for management information such as quota data.

Node types

Znodes come in two kinds: ephemeral nodes and persistent nodes.
The type of a node is determined when it is created and cannot be changed afterwards.
Ephemeral node: its lifetime depends on the session that created it. Once the session ends, the ephemeral node is deleted automatically (it can of course also be deleted manually). Ephemeral nodes are not allowed to have child nodes.

Persistent node: its lifetime does not depend on the session; it is deleted only when a client explicitly performs a delete operation.

A Znode can also be sequential: if specified at creation time, a monotonically increasing sequence number is appended to the Znode's name. The sequence number is unique with respect to the parent node, so it records the order in which the children were created. Its format is "%010d" (10 digits, padded with zeros where there is no value, e.g. "0000000001").

In ZooKeeper every data node has a lifecycle whose length depends on the node type.

1. Persistent node (PERSISTENT)

Once created, the node remains on the ZooKeeper servers until a delete operation actively removes it.

2. Persistent sequential node (PERSISTENT_SEQUENTIAL)

Its basic behaviour is the same as a persistent node, with the additional property of ordering: in ZooKeeper each parent node maintains an order for its first-level children, recording the sequence in which they were created.

3. Ephemeral node (EPHEMERAL)

An ephemeral node's lifecycle is bound to the client session: if the session expires, the node is cleaned up automatically.

ZooKeeper does not allow creating child nodes under an ephemeral node, i.e. an ephemeral node can only be a leaf node.

4. Ephemeral sequential node (EPHEMERAL_SEQUENTIAL)

It combines the ephemeral and sequential properties: the node is bound to the client session and its name receives an auto-incremented sequence suffix.
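
A hedged sketch with the ZooKeeper Java client (server address and paths are made up) that creates a persistent parent and an ephemeral sequential child, the combination typically used for locks and leader election:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class NodeTypeDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> { });

        // Persistent node: survives the session until it is explicitly deleted.
        zk.create("/locks", new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Ephemeral sequential node: removed when the session ends; the name gets a
        // zero-padded suffix such as /locks/lock-0000000001.
        String path = zk.create("/locks/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("created " + path);

        zk.close();
    }
}
```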


What is ZooKeeper's watch mechanism

ZooKeeper is a service for coordinating (synchronising) distributed processes. It provides a simple, high-performance coordination kernel on top of which users can build more complex distributed coordination features.

Multiple distributed processes use the API provided by ZooKeeper to operate on shared in-memory ZooKeeper data objects, ZNodes, in order to reach some agreed behaviour or result. This model is essentially a concurrency model based on shared state, the same as Java's multi-threaded concurrency model: the threads or processes communicate through "shared memory".

Java does not directly provide a reactive notification interface for monitoring changes to an object's state; you either waste CPU time on unresponsive polling and retrying, or rely on one of Java's notification mechanisms (built-in queues) to react to state changes, and that mechanism requires blocking calls in a loop.

When ZooKeeper shares the state of these distributed processes (a ZNode's Data and Children), it uses, for performance reasons, a similar asynchronous, non-blocking, proactive notification model, the Watch mechanism, which makes "shared-state communication" between distributed processes more real-time and efficient. This follows from ZooKeeper's main task: coordination. Consul also implements a Watch mechanism, but it is a blocking long poll.

Characteristics of ZooKeeper watches

  1. A Watch is one-shot and must be re-registered every time. A client receives no notifications if its session ends abnormally, while a quick reconnect does not affect receiving notifications.
  2. Watch callbacks are executed sequentially, and a client does not see the latest data before it has received the change event for the data it is watching. Also take care not to block the client's whole watch callback processing inside a Watch callback.
  3. A Watch is lightweight: the WatchEvent is the smallest unit of communication and contains only the notification state, the event type and the node path. The ZooKeeper server only tells the client that something happened; it does not deliver the new content.

ZooKeeper states

Disconnected: the client is disconnected and cannot reach any server in the ensemble
SyncConnected: the client is connected to one of the servers
AuthFailed: authentication failed
ConnectedReadOnly: the client is connected to a read-only server
SaslAuthenticated: authenticated via SASL
Expired: the server has expired this client's session

ZooKeeper event types

None: none
NodeCreated: a node was created
NodeDeleted: a node was deleted
NodeDataChanged: a node's data changed
NodeChildrenChanged: a node's children changed (added/removed)

Notes on using Watchers

A Watcher is a one-shot trigger; to keep listening for data changes, set the Watcher again on every read.
Session expiry: when a client session expires, the Watchers registered by that client become invalid.
Missed events: events can be missed between receiving a notification and registering the next watch, but ZooKeeper's state changes and data changes are recorded in the state metadata and in the ZK data nodes, so an eventually consistent view of the ZK state can still be obtained.
Avoid too many Watchers: the server sends a notification to every client that registered a Watcher for an event, over its socket connection; with too many Watchers this produces a spike of notifications.
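
A minimal sketch of the one-shot behaviour with the ZooKeeper Java client (the /config path and server address are made-up examples and the node is assumed to exist): the callback re-registers the watch by reading the node again.

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ConfigWatchDemo {
    private static ZooKeeper zk;

    public static void main(String[] args) throws Exception {
        zk = new ZooKeeper("localhost:2181", 15000, event -> { });
        watchConfig();
        Thread.sleep(60_000); // keep the process alive so notifications can arrive
    }

    // Watches are one-shot, so the callback reads the data again with a fresh watch.
    private static void watchConfig() throws Exception {
        byte[] data = zk.getData("/config", new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getType() == Event.EventType.NodeDataChanged) {
                    try {
                        watchConfig(); // re-register and pick up the new value
                    } catch (Exception ignored) {
                        // a real client would log and retry here
                    }
                }
            }
        }, null);
        System.out.println("current config: " + new String(data));
    }
}
```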

What are ZooKeeper's naming service, configuration management and cluster management

Distributed coordination

Coordination only arises once more than one party is involved; classifying the things being coordinated yields these terms, and the naming is fine as long as the semantics are acceptable.

Naming service

Use the sequential-node feature to agree on naming rules.

Use ZooKeeper's monotonically increasing transaction ID to produce ordered names.

Use child nodes as a map to build a 1:N name mapping, like DNS.

Ordering relationships and mapping relationships.

Configuration management

Configuration, metadata, state and similar data can be stored in ZK nodes (up to 1 MB each) or expressed through the directory-like structure of ZK nodes.

Combined with the watch mechanism, this provides cluster-wide notification when configuration changes.

Cluster management

Using ZooKeeper's exclusivity and ordering,

it supports distributed locks, distributed leader election and queue locks,

serialized callback scheduling,

distributed scheduling, and so on.
