Object, ArrayList and LinkedList source code analysis

This is the first article in a series on JDK source code analysis, so let's start with something simple: the source code of Object, ArrayList and LinkedList. For Object, I will briefly introduce how hashCode is generated along with a few basic concepts. Object itself is not very complicated and does not come up much in interviews, so I won't dwell on it. ArrayList and LinkedList are discussed in detail: their data structures, initialization, adding, deleting, modifying and querying elements, and the expansion process.

A brief introduction to Object

Basically, the first thing every newcomer to Java learns is that everything is an object, and right after that comes Object, which shows how central Object is in Java. It is the base class of all classes: when a class does not explicitly extend another class, it extends Object by default.

The Object class belongs to the java.lang package. Classes in this package do not need to be imported manually; the compiler imports them automatically. Note: regardless of the JDK version, the compiler adds Object as the superclass during the compilation phase.

We can first look at the class structure diagram of Object.

[Figure: class structure diagram of Object]

You can see that there are quite a few methods here. The ones we use most are getClass, toString, equals and so on, and methods such as toString and equals are usually overridden by subclasses, String for example. Here we will introduce the getClass and hashCode methods. If you are interested in the rest, you can look into them yourself. String's overrides are relatively simple and you can read them on your own; I was only asked about them when I first entered the industry.

The method to get the runtime class - getClass method

We see this method a lot in frameworks. When we covered the Spring framework earlier, there were many places where a Class object is obtained through getClass, its methods and fields are then read, and those methods and fields are operated on, for example when a Bean is initialized and its properties are populated.

The getClass method is a final method marked with the native keyword. We don't need to worry about the implementation of a native method: it is not written in Java but provided by the JVM. The native modifier tells the JVM that the method is defined externally and can be implemented in another language. Simply put, a native method is an interface through which Java calls non-Java code.

public final native Class<?> getClass();
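
As a small illustration of what "runtime class" means, the sketch below calls getClass on a variable whose declared type is List. The class name GetClassDemo and the helper runtimeClassName are made up for this example; only getClass, Class.getName and Class.getDeclaredMethods come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class GetClassDemo {
    // Returns the fully qualified name of the object's runtime class.
    static String runtimeClassName(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        // The declared type is List, but the runtime class is ArrayList.
        System.out.println(runtimeClassName(list)); // java.util.ArrayList
        // From the Class object, frameworks can go on to inspect methods and fields.
        System.out.println(list.getClass().getDeclaredMethods().length > 0); // true
    }
}
```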

The method of obtaining the hash code-hashCode method

The hashCode method is also a native method (unlike getClass it is not final, so subclasses can override it). It returns the hash code of the object, a value of type int. The keys of HashMap, which we will talk about later, are located through their hashCode. In essence, hashCode determines where an object is stored in a hash-based storage structure.

The designers of Java's collections implemented some of them with hash tables. A hash algorithm maps data directly to an address according to a specific algorithm, and the input to that mapping is the value produced by the hashCode method. This way, when a new element is added to the collection, its hashCode method is called first and the slot where it should be placed can be located in one step. (Redis is designed the same way; a later Redis article will discuss it in detail.)
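
To make the idea of locating a slot from hashCode concrete, here is a minimal sketch. It assumes a table with a fixed number of buckets and uses a plain modulo; HashMap's real index calculation is different and is left to the later article. BucketDemo and bucketIndex are hypothetical names.

```java
// Hypothetical demo class: maps a key to a bucket index via hashCode.
class BucketDemo {
    // floorMod keeps the result in [0, buckets) even for negative hash codes.
    static int bucketIndex(Object key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    public static void main(String[] args) {
        // "abc".hashCode() is 96354 by String's documented formula, and 96354 mod 16 = 2.
        System.out.println(bucketIndex("abc", 16)); // 2
        // Equal keys always produce the same index, so a lookup goes to the same slot.
        System.out.println(bucketIndex(new String("abc"), 16)); // 2
    }
}
```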

The generation and storage of hashCode involve a lot of knowledge about locks, which we will cover later in the article on concurrency. Here we only state the conclusions about hashCode generation.

  1. hashCode is not the memory address used directly; it is generated by an algorithm.
  2. The hashCode value is stored in the mark word, where it shares bits with the lock information, so it is tied to the lock state. With a biased lock: once hashCode is called, the bias is revoked and the hashCode is saved in the mark word. If the object must be locked at that point, it never returns to the biased state and is instead upgraded to a heavyweight lock. The hashCode then moves with the mark word into the C++ object monitor, and is read from there.

The implementation logic of hashCode is summarized in the figure below. The way the JVM generates hashCode can be changed with the parameter -XX:hashCode=1 (JDK 8 uses the last strategy by default).

[Figure: implementation logic of hashCode generation]
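
The first conclusion can be observed from plain Java. The sketch below uses System.identityHashCode, which returns the hash code Object.hashCode would produce even when a class overrides hashCode. It only shows that the value, once generated, stays fixed for the object; it does not inspect the mark word. IdentityHashDemo and Point are hypothetical names.

```java
// Hypothetical demo: the identity hash is generated once and then stays stable.
class IdentityHashDemo {
    // A class that overrides hashCode with a value derived from its fields.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public int hashCode() { return 31 * x + y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
    }

    // Asks for the identity hash twice and checks that the answers agree.
    static boolean identityHashIsStable(Object o) {
        int first = System.identityHashCode(o);
        int second = System.identityHashCode(o);
        return first == second;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        System.out.println(p.hashCode());            // 33, from the override
        System.out.println(identityHashIsStable(p)); // true
    }
}
```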

Array collection-ArrayList

One thing every newcomer knows: ArrayList is an implementation class of the List interface. Then why is it called an array collection? Because its underlying data structure is an array, and the add, delete, modify and query operations we use all go through the subscript index. To analyze this class, we start from its constructor.

Class Constructor - No-argument Constructor Method

ArrayList is used very often in day-to-day development. To understand its structure, we can simply create an ArrayList first.

List<String> list = new ArrayList<>();

A plain new with the no-argument constructor will do. Let's step into that constructor.

It is very simple: it assigns an empty array. Note that the array is empty, that is, its length is 0.

transient Object[] elementData;

private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

public ArrayList() {
  this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Adding elements for the first time and adding elements afterwards - add method

As the title says, I separate the first call to add from the subsequent calls. Why? Because the array set up by the no-argument constructor has length 0, and no element can be stored in an array of length 0. So on the first call a real backing array has to be allocated before the element is added; subsequent calls simply append without any extra work, until the array is full.

Let's look at the code, starting with the add method itself.

It first calls the ensureCapacityInternal method, then stores the incoming element in the elementData array and increments size.

private int size;

transient Object[] elementData; 

public boolean add(E e) {
  ensureCapacityInternal(size + 1);  // Increments modCount!!
  elementData[size++] = e;
  return true;
}

Let's look at the ensureCapacityInternal method again.

Here we need to look at both layers of method calls, first look at the calculateCapacity method.

private void ensureCapacityInternal(int minCapacity) {
  ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

It first checks whether the array passed in is the shared empty array assigned by the no-argument constructor. If so, it compares the requested minimum capacity (1 on our first add) with 10 and returns the larger one; otherwise it returns the requested minimum capacity unchanged.

private static final int DEFAULT_CAPACITY = 10;

private static int calculateCapacity(Object[] elementData, int minCapacity) {
  if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
    return Math.max(DEFAULT_CAPACITY, minCapacity);
  }
  return minCapacity;
}

Continue to look at the ensureExplicitCapacity method.

Here modCount is incremented first. Then the method checks whether the required minimum capacity is greater than the length of the current array. If it is, the array needs to be expanded and grow is called; if not, this step is skipped and control returns to the add method to finish adding the element.

private void ensureExplicitCapacity(int minCapacity) {
  modCount++;

  // overflow-conscious code
  if (minCapacity - elementData.length > 0)
    grow(minCapacity);
}
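
The modCount counter incremented here is what lets an iterator detect that the list was structurally changed behind its back. A minimal sketch of that effect, using only the public ArrayList API (ModCountDemo is a hypothetical name):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ModCountDemo {
    // Adds to the list while iterating over it and reports whether the iterator noticed.
    static boolean addDuringIterationFails() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            for (String s : list) {
                // add() increments modCount, so the iterator's next step sees a mismatch.
                list.add("x");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addDuringIterationFails()); // true
    }
}
```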

Array collection expansion method - grow method

Continue to look at the grow expansion method.

There is a bit of arithmetic here that looks vague at first glance, so let's walk through it with numbers:

  1. On the first add, minCapacity = 10 is passed in.
  2. The current array length is assigned to oldCapacity, so oldCapacity = 0.
  3. newCapacity = 0 + (0 >> 1). Shifting 0 to the right still gives 0, so newCapacity = 0.
  4. 0 - 10 < 0, so the first check sets newCapacity = 10.
  5. The second check, which tests whether the maximum array size is exceeded, is normally not entered.
  6. Finally, Arrays.copyOf is called. There is no need to step into it: it creates a new array and copies the elements of the original array into it.

If you are not sure how a right shift is calculated: convert the number to binary, add a 0 on the left and drop the rightmost bit. For example, 4 is 100 in binary; shifted right by one bit it becomes 010, that is 10, which is 2 in decimal. As for the 0 above, its binary value is 0, so no matter how far it is shifted to the right, it is still 0.
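
The right-shift arithmetic described above can be checked directly. This is just a small sketch; ShiftDemo and half are hypothetical names.

```java
// Hypothetical demo class for the >> operator used in grow().
class ShiftDemo {
    // Shifting right by one bit halves a non-negative int, rounding down.
    static int half(int n) {
        return n >> 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(4));      // 100
        System.out.println(Integer.toBinaryString(4 >> 1)); // 10
        System.out.println(half(4));  // 2
        System.out.println(half(0));  // 0
        System.out.println(half(15)); // 7
    }
}
```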

private void grow(int minCapacity) {
  // overflow-conscious code
  int oldCapacity = elementData.length;
  int newCapacity = oldCapacity + (oldCapacity >> 1);
  if (newCapacity - minCapacity < 0)
    newCapacity = minCapacity;
  if (newCapacity - MAX_ARRAY_SIZE > 0)
    newCapacity = hugeCapacity(minCapacity);
  // minCapacity is usually close to size, so this is a win:
  elementData = Arrays.copyOf(elementData, newCapacity);
}

To sum up: adding an element to an ArrayList first makes sure there is enough capacity, expanding if necessary, and then stores the element. The new capacity is the current array length plus that length shifted right by one bit, roughly 1.5 times the old capacity. With the no-argument constructor the capacity after the first add is 10, after the first expansion 15, after the second 22, and so on.
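
The capacity sequence can be reproduced with a few lines that repeat the arithmetic of grow. This sketch leaves out the MAX_ARRAY_SIZE check and assumes each expansion is triggered by a single add on a full array; GrowthDemo, nextCapacity and capacities are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that replays the capacity arithmetic of grow().
class GrowthDemo {
    // Same arithmetic as grow(), without the MAX_ARRAY_SIZE check.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return (newCapacity - minCapacity < 0) ? minCapacity : newCapacity;
    }

    // Capacities reached starting from the empty array; the first add requests 10.
    static List<Integer> capacities(int expansions) {
        List<Integer> result = new ArrayList<>();
        int capacity = nextCapacity(0, 10);
        result.add(capacity);
        for (int i = 0; i < expansions; i++) {
            // The array is full, so one more slot than the current capacity is needed.
            capacity = nextCapacity(capacity, capacity + 1);
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(4)); // [10, 15, 22, 33, 49]
    }
}
```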

Delete element - remove method

Deleting an element is also simple. It is done by subscript index: just pass in the index.

list1.remove(3);

Let's look at the remove method itself.

First, rangeCheck verifies that the index is within bounds, and the element at that index is read through elementData. Next, numMoved is computed to tell whether the element being removed is the last one. If it is not, all elements after it are moved forward by one position. For example, when element 2 is removed from the elements 1 to 10, the array after this step holds 1, 3 to 10, 10: the array length is unchanged, element 2 has been overwritten, the remaining elements have moved forward, and the last slot still holds its old value. Finally, the last slot is set to null and size is decreased by one.

public E remove(int index) {
  rangeCheck(index);

  modCount++;
  E oldValue = elementData(index);

  int numMoved = size - index - 1;
  if (numMoved > 0)
    System.arraycopy(elementData, index+1, elementData, index,
                     numMoved);
  elementData[--size] = null; // clear to let GC do its work

  return oldValue;
}
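
The shifting step can be replayed on a plain array. The sketch below performs the same System.arraycopy call on the elements 1 to 10 and records the array before and after the last slot is cleared; RemoveShiftDemo and snapshots are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class that replays the array shifting done by ArrayList.remove.
class RemoveShiftDemo {
    // Removes the element at index from the first size slots of data.
    // Returns the array contents after the copy and after the last slot is cleared.
    static String[] snapshots(Integer[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        String afterCopy = Arrays.toString(data);
        data[size - 1] = null; // clear the stale last slot
        String afterClear = Arrays.toString(data);
        return new String[] { afterCopy, afterClear };
    }

    public static void main(String[] args) {
        Integer[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        String[] result = snapshots(data, 10, 1); // index 1 holds the element 2
        System.out.println(result[0]); // [1, 3, 4, 5, 6, 7, 8, 9, 10, 10]
        System.out.println(result[1]); // [1, 3, 4, 5, 6, 7, 8, 9, 10, null]
    }
}
```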

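The method above relies on elementData(index) to obtain the element, which is the same call used for querying and is covered below. The rangeCheck method in the first step is the source of the most familiar exception of all, the index out-of-bounds exception. It only checks index >= size; a negative index is caught by the array access itself, which throws ArrayIndexOutOfBoundsException.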

private void rangeCheck(int index) {
  if (index >= size)
    throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}
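
Seen from the caller's side, the bounds check looks like this. The sketch only uses the public List API; RangeCheckDemo and getThrows are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing the out-of-bounds behaviour of ArrayList.get.
class RangeCheckDemo {
    // Returns true when reading the given index is rejected as out of bounds.
    static boolean getThrows(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass, so it is caught here too.
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(getThrows(list, 2));  // false, last valid index
        System.out.println(getThrows(list, 3));  // true, index == size
        System.out.println(getThrows(list, -1)); // true, negative index
    }
}
```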

Query element - get method

There is really not much to say about add, delete, modify and query; we use them all the time. A query calls the get method with a subscript index. Let's look at the code.

It first checks whether the index is out of bounds, using the rangeCheck method described above. Then it calls the elementData method, which we left unexplained earlier, to fetch the element.

public E get(int index) {
  rangeCheck(index);

  return elementData(index);
}

No surprises here: the backing store is an array, so the value is simply read from the array by index.

E elementData(int index) {
  return (E) elementData[index];
}

Modify method - set method

It is called with an index and the new value.

It first checks whether the index is out of bounds, then reads the old element, overwrites the slot with the new element and returns the old value.

public E set(int index, E element) {
  rangeCheck(index);

  E oldValue = elementData(index);
  elementData[index] = element;
  return oldValue;
}
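
A short usage sketch of set, showing that it hands back the element it replaced and leaves the size untouched. SetDemo and replaceAt are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for ArrayList.set.
class SetDemo {
    // Replaces the element at index and returns the previous one.
    static String replaceAt(List<String> list, int index, String value) {
        return list.set(index, value);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(replaceAt(list, 1, "B")); // b
        System.out.println(list);                    // [a, B, c]
        System.out.println(list.size());             // 3
    }
}
```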

Linked list collection-LinkedList

After ArrayList, let's talk about LinkedList. It is also an implementation class of the List interface, but unlike ArrayList it uses a doubly linked list as its data structure.

So what is a doubly linked list? To understand it you must first understand a linked list. The zookeeper series of articles mentioned a related concept, the task processing chain: when the current task thread finishes, it adds a piece of data to the queue polled by the next thread; when the next thread finds that data in its queue, it performs the real task processing, and so on.

In other words, a chain means that the current object is associated with the next object. In a linked list, each element has a next attribute pointing to the following element. A doubly linked list has, in addition to next, a prev attribute pointing to the preceding element, so the list can be traversed forwards from the first element or backwards from the last one. This makes it possible to start a lookup from whichever end is closer, which the rest of this article calls a binary query (it halves the traversal; it is not a binary search in the usual sense).

Class constructor - no parameter construction

Following the same order as for ArrayList, first look at the no-argument constructor of LinkedList, that is, a plain new. Interestingly, nothing is done here.

public LinkedList() {
}

Since nothing is done, how does it store elements? That depends on the object it stores, the Node object.

The item attribute stores the element and can hold any object, the next attribute stores the next node, and the prev attribute stores the previous node. All three are required when a Node is created; there is no no-argument constructor. Simply put, a LinkedList is built up from such objects one by one, so the no-argument constructor has nothing to do.

private static class Node<E> {
  E item;
  Node<E> next;
  Node<E> prev;

  Node(Node<E> prev, E element, Node<E> next) {
    this.item = element;
    this.next = next;
    this.prev = prev;
  }
}

Add element - add method

The method name is the same as in ArrayList, but the implementation is quite different.

The add method simply delegates to the linkLast method.

public boolean add(E e) {
  linkLast(e);
  return true;
}

This method first reads the current tail node (last) and then constructs the new Node. As the constructor above shows, its arguments are the previous node, the element and the next node, in that order. The new node immediately becomes last: LinkedList treats every newly added node as the tail until the next node is added. On the first insertion the old tail is null, so the check that follows sets the new node as the head node (first) as well. Finally the size of the collection is increased.

When add is called again for the second element, the old tail is not null, so the next attribute of that previous node is set to the new node.

void linkLast(E e) {
  final Node<E> l = last;
  final Node<E> newNode = new Node<>(l, e, null);
  last = newNode;
  if (l == null)
    first = newNode;
  else
    l.next = newNode;
  size++;
  modCount++;
}
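
To see the links form, here is a stripped-down sketch that keeps only what linkLast needs and can walk the result in both directions. TinyLinkedList, forward and backward are hypothetical names; this is an illustration of the linking logic, not the JDK class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, stripped-down doubly linked list that only supports appending.
class TinyLinkedList<E> {
    private static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;

    // Same steps as linkLast: the new node becomes the tail, and also the head if the list was empty.
    void add(E e) {
        Node<E> l = last;
        Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) {
            first = newNode;
        } else {
            l.next = newNode;
        }
    }

    // Walks from the head following next.
    List<E> forward() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = first; x != null; x = x.next) out.add(x.item);
        return out;
    }

    // Walks from the tail following prev.
    List<E> backward() {
        List<E> out = new ArrayList<>();
        for (Node<E> x = last; x != null; x = x.prev) out.add(x.item);
        return out;
    }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(list.forward());  // [a, b, c]
        System.out.println(list.backward()); // [c, b, a]
    }
}
```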

This is a bit confusing in words, so look at the structure diagram, which shows the data structure of LinkedList clearly. Because of this data structure, there is no such thing as expansion.

[Figure: structure diagram of LinkedList]

Delete element - remove method

Same method name as before; let's look at the code.

First, checkElementIndex is called. Like the rangeCheck method of ArrayList above, it checks whether the index is out of bounds. Then unlink is called. Note that there is also a node method here, which looks up the node at the given index.

public E remove(int index) {
  checkElementIndex(index);
  return unlink(node(index));
}

First look at how the node method gets the node at the given index.

This is where the binary query mentioned earlier comes in. The size of the list is first shifted right by one bit. If the current size is 5, the result of the shift is 2. If the index passed in is 3, that is, the fourth element is to be deleted, the node is searched for backwards starting from the tail node; if the index is 1, it is searched for forwards from the head node. That is the binary query: the list is split in the middle and only the half containing the index is traversed.

Node<E> node(int index) {
  // assert isElementIndex(index);

  if (index < (size >> 1)) {
    Node<E> x = first;
    for (int i = 0; i < index; i++)
      x = x.next;
    return x;
  } else {
    Node<E> x = last;
    for (int i = size - 1; i > index; i--)
      x = x.prev;
    return x;
  }
}
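
The direction choice and the number of links followed can be computed without building a list. The sketch below mirrors the condition and loop bounds of node(index); NodeLookupDemo, startsFromHead and hops are hypothetical names.

```java
// Hypothetical demo class that mirrors the index arithmetic of LinkedList.node(index).
class NodeLookupDemo {
    // True when the lookup walks forwards from the head node.
    static boolean startsFromHead(int index, int size) {
        return index < (size >> 1);
    }

    // Number of next/prev links followed before the node is reached.
    static int hops(int index, int size) {
        return startsFromHead(index, size) ? index : size - 1 - index;
    }

    public static void main(String[] args) {
        int size = 5; // size >> 1 is 2
        System.out.println(startsFromHead(1, size) + " " + hops(1, size)); // true 1
        System.out.println(startsFromHead(3, size) + " " + hops(3, size)); // false 1
        System.out.println(startsFromHead(2, size) + " " + hops(2, size)); // false 2
    }
}
```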

Look at the unlink method again.

It first reads the attributes of the node, then checks whether the node is the head node or the tail node. If the node is the head, first is moved to the next node; if it is the tail, last is moved to the previous node. Otherwise, the next attribute of the previous node is pointed at the next node, and the prev attribute of the next node is pointed at the previous node. In other words, the neighbours of the current node are linked to each other, and the node's own links are cleared.

Then the item attribute of the node is set to null and the collection size is decreased by one.

E unlink(Node<E> x) {
  // assert x != null;
  final E element = x.item;
  final Node<E> next = x.next;
  final Node<E> prev = x.prev;

  if (prev == null) {
    first = next;
  } else {
    prev.next = next;
    x.prev = null;
  }

  if (next == null) {
    last = prev;
  } else {
    next.prev = prev;
    x.next = null;
  }

  x.item = null;
  size--;
  modCount++;
  return element;
}
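
From the outside, the three cases of unlink (middle node, head node, tail node) can be observed with the public LinkedList API. UnlinkDemo and removeAt are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedList;

// Hypothetical demo class exercising LinkedList.remove(int) on middle, head and tail nodes.
class UnlinkDemo {
    // Removes the element at index and returns it.
    static String removeAt(LinkedList<String> list, int index) {
        return list.remove(index);
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(removeAt(list, 3)); // d, a middle node
        System.out.println(list);              // [a, b, c, e]
        System.out.println(removeAt(list, 0)); // a, the head node
        System.out.println(list.getFirst());   // b, first now points at the old second node
        System.out.println(removeAt(list, list.size() - 1)); // e, the tail node
        System.out.println(list.getLast());    // c
    }
}
```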

Query element - next method

Querying here is best not done by subscript index. The get method can of course be used, but querying by index is not recommended; iteration is preferred.

Let's take a quick look at iteration.

First get an iterator, then loop over it, which is the usual pattern. (The example uses descendingIterator, which walks the list from the tail to the head.)

Iterator<Object> objectIterator = linkedList.descendingIterator();

while (objectIterator.hasNext()) {
  Object next = objectIterator.next();
  System.out.println(next);
}
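
For comparison, the sketch below runs both iterators of a LinkedList over the same data: iterator follows next from the head, descendingIterator follows prev from the tail. IteratorDemo and drain are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class comparing the two iteration directions of LinkedList.
class IteratorDemo {
    // Collects everything the iterator yields, in the order it yields it.
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        System.out.println(drain(list.iterator()));           // [a, b, c]
        System.out.println(drain(list.descendingIterator())); // [c, b, a]
    }
}
```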

The get query relies on the node method, which was already covered under remove above, so I won't repeat it here. In fact, both the iterator and index-based lookup are implemented with the next and prev attributes of the Node object, which is relatively simple.

Modify method - set method

This one is simpler still. The node is obtained through the index, and its item attribute is modified directly. There is no need to read the code; it is a combination of the code above.

Summary

This article covers a lot of ground: the main methods and flows of the three classes. One more thing worth mentioning: when traversing an ArrayList you can use a plain for loop, because reading an array by subscript index is fast; iterators and the enhanced for loop are unnecessary, since the enhanced for loop is still an iterator underneath. For a LinkedList it is better to use an iterator or the enhanced for loop, because the next node can be obtained directly from the next attribute of the current node, with no need for a binary query on every access, so iteration is faster.
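
The difference for LinkedList can be put into numbers with a rough model rather than a benchmark. The sketch below counts how many node links are followed when every index is fetched with get(i), using the same halving rule as node(index), and compares it with a single pass that follows one link per step. TraversalCostDemo and its methods are hypothetical names, and the counts are a model of the work, not measured timings.

```java
// Hypothetical demo class modelling traversal cost on a linked list.
class TraversalCostDemo {
    // Total links followed when every index 0..size-1 is looked up separately.
    static long hopsWithIndexLoop(int size) {
        long total = 0;
        for (int i = 0; i < size; i++) {
            // Each lookup walks from whichever end is closer to index i.
            total += (i < (size >> 1)) ? i : size - 1 - i;
        }
        return total;
    }

    // Links between consecutive elements followed in one pass from head to tail.
    static long hopsWithIterator(int size) {
        return Math.max(0, size - 1);
    }

    public static void main(String[] args) {
        System.out.println(hopsWithIndexLoop(1000)); // 249500
        System.out.println(hopsWithIterator(1000));  // 999
    }
}
```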

That concludes the first part of the JDK source code analysis. Later we will analyze HashMap, Synchronized and ConcurrentHashMap. These three are more complicated and each gets its own article. Readers who need them can follow the links below.

Appendix JDK Source Code Analysis Series Articles

Date        Article
2022-04-25  Object, ArrayList and LinkedList source code analysis
2022-04-27  Detailed interview explanation and source code analysis of HashMap

Origin blog.csdn.net/qq_39339965/article/details/124590804