An In-Depth Source Code Analysis of Java Set Collections

Java's collection classes derive from the Collection and Map interfaces, in which:

  • List represents an ordered collection whose elements may repeat
  • Set represents an unordered collection whose elements may not repeat
  • Map stores key-value pairs

This article examines the unordered collection Set from the source code perspective.

HashSet

HashSet implements the Set interface and is backed by a hash table (actually a HashMap instance). It makes no guarantee about the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element. Consider the following example:

    HashSet<String> hs = new HashSet<String>();

    // Add elements
    hs.add("hello");
    hs.add("world");
    hs.add("java");
    hs.add("world");
    hs.add(null);
    
    // Iterate
    for (String str : hs) {
        System.out.println(str);
    }

Output:

null
world
java
hello

The output shows that HashSet allows null, rejects duplicate elements, and iterates in no particular order.
How does it ensure that elements are not repeated? To answer that, we need to look at its source.
First, the add() method of HashSet:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

This method calls the put() method of the map object. What is this map object?

private transient HashMap<E,Object> map;

As you can see, map is a HashMap. Next, look at the HashSet constructor:

public HashSet() {
    map = new HashMap<>();
}

At this point it is clear that HashSet is implemented on top of HashMap, so the put() call goes to HashMap's put() method. Let's look at that method:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

put() calls putVal(), so putVal() is where the real work happens:

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // If the table field is null or empty, allocate it via resize()
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // tab[i] is empty, so store the new node directly
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            // tab[i] is occupied, so handle the different cases
            Node<K,V> e; K k;
            // The head node p has the same hash and an equal key as the new element
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // If p is a red-black tree node, insert into the tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    // Reached the end of the list without finding the key: append a new node
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // When the list grows beyond 8 nodes, convert it to a red-black tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // If a node in the list has an equal key, leave the loop; e is the node to update and p is its predecessor
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    // Advance p to the next node
                    p = e;
                }
            }
            // Search finished: a non-null e means the key already exists
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // Overwrite if allowed, or if the old value is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount; // count of structural modifications
        // If the new size exceeds the threshold
        if (++size > threshold)
            // Double the table size and move the existing nodes into the new table
            resize();
        afterNodeInsertion(evict);
        return null;
    }

Let's walk through this code. It first assigns table to tab and checks whether tab is empty. table is the hash table itself, since HashMap is a hash-table-based implementation of the Map interface. If the table is empty, resize() is called to allocate it, the result is assigned to tab, and its length is assigned to n. Next, (n - 1) & hash computes the index i, and the element at tab[i] is read. If that slot is empty, the new node is stored there directly. If it is occupied, there are two cases:

  1. Duplicate key
  2. Position collision

If the key being added turns out to be a duplicate, p is assigned to e, where p is the element at the current position and e is the element that needs to be modified. A position collision falls into several situations:

  • The slot in the table array holds a singly linked list and no node in it has an equal key: the new node is appended to the end of the list
  • The slot holds a singly linked list and a node already in it has an equal key
  • The slot holds a red-black tree: a suitable position must be found in the tree for the new node

These three situations are told apart as follows. If p is a TreeNode instance (p instanceof TreeNode), a red-black tree hangs below p and a suitable position in the tree must be found for the new node. Otherwise p heads a singly linked list, which is walked node by node until its end is reached and the new node is appended there. If the list then holds more than 8 nodes, treeifyBin() is called to convert it to a red-black tree (treeifyBin() itself resizes the table instead when the table has fewer than 64 buckets). If a node in the list has an equal key, the loop exits; e is then the node to modify and p is its predecessor. Finally, the size after insertion is checked, and if it exceeds threshold the table is resized.
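The index computation (n - 1) & hash can be tried in isolation. The following is a minimal sketch rather than the JDK code itself: the class name BucketIndexSketch and the method names spread and indexFor are made up for this illustration, while the bit-spreading step follows the hash() helper of JDK 8's HashMap, which XORs the upper 16 bits of hashCode() into the lower 16.

```java
final class BucketIndexSketch {

    // Same bit-spreading idea as JDK 8's HashMap.hash(): fold the high
    // 16 bits into the low 16 bits; a null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // For a power-of-two table length n, (n - 1) & hash keeps only the
    // low bits of the hash, so the result is always in the range [0, n).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16; // HashMap's default initial capacity
        for (String key : new String[] {"hello", "world", "java", null}) {
            int hash = spread(key);
            System.out.println(key + " -> bucket " + indexFor(hash, n));
        }
    }
}
```

Because the table length is always a power of two, the bit mask gives the same result as a non-negative modulo but avoids a division, and it also works for negative hash values.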

This is how HashMap stores entries from JDK 1.8 onward: an array + linked list + red-black tree structure. Before 1.8, HashMap used an array + linked list.
So how does HashSet guarantee that elements are unique? The key lies in this check:

if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))

It first compares the hashCode() values. If they are the same, it goes on to check equals(). If that also returns true, the element is a duplicate: the loop breaks and the element is not added. Otherwise it is added. So when you want to store instances of a custom class in a HashSet correctly, you should override both equals() and hashCode(). The String class already overrides these two methods, which is why duplicate strings are dropped and only one is kept.
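To see how little HashSet itself has to do, the delegation can be reproduced in a few lines. This is a simplified sketch under stated assumptions, not the JDK class: MiniHashSet is a hypothetical name, and only add(), contains() and size() are modelled. The PRESENT dummy value plays the same role as the field of that name in HashSet.

```java
import java.util.HashMap;

final class MiniHashSet<E> {

    // Shared dummy value stored for every key, like PRESENT in HashSet.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value for the key, or null if there was
    // none, so a null result means the element was newly added.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("world")); // true: new element
        System.out.println(set.add("world")); // false: an equal key is already present
        System.out.println(set.add(null));    // true: a null key is allowed
        System.out.println(set.size());       // 2
    }
}
```

The uniqueness guarantee therefore comes entirely from HashMap's key handling, which is why the hashCode() and equals() of the element type matter so much.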

Next, consider the following example.
A custom Student class:

public class Student {

    private String name;
    private int age;
    
    public Student(String name, int age) {
        this.name = name;
        this.age = age;
    }
    
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
    
    @Override
    public String toString() {
        return "Student [name=" + name + ", age=" + age + "]";
    }
}

Then write test code:

    HashSet<Student> hs = new HashSet<Student>();
    // Add elements
    Student s = new Student("刘德华",30);
    Student s2 = new Student("陈奕迅",31);
    Student s3 = new Student("周星驰",32);
    Student s4 = new Student("刘德华",30);
    
    hs.add(s);
    hs.add(s2);
    hs.add(s3);
    hs.add(s4);
    // Iterate
    for (Student student : hs) {
        System.out.println(student);
    }

In the code above, s4 and s have the same name and age. One would expect them to count as identical objects that cannot both exist in a HashSet, but look at the result:

Student [name=周星驰, age=32]
Student [name=刘德华, age=30]
Student [name=陈奕迅, age=31]
Student [name=刘德华, age=30]

If the earlier source analysis is clear, the reason is easy to see: we have not overridden hashCode() and equals(), so the default implementations from Object are used, and every Student object is considered different. Let's override these two methods now:

    @Override
    public int hashCode() {
        return 0;
    }

    @Override
    public boolean equals(Object obj) {
        // Print statement added to show how many comparisons are made
        System.out.println(this + "---" + obj);
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof Student)) {
            return false;
        }

        Student s = (Student) obj;
        return this.name.equals(s.name) && this.age == s.age;
    }

Then we run the program:

Student [name=陈奕迅, age=31]---Student [name=刘德华, age=30]
Student [name=周星驰, age=32]---Student [name=刘德华, age=30]
Student [name=周星驰, age=32]---Student [name=陈奕迅, age=31]
Student [name=刘德华, age=30]---Student [name=刘德华, age=30]
Student [name=刘德华, age=30]
Student [name=陈奕迅, age=31]
Student [name=周星驰, age=32]

The duplicate element has been removed, but far too many comparisons were made. Because hashCode() returns the fixed value 0, the hash codes always match and equals() has to be called many times. We should therefore make the hash codes differ as much as possible. What should the hash code depend on?
It should depend on the values of the object's member variables, so we can apply the following rule:
for a primitive-type field, add its value directly;
for a reference-type field, add its hash code.
So hashCode() is modified as follows:

@Override
public int hashCode() {
    // To reduce the chance that two different objects accidentally produce the same hashCode, the primitive field age is multiplied by a constant
    return this.name.hashCode() + this.age * 15;
}

Now run to see results:

Student [name=刘德华, age=30]---Student [name=刘德华, age=30]
Student [name=周星驰, age=32]
Student [name=刘德华, age=30]
Student [name=陈奕迅, age=31]

The duplicate element is still removed, but the number of comparisons has dropped, which greatly improves the program's efficiency.
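A common alternative to hand-written arithmetic, not used in the original example, is to build both methods from the java.util.Objects helpers. The sketch below assumes a hypothetical class HashedStudent standing in for Student; Objects.hash and Objects.equals are standard library methods available since Java 7.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class HashedStudent {
    private final String name;
    private final int age;

    HashedStudent(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public int hashCode() {
        // Combines both fields; Objects.hash tolerates a null name.
        return Objects.hash(name, age);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof HashedStudent)) {
            return false;
        }
        HashedStudent other = (HashedStudent) obj;
        // Objects.equals is null-safe, unlike name.equals(other.name).
        return age == other.age && Objects.equals(name, other.name);
    }

    public static void main(String[] args) {
        Set<HashedStudent> set = new HashSet<>();
        set.add(new HashedStudent("liudehua", 30));
        set.add(new HashedStudent("chenyixun", 31));
        set.add(new HashedStudent("liudehua", 30)); // duplicate, rejected
        System.out.println(set.size()); // 2
    }
}
```

Whichever formula is chosen, the rule to keep is that two objects that are equal according to equals() must return the same hashCode().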

LinkedHashSet

LinkedHashSet is a hash table and linked list implementation of the Set interface with predictable iteration order. All of its methods are inherited from its parent class HashSet; the only difference from HashSet is that its iteration order is predictable: elements are retrieved in the same order in which they were stored. An example:

    LinkedHashSet<String> linkedHashSet = new LinkedHashSet<String>();
    // Add elements
    linkedHashSet.add("hello");
    linkedHashSet.add("world");
    linkedHashSet.add("java");
    // Iterate
    for (String str : linkedHashSet) {
        System.out.println(str);
    }

Output:

hello
world
java
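One detail worth checking is what happens when an element is added a second time. According to the LinkedHashSet documentation, re-inserting an element does not affect its position. The following small sketch illustrates this; the class name LinkedHashSetOrderSketch and the helper iterationOrder are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class LinkedHashSetOrderSketch {

    // Adds the given elements one by one and returns the iteration order.
    static List<String> iterationOrder(String... elements) {
        LinkedHashSet<String> set = new LinkedHashSet<>();
        for (String element : elements) {
            set.add(element);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // "hello" is added again at the end but keeps its first position.
        System.out.println(iterationOrder("hello", "world", "java", "hello"));
        // prints [hello, world, java]
    }
}
```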

TreeSet

TreeSet is a NavigableSet implementation based on TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
For example:

    TreeSet<Integer> treeSet = new TreeSet<Integer>();
    // Add elements
    treeSet.add(10);
    treeSet.add(26);
    treeSet.add(20);
    treeSet.add(13);
    treeSet.add(3);
    // Iterate
    for(Integer i : treeSet) {
        System.out.println(i);
    }

Output:

3
10
13
20
26

As you can see, TreeSet sorts its elements. Note that if you create a TreeSet with the no-argument constructor, it uses the natural ordering of the elements by default; you can also pass a comparator when constructing a TreeSet.
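The two orderings can be compared side by side. The sketch below is only an illustration: the class TreeSetOrderingSketch and its helper sorted are hypothetical, while Comparator.naturalOrder() and Comparator.reverseOrder() are standard Java 8 methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class TreeSetOrderingSketch {

    // Builds a TreeSet with the given comparator and returns its contents
    // in iteration order.
    static List<Integer> sorted(Comparator<Integer> comparator, Integer... values) {
        TreeSet<Integer> set = new TreeSet<>(comparator);
        set.addAll(Arrays.asList(values));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // Explicit natural ordering: same result as the no-argument constructor.
        System.out.println(sorted(Comparator.naturalOrder(), 10, 26, 20, 13, 3));
        // prints [3, 10, 13, 20, 26]

        // Comparator ordering: here simply the reverse of the natural ordering.
        System.out.println(sorted(Comparator.reverseOrder(), 10, 26, 20, 13, 3));
        // prints [26, 20, 13, 10, 3]
    }
}
```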
How does it implement natural ordering of the elements? Let's analyze the source code.
First, look at its add() method:

public boolean add(E e) {
    return m.put(e, PRESENT)==null;
}

Internally it calls the put() method of the object m, which is a NavigableMap:

private transient NavigableMap<E,Object> m;

Following put() further, we find that it is an abstract method:

V put(K key, V value);

This method is declared in the Map interface, so we need to find the implementing class. We know that TreeSet is based on TreeMap, so the put() it actually calls should be TreeMap's. TreeMap's inheritance structure confirms this:

java.util 
Class TreeMap<K,V>
java.lang.Object
  extended by java.util.AbstractMap<K,V>
      extended by java.util.TreeMap<K,V>
Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values
All Implemented Interfaces:
Serializable, Cloneable, Map<K,V>, NavigableMap<K,V>, SortedMap<K,V>

TreeMap does implement the NavigableMap interface, so let's look at TreeMap's put() method:

    public V put(K key, V value) {
        Entry<K,V> t = root;
        // Create the root node of the tree
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        // Check whether a comparator was supplied
        if (cpr != null) {
            // Order by comparator
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
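            // Check whether the key is null
            if (key == null)
                // Null keys are not allowed with natural ordering
                throw new NullPointerException();
            @SuppressWarnings("unchecked")
                // Cast the key to Comparable
                Comparable<? super K> k = (Comparable<? super K>) key;
            do {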
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }

Let's analyze.
First, it checks whether the Entry variable t is null. At the very beginning it is, so an Entry object is created. TreeMap is implemented as a red-black tree, so this actually creates the root of the tree. Next, it checks whether a comparator is present. Because we created the TreeSet with the no-argument constructor, there is no comparator here, so the else block is executed, where we find this line of code:

Comparable<? super K> k = (Comparable<? super K>) key;

From our earlier program we know that key here is the Integer object we passed in. How can an Integer be cast to Comparable? The documentation for Comparable explains that it is an interface that imposes a total ordering on the objects of each class that implements it. This ordering is called the class's natural ordering, and the class's compareTo method is called its natural comparison method. The Integer class implements Comparable, so an Integer can be upcast to Comparable. The compareTo() method is then called on that object. It returns an int: 0 if this Integer is equal to the argument Integer; a value less than 0 if this Integer is numerically less than the argument; and a value greater than 0 if this Integer is numerically greater than the argument (signed comparison). The return value therefore tells us how the two numbers compare. If it is less than 0, the search continues to the left (t.left); if it is greater than 0, it continues to the right (t.right). This may sound abstract, so a diagram helps:
[Figure: the binary tree built by inserting 10, 26, 20, 13 and 3 in that order]
This is the storage rule of a binary search tree. The first element becomes the root node, and each subsequent element is compared with the root: if it is greater than the root it becomes the right child, and if it is smaller it becomes the left child. If that position is already occupied, the new element is compared with the element there in the same way, becoming its right child if larger and its left child if smaller, and so on. (If the elements are equal, the new one is not stored.) For the example above, 10 becomes the root, 26 goes to its right, 20 goes to the left of 26, 13 goes to the left of 20, and 3 goes to the left of 10. Note that TreeMap additionally rebalances the tree after each insertion (fixAfterInsertion), so the actual red-black tree may have a different shape, but the rule that smaller elements sit to the left and larger ones to the right still holds.
So how are the elements taken out? Anyone who has studied data structures knows there are three ways to traverse a binary tree:

  1. Preorder traversal
  2. Inorder traversal
  3. Postorder traversal

Our example retrieves the elements with an inorder traversal (left, middle, right):
Start from the root, 10, and look at its left child, 3. Since 3 has no children, 3 is taken out first. The left side is now done, so we take the middle, which is 10. Then we move to the right, 26. Because 26 has a left child, 20, we go there first; 20 in turn has a left child, 13, so 13 is the third element taken out. That finishes the children of 20, so we take the middle, 20, and finally 26. The elements are therefore taken out in the order 3, 10, 13, 20, 26, which completes the sorting.
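The insertion rule and the inorder traversal can be put together in a small program. This is a deliberately simplified sketch: BstSketch is a hypothetical class that stores plain int keys in an unbalanced binary search tree, so unlike TreeMap it performs no red-black rebalancing. The add() loop is modelled on the comparison loop in TreeMap.put().

```java
import java.util.ArrayList;
import java.util.List;

final class BstSketch {

    private static final class Node {
        final int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    // Walks down from the root: go left when the new key is smaller, right
    // when it is larger, and stop when an equal key is found.
    boolean add(int key) {
        if (root == null) {
            root = new Node(key);
            return true;
        }
        Node t = root;
        Node parent;
        int cmp;
        do {
            parent = t;
            cmp = Integer.compare(key, t.key);
            if (cmp < 0) {
                t = t.left;
            } else if (cmp > 0) {
                t = t.right;
            } else {
                return false; // duplicate: not stored
            }
        } while (t != null);
        if (cmp < 0) {
            parent.left = new Node(key);
        } else {
            parent.right = new Node(key);
        }
        return true;
    }

    // Inorder traversal: left subtree, then the node, then the right subtree.
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<Integer> out) {
        if (node == null) {
            return;
        }
        collect(node.left, out);
        out.add(node.key);
        collect(node.right, out);
    }

    public static void main(String[] args) {
        BstSketch tree = new BstSketch();
        for (int value : new int[] {10, 26, 20, 13, 3}) {
            tree.add(value);
        }
        System.out.println(tree.inOrder()); // prints [3, 10, 13, 20, 26]
    }
}
```

Rebalancing changes the shape of the tree but never the left-smaller, right-larger invariant, so an inorder traversal of the real red-black tree yields the same sorted sequence.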

That covers natural ordering. Next is comparator ordering.
Using the Student class from before, we write the test code:

    TreeSet<Student> treeSet = new TreeSet<Student>();
    // Add elements
    Student s = new Student("liudehua", 30);
    Student s2 = new Student("chenyixun", 32);
    Student s3 = new Student("zhourunfa", 20);
    Student s4 = new Student("gutianle", 40);
    Student s5 = new Student("zhouxingchi", 29);
    
    treeSet.add(s);
    treeSet.add(s2);
    treeSet.add(s3);
    treeSet.add(s4);
    treeSet.add(s5);
    // Iterate
    for (Student student : treeSet) {
        System.out.println(student);
    }

At this point the program fails at runtime with a ClassCastException, because the Student class does not implement the Comparable interface.
One remedy is to pass a Comparator object to the TreeSet constructor. Comparator is an interface, so we write a custom class that implements it. Suppose the requirement is to sort by name length:

public class MyComparator implements Comparator<Student> {

    @Override
    public int compare(Student o1, Student o2) {
        // Compare by name length first
        int num = o1.getName().length() - o2.getName().length();
        // Then by name content
        int num2 = num == 0 ? o1.getName().compareTo(o2.getName()) : num;
        // Then by age
        int num3 = num2 == 0 ? o1.getAge() - o2.getAge() : num2;
        return num3;
    }
}

Write test code:

    TreeSet<Student> treeSet = new TreeSet<Student>(new MyComparator());
    // Add elements
    Student s = new Student("liudehua", 30);
    Student s2 = new Student("chenyixun", 32);
    Student s3 = new Student("zhourunfa", 20);
    Student s4 = new Student("gutianle", 40);
    Student s5 = new Student("zhouxingchi", 29);
        
    treeSet.add(s);
    treeSet.add(s2);
    treeSet.add(s3);
    treeSet.add(s4);
    treeSet.add(s5);
        
    // Iterate
    for (Student student : treeSet) {
        System.out.println(student);
    }

Output:

Student [name=gutianle, age=40]
Student [name=liudehua, age=30]
Student [name=chenyixun, age=32]
Student [name=zhourunfa, age=20]
Student [name=zhouxingchi, age=29]

The same can also be done with an anonymous inner class.
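A sketch of the anonymous inner class variant follows. To keep it self-contained, it declares a minimal nested Student with only the two getters; that nested class and the names AnonymousComparatorSketch and sortedNames are assumptions made for this illustration. The comparison logic mirrors MyComparator but uses Integer.compare instead of subtraction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class AnonymousComparatorSketch {

    // Minimal stand-in for the article's Student class.
    static final class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() {
            return name;
        }

        int getAge() {
            return age;
        }
    }

    static List<String> sortedNames() {
        // The comparator is defined inline instead of in a separate class.
        TreeSet<Student> treeSet = new TreeSet<>(new Comparator<Student>() {
            @Override
            public int compare(Student o1, Student o2) {
                int byLength = Integer.compare(o1.getName().length(), o2.getName().length());
                if (byLength != 0) {
                    return byLength;
                }
                int byName = o1.getName().compareTo(o2.getName());
                if (byName != 0) {
                    return byName;
                }
                return Integer.compare(o1.getAge(), o2.getAge());
            }
        });
        treeSet.add(new Student("liudehua", 30));
        treeSet.add(new Student("chenyixun", 32));
        treeSet.add(new Student("zhourunfa", 20));
        treeSet.add(new Student("gutianle", 40));
        treeSet.add(new Student("zhouxingchi", 29));

        List<String> names = new ArrayList<>();
        for (Student student : treeSet) {
            names.add(student.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames());
        // prints [gutianle, liudehua, chenyixun, zhourunfa, zhouxingchi]
    }
}
```

On Java 8 or later, the same ordering could also be written as Comparator.comparingInt((Student s) -> s.getName().length()).thenComparing(Student::getName).thenComparingInt(Student::getAge).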
I hope this article gives you a deeper understanding of Set collections.

Origin www.cnblogs.com/blizzawang/p/11411617.html