Java Collection Classes (4): TreeSet

TreeSet

 

Collection.png

 

TreeSet is a sorted Set implementation. It extends the AbstractSet abstract class and implements the NavigableSet<E>, Cloneable, and Serializable interfaces. TreeSet is built on top of TreeMap. Its elements can be ordered in two ways: by their natural ordering, or by a Comparator supplied when the set is created.

 

TreeSet.png

 

(1) TreeSet extends AbstractSet and implements the NavigableSet interface.
(2) TreeSet is a sorted collection with no duplicate elements, and it is implemented on top of TreeMap. A TreeSet holds a member variable m of type NavigableMap, and m is in practice an instance of TreeMap.
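
As a minimal sketch of these two properties (sorted, no duplicates) and of the two ordering modes, the following example uses Integer elements. The class name TreeSetBasics and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetBasics {
    // Natural ordering: Integer implements Comparable<Integer>.
    static TreeSet<Integer> naturalOrder() {
        // The duplicates 1 and 5 are dropped, and the rest is kept sorted.
        return new TreeSet<>(Arrays.asList(5, 1, 3, 1, 5));
    }

    // Comparator ordering: the supplied Comparator reverses the natural order.
    static TreeSet<Integer> reverseOrder() {
        TreeSet<Integer> set = new TreeSet<>(Comparator.<Integer>reverseOrder());
        set.addAll(Arrays.asList(5, 1, 3, 1, 5));
        return set;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrder()); // [1, 3, 5]
        System.out.println(reverseOrder()); // [5, 3, 1]
    }
}
```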

TreeSet usage

 

public static void demoOne() {
        TreeSet<Person> ts = new TreeSet<>();
        ts.add(new Person("张三", 11));
        ts.add(new Person("李四", 12));
        ts.add(new Person("王五", 15));
        ts.add(new Person("赵六", 21));
        
        System.out.println(ts);
    }

Execution result: a java.lang.ClassCastException is thrown.
A type conversion has failed: TreeSet tried to cast Person to Comparable. The TreeSet has to be told how to compare its elements; if no comparison rule is specified, this exception is thrown.
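
The failure can be reproduced in isolation with a small sketch. Badge is a hypothetical class that, like the Person above, does not implement Comparable. Two elements are added inside the try block because the exact call that throws may differ between JDK versions.

```java
import java.util.TreeSet;

class UncomparableDemo {
    // Hypothetical element type that does not implement Comparable.
    static final class Badge {
        final int id;
        Badge(int id) { this.id = id; }
    }

    // Returns true if adding Badge objects to a TreeSet without a Comparator fails.
    static boolean addThrowsClassCastException() {
        TreeSet<Badge> ts = new TreeSet<>();
        try {
            ts.add(new Badge(1));
            ts.add(new Badge(2));
            return false;
        } catch (ClassCastException e) {
            // TreeSet could not cast Badge to Comparable.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addThrowsClassCastException()); // true
    }
}
```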

How to solve:
To specify the comparison rule, implement the Comparable interface in the custom class (Person) and override its compareTo method.

 

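public class Person implements Comparable<Person> {
    private String name;
    private int age;
    ...
    @Override
    public int compareTo(Person o) {
        // Only one of the following return statements can be active at a time:
        return 0;        // always 0: the set keeps only one element
        // return 1;     // always positive: elements are read back in insertion order
        // return -1;    // always negative: elements are read back in reverse insertion order
    }
}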

Why does returning 0 keep only one element, returning -1 store the elements in reverse order, and returning 1 return them in the order they were added? The reason is that a TreeSet is backed by a binary search tree (a red-black tree). Every time a new element is inserted (except the first one), compareTo() is called to compare it with elements already in the tree, starting from the root, and the result decides where the new element is placed.

  1. If compareTo() always returns 0, every comparison reports that the new element equals an existing one, so no element other than the first is inserted. Only the first element exists in the TreeSet.
  2. If compareTo() always returns 1, the new element is considered larger than every element it is compared with, so it is always placed on the right side of the tree, and the elements are read back in insertion order.
  3. If compareTo() always returns -1, the new element is considered smaller than every element it is compared with, so it is always placed on the left side of the tree, and the elements are read back in reverse insertion order.
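
A constant return value is only useful for seeing how the tree reacts. A realistic compareTo() compares actual fields. The following is a minimal sketch, assuming the elements should be ordered by age with the name as a tie-breaker; PersonByAge is a stand-in class invented for this example, not the Person class above.

```java
import java.util.TreeSet;

class PersonByAge implements Comparable<PersonByAge> {
    private final String name;
    private final int age;

    PersonByAge(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public int compareTo(PersonByAge other) {
        int byAge = Integer.compare(age, other.age);             // primary key: age
        return byAge != 0 ? byAge : name.compareTo(other.name);  // tie-breaker: name
    }

    @Override
    public String toString() {
        return name + "(" + age + ")";
    }

    static TreeSet<PersonByAge> demo() {
        TreeSet<PersonByAge> ts = new TreeSet<>();
        ts.add(new PersonByAge("Zhang", 21));
        ts.add(new PersonByAge("Li", 12));
        ts.add(new PersonByAge("Wang", 15));
        ts.add(new PersonByAge("Li", 12)); // compareTo returns 0, so this one is rejected
        return ts;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [Li(12), Wang(15), Zhang(21)]
    }
}
```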

Sample code. Requirement: make the TreeSet order its String elements by their length.

 

// Define a class that implements the Comparator interface and overrides compare()
class CompareByLength implements Comparator<String> {

    @Override
    public int compare(String s1, String s2) {        // compare strings by length
        int num = s1.length() - s2.length();          // primary condition: length
        return num == 0 ? s1.compareTo(s2) : num;     // secondary condition: content
    }
}

 

 public static void demoTwo() {

        // Requirement: sort the strings by length
        TreeSet<String> ts = new TreeSet<>(new CompareByLength());        // Comparator c = new CompareByLength();
        ts.add("aaaaaaaa");
        ts.add("z");
        ts.add("wc");
        ts.add("nba");
        ts.add("cba");
        
        System.out.println(ts);
    }
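
On Java 8 or later, the same ordering can also be built from the Comparator factory methods instead of a dedicated class. This is an alternative sketch rather than the approach used above; LengthOrderDemo and sortByLength are names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

class LengthOrderDemo {
    static TreeSet<String> sortByLength(String... words) {
        // Primary condition: length. Secondary condition: content.
        Comparator<String> byLengthThenText =
                Comparator.comparingInt(String::length)
                          .thenComparing(Comparator.<String>naturalOrder());
        TreeSet<String> ts = new TreeSet<>(byLengthThenText);
        ts.addAll(Arrays.asList(words));
        return ts;
    }

    public static void main(String[] args) {
        System.out.println(sortByLength("aaaaaaaa", "z", "wc", "nba", "cba"));
        // [z, wc, cba, nba, aaaaaaaa]
    }
}
```

Without the secondary condition, "nba" and "cba" would compare as equal, and the second of them would be treated as a duplicate and dropped.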

Part of the source code of TreeSet:

 

package java.util;

public class TreeSet<E> extends AbstractSet<E>
    implements NavigableSet<E>, Cloneable, java.io.Serializable
{
    // The keys of a NavigableMap store the elements of the Set
    private transient NavigableMap<E,Object> m;

    // PRESENT is the value stored for every key in the Map
    private static final Object PRESENT = new Object();

    // No-argument constructor. Creates an empty TreeMap that uses natural ordering,
    // then creates the TreeSet on top of that TreeMap.
    // The keys of the TreeMap store the elements of the Set.
    public TreeSet() {
        this(new TreeMap<E,Object>());
    }

    // Assigns the given map to the NavigableMap field m
    TreeSet(NavigableMap<E,Object> m) {
        this.m = m;
    }

    // Creates a new TreeMap ordered by the given comparator, then creates the TreeSet on top of it.
    // The keys of the TreeMap store the elements of the Set.
    public TreeSet(Comparator<? super E> comparator) {
        this(new TreeMap<E,Object>(comparator));
    }

    // Creates a TreeSet and adds all elements of collection c to it
    public TreeSet(Collection<? extends E> c) {
        this();
        // Add all elements of collection c to the TreeSet
        addAll(c);
    }

    // Creates a TreeSet with the same ordering as s and adds all elements of s to it
    public TreeSet(SortedSet<E> s) {
        this(s.comparator());
        addAll(s);
    }

    // Returns an iterator over the TreeSet in ascending order.
    // Because TreeSet is implemented with TreeMap, this is the iterator of the TreeMap's key set.
    public Iterator<E> iterator() {
        return m.navigableKeySet().iterator();
    }

    // Returns an iterator over the TreeSet in descending order.
    // Because TreeSet is implemented with TreeMap, this is the iterator of the TreeMap's descending key set.
    public Iterator<E> descendingIterator() {
        return m.descendingKeySet().iterator();
    }

    // Returns the size of the TreeSet
    public int size() {
        return m.size();
    }

    // Returns whether the TreeSet is empty
    public boolean isEmpty() {
        return m.isEmpty();
    }

    // Returns whether the TreeSet contains the object o
    public boolean contains(Object o) {
        return m.containsKey(o);
    }

    // Adds e to the TreeSet
    public boolean add(E e) {
        return m.put(e, PRESENT)==null;
    }

    // Removes the object o from the TreeSet
    public boolean remove(Object o) {
        return m.remove(o)==PRESENT;
    }

    // Removes all elements from the TreeSet
    public void clear() {
        m.clear();
    }

    // Adds all elements of collection c to the TreeSet
    public  boolean addAll(Collection<? extends E> c) {
        // Use linear-time version if applicable
        if (m.size()==0 && c.size() > 0 &&
            c instanceof SortedSet &&
            m instanceof TreeMap) {
            // Cast collection c to a SortedSet
            SortedSet<? extends E> set = (SortedSet<? extends E>) c; 
            // Cast m to a TreeMap
            TreeMap<E,Object> map = (TreeMap<E, Object>) m;
            Comparator<? super E> cc = (Comparator<? super E>) set.comparator();
            Comparator<? super E> mc = map.comparator();
            // If the two comparators cc and mc are equal
            if (cc==mc || (cc != null && cc.equals(mc))) {
                // Add all elements of the collection as keys of the TreeMap
                map.addAllForTreeSet(set, PRESENT);
                return true;
            }
        }
        return super.addAll(c);
    }

    // Returns a sub-set; this is implemented through subMap() of the backing map.
    public NavigableSet<E> subSet(E fromElement, boolean fromInclusive,
                                  E toElement,   boolean toInclusive) {
        return new TreeSet<E>(m.subMap(fromElement, fromInclusive,
                                       toElement,   toInclusive));
    }

    // Returns the head of the Set, from the first element up to toElement.
    // inclusive indicates whether toElement itself is included
    public NavigableSet<E> headSet(E toElement, boolean inclusive) {
        return new TreeSet<E>(m.headMap(toElement, inclusive));
    }

    // Returns the tail of the Set, from fromElement to the last element.
    // inclusive indicates whether fromElement itself is included
    public NavigableSet<E> tailSet(E fromElement, boolean inclusive) {
        return new TreeSet<E>(m.tailMap(fromElement, inclusive));
    }

    // Returns a sub-set, from fromElement (inclusive) to toElement (exclusive).
    public SortedSet<E> subSet(E fromElement, E toElement) {
        return subSet(fromElement, true, toElement, false);
    }

    // Returns the head of the Set, from the first element to toElement (exclusive).
    public SortedSet<E> headSet(E toElement) {
        return headSet(toElement, false);
    }

    // Returns the tail of the Set, from fromElement (inclusive) to the last element.
    public SortedSet<E> tailSet(E fromElement) {
        return tailSet(fromElement, true);
    }

    // Returns the comparator of the Set
    public Comparator<? super E> comparator() {
        return m.comparator();
    }

    // Returns the first element of the Set
    public E first() {
        return m.firstKey();
    }

    // Returns the last element of the Set
    public E last() {
        return m.lastKey();
    }

    // Returns the greatest element in the Set that is less than e
    public E lower(E e) {
        return m.lowerKey(e);
    }

    // Returns the greatest element in the Set that is less than or equal to e
    public E floor(E e) {
        return m.floorKey(e);
    }

    // Returns the least element in the Set that is greater than or equal to e
    public E ceiling(E e) {
        return m.ceilingKey(e);
    }

    // Returns the least element in the Set that is greater than e
    public E higher(E e) {
        return m.higherKey(e);
    }

    // Retrieves the first element and removes it from the TreeMap.
    public E pollFirst() {
        Map.Entry<E,?> e = m.pollFirstEntry();
        return (e == null)? null : e.getKey();
    }

    // Retrieves the last element and removes it from the TreeMap.
    public E pollLast() {
        Map.Entry<E,?> e = m.pollLastEntry();
        return (e == null)? null : e.getKey();
    }

    // Clones the TreeSet and returns the copy as an Object
    public Object clone() {
        TreeSet<E> clone = null;
        try {
            clone = (TreeSet<E>) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new InternalError();
        }

        clone.m = new TreeMap<E,Object>(m);
        return clone;
    }

    // Write method for java.io.Serializable.
    // Writes the comparator, the size, and every element of the TreeSet to the output stream
    private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException {
        s.defaultWriteObject();

        // Write the comparator
        s.writeObject(m.comparator());

        // Write the size
        s.writeInt(m.size());

        // Write every element of the TreeSet
        for (Iterator i=m.keySet().iterator(); i.hasNext(); )
            s.writeObject(i.next());
    }

    // Read method for java.io.Serializable: reads in the same order as the write.
    // Reads the comparator, the size, and every element of the TreeSet in turn
    private void readObject(java.io.ObjectInputStream s)
        throws java.io.IOException, ClassNotFoundException {
        // Read in any hidden stuff
        s.defaultReadObject();

        // Read the comparator of the TreeSet from the input stream
        Comparator<? super E> c = (Comparator<? super E>) s.readObject();

        TreeMap<E,Object> tm;
        if (c==null)
            tm = new TreeMap<E,Object>();
        else
            tm = new TreeMap<E,Object>(c);
        m = tm;

        // Read the size of the TreeSet from the input stream
        int size = s.readInt();

        // Read all elements of the TreeSet from the input stream
        tm.readTreeSet(size, s, PRESENT);
    }

    // Serialization version number of the TreeSet
    private static final long serialVersionUID = -2479143000061671589L;
}
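
The navigation and range methods in this listing are easiest to follow with concrete values. The following sketch calls them on a small set of integers; NavigationDemo and sample() are illustrative names.

```java
import java.util.Arrays;
import java.util.TreeSet;

class NavigationDemo {
    static TreeSet<Integer> sample() {
        return new TreeSet<>(Arrays.asList(10, 20, 30, 40));
    }

    public static void main(String[] args) {
        TreeSet<Integer> ts = sample();
        System.out.println(ts.lower(20));      // 10   (greatest element < 20)
        System.out.println(ts.floor(20));      // 20   (greatest element <= 20)
        System.out.println(ts.ceiling(25));    // 30   (least element >= 25)
        System.out.println(ts.higher(40));     // null (no element > 40)
        System.out.println(ts.headSet(30));    // [10, 20]  toElement excluded
        System.out.println(ts.tailSet(30));    // [30, 40]  fromElement included
        System.out.println(ts.subSet(10, 30)); // [10, 20]
        System.out.println(ts.pollFirst());    // 10, and 10 is removed from the set
        System.out.println(ts);                // [20, 30, 40]
    }
}
```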

As the source above shows, each TreeSet constructor creates a new TreeMap as the container that actually stores the elements of the Set. In other words, the storage underneath a TreeSet is a TreeMap.
TreeMap in turn stores each Entry of the Map in a sorted binary tree called a "red-black tree". Each Entry is a node of the red-black tree.
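
The delegation pattern can be reduced to a few lines. The following is a simplified sketch of the idea, not the JDK class: MiniTreeSet is a hypothetical name, and only a handful of methods are shown.

```java
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;

class MiniTreeSet<E> implements Iterable<E> {
    // Shared dummy value stored for every key.
    private static final Object PRESENT = new Object();

    // The elements of the set are the keys of this map.
    private final NavigableMap<E, Object> m = new TreeMap<>();

    // put() returns null only if the key was absent, so true means "newly added".
    boolean add(E e) { return m.put(e, PRESENT) == null; }

    // remove() returns the old value, which is PRESENT if the key existed.
    boolean remove(Object o) { return m.remove(o) == PRESENT; }

    boolean contains(Object o) { return m.containsKey(o); }

    int size() { return m.size(); }

    E first() { return m.firstKey(); }

    @Override
    public Iterator<E> iterator() { return m.navigableKeySet().iterator(); }

    public static void main(String[] args) {
        MiniTreeSet<String> set = new MiniTreeSet<>();
        System.out.println(set.add("b")); // true
        System.out.println(set.add("a")); // true
        System.out.println(set.add("b")); // false, "b" is already a key
        System.out.println(set.first());  // a
    }
}
```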

For the following code:

 

TreeMap<String,Double> map = new TreeMap<String,Double>();
map.put("ccc",89.0);
map.put("aaa",80.0);
map.put("bbb",89.0);
map.put("bbb",89.0);

As shown in the code above, the program calls put four times. The last call repeats the key "bbb", so the map ends up with three entries. The first call makes the Entry ("ccc", 89.0) the root node of the red-black tree, and put("aaa", 80.0) then adds a new node to the existing tree. From then on, every key-value pair put into the TreeMap is treated as a new Entry node and added to the existing red-black tree (rebalancing may later move a different node to the root). This keeps all keys in the TreeMap in sorted order.
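
The effect of these four calls can be checked with a short sketch. TreeMapPutDemo and its method names are made up for illustration.

```java
import java.util.TreeMap;

class TreeMapPutDemo {
    static TreeMap<String, Double> build() {
        TreeMap<String, Double> map = new TreeMap<>();
        map.put("ccc", 89.0);
        map.put("aaa", 80.0);
        map.put("bbb", 89.0);
        map.put("bbb", 89.0); // same key again: the value is replaced, no new node
        return map;
    }

    // put() returns the previous value for the key, or null if the key is new.
    static Double putExistingKey() {
        return build().put("bbb", 90.0);
    }

    public static void main(String[] args) {
        System.out.println(build());          // {aaa=80.0, bbb=89.0, ccc=89.0}
        System.out.println(build().size());   // 3
        System.out.println(putExistingKey()); // 89.0
    }
}
```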

Because TreeMap stores its entries in a red-black tree, adding and removing elements is slower than with HashMap. Adding an element requires a loop that walks down the tree to find the insertion position of the new Entry, and looking up an element requires a similar loop to find the matching Entry. Both take O(log n) time, whereas HashMap takes O(1) time on average. That does not make TreeMap useless: in exchange, all entries in a TreeMap are always kept sorted by key according to the specified ordering.
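
To make the cost of a lookup concrete, the sketch below counts how many key comparisons a single get() performs. The counting comparator and the name ComparisonCounter are assumptions of this example. It relies on the standard bound that a red-black tree with n nodes has a height of at most 2*log2(n+1).

```java
import java.util.Comparator;
import java.util.TreeMap;

class ComparisonCounter {
    // Builds a TreeMap with keys 0..n-1 and returns the number of comparisons
    // made by one lookup of the largest key.
    static int comparisonsForLookup(int n) {
        int[] counter = {0};
        Comparator<Integer> counting = (a, b) -> {
            counter[0]++;                 // count every comparison
            return Integer.compare(a, b);
        };
        TreeMap<Integer, String> map = new TreeMap<>(counting);
        for (int i = 0; i < n; i++) {
            map.put(i, "v" + i);
        }
        counter[0] = 0;                   // ignore comparisons made while inserting
        map.get(n - 1);
        return counter[0];
    }

    public static void main(String[] args) {
        // For 1000 keys the lookup needs at most about 2 * log2(1001), i.e. under 20 comparisons.
        System.out.println(comparisonsForLookup(1000));
    }
}
```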

Remarks: The red-black tree is a self-balancing binary search tree. The key of each node is greater than or equal to every key in its left subtree, and less than or equal to every key in its right subtree. This ordering, together with the balancing, lets the red-black tree find a given value quickly.

Now let's look at the put(K key, V value) method of TreeMap. This method inserts an Entry into the tree of the TreeMap and keeps the tree ordered. The source code is listed below:

 

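public V put(K key, V value) {
        // t starts at the root
        Entry<K,V> t = root;
        // t == null means the tree is empty
        if (t == null) {
            // Make the new key-value pair the root (the root has no parent, so null is passed as the parent)
            root = new Entry<K,V>(key, value, null);
            // The map now has size 1
            size = 1;
            // Increment the modification count
            modCount++;
            return null;
        }
        // Holds the result of the most recent comparison
        int cmp;
        Entry<K,V> parent;
        // Split the comparator path from the Comparable path
        Comparator<? super K> cpr = comparator;
        // A comparator was supplied, so custom ordering is used
        if (cpr != null) {
            // Walk down from the root to find where the new key-value pair belongs
            do {
                // Remember the node reached by the previous iteration;
                // it becomes the parent of the new node if the loop ends here
                parent = t;
                // Compare the key being inserted with the key of the current node
                cmp = cpr.compare(key, t.key);
                // The key being inserted is smaller: go left
                if (cmp < 0)
                    t = t.left;
                // The key being inserted is larger: go right
                else if (cmp > 0)
                    t = t.right;
                // The keys are equal: replace the value of t and return the old one (put ends here)
                else
                    return t.setValue(value);
            } while (t != null);
        }
        // No comparator was supplied
        else {
            // A null key throws NullPointerException
            if (key == null)
                throw new NullPointerException();
            Comparable<? super K> k = (Comparable<? super K>) key;
            // Same loop as in the if branch; only the way of comparing differs
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        // The code below runs only if no node with an equal key was found
        // Create the new node from the key-value pair and the parent that was found
        Entry<K,V> e = new Entry<K,V>(key, value, parent);
        // The last comparison decides whether the new node is the left or the right child of the parent
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        // Rebalance the tree after adding the new node
        fixAfterInsertion(e);
        // Update size and modCount
        size++;
        modCount++;
        // A new node was inserted, so there is no previous value to return
        return null;
    }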

The two do...while loops in the above program are the key algorithm behind the "sorted binary tree". Whenever the program adds a new node, it always starts the comparison from the root node of the tree, that is, the root node is taken as the current node.

  • If the new node is greater than the current node and the current node has a right child, the right child becomes the current node and the loop continues. If there is no right child, the loop ends and the new node becomes the right child.
  • If the new node is smaller than the current node and the current node has a left child, the left child becomes the current node and the loop continues. If there is no left child, the loop ends and the new node becomes the left child.
  • If the key of the new node is equal to the key of the current node, the new value replaces the value of the current node and the method returns.
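
The same descent can be sketched without the red-black rebalancing. The following is a plain, unbalanced binary search tree written for illustration only: SimpleBst is a hypothetical class, it fixes the key type to String, and it omits the fixAfterInsertion step that TreeMap performs.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleBst {
    private static final class Node {
        final String key;
        Double value;
        Node left, right;
        Node(String key, Double value) { this.key = key; this.value = value; }
    }

    private Node root;
    private int size;

    // Returns the previous value for the key, or null if the key was absent.
    Double put(String key, Double value) {
        if (root == null) {               // empty tree: the new node becomes the root
            root = new Node(key, value);
            size = 1;
            return null;
        }
        Node t = root;
        Node parent;
        int cmp;
        do {
            parent = t;
            cmp = key.compareTo(t.key);
            if (cmp < 0) {
                t = t.left;               // smaller: continue in the left subtree
            } else if (cmp > 0) {
                t = t.right;              // greater: continue in the right subtree
            } else {
                Double old = t.value;     // equal: replace the value and stop
                t.value = value;
                return old;
            }
        } while (t != null);
        Node e = new Node(key, value);
        if (cmp < 0) parent.left = e; else parent.right = e;
        size++;
        return null;
    }

    // Lookup follows the same left/right decisions as put().
    Double get(String key) {
        Node p = root;
        while (p != null) {
            int cmp = key.compareTo(p.key);
            if (cmp < 0) p = p.left;
            else if (cmp > 0) p = p.right;
            else return p.value;
        }
        return null;
    }

    int size() { return size; }

    // In-order traversal visits the keys in ascending order.
    List<String> keysInOrder() {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<String> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }
}
```

Because this sketch never rebalances, inserting keys in ascending order degrades it into a linked list; the red-black rules in TreeMap exist to prevent that.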

When TreeMap looks up a value by key, the corresponding method is as follows:

 

public V get(Object key) {
     // Find the Entry for the key
     Entry<K,V> p = getEntry(key);
     // Return the value held by the Entry
     return (p==null ? null : p.value);
 }

So the get(Object key) method is actually implemented by the getEntry() method. Now let's look at the source code of getEntry(Object key):

 

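final Entry<K,V> getEntry(Object key) {
    // If there is a comparator, return the result of getEntryUsingComparator(Object key)
    if (comparator != null)
        return getEntryUsingComparator(key);
    // A null key throws NullPointerException
    if (key == null)
        throw new NullPointerException();
    // There is no comparator, so the key must implement Comparable
    // Cast the key to Comparable
    Comparable<? super K> k = (Comparable<? super K>) key;
    // Start at the root
    Entry<K,V> p = root;
    // Walk down the tree from the root looking for the node
    while (p != null) {
        // Compare the key with the key of the current node
        int cmp = k.compareTo(p.key);
        // The key is smaller than the key of the current node
        if (cmp < 0)
            // Move p to the left child
            p = p.left;
        // The key is greater than the key of the current node
        else if (cmp > 0)
            // Move p to the right child
            p = p.right;
        // The keys are equal, so the current node is the one being looked for
        else
            // Return the node that was found
            return p;
    }
    // Not found: return null
    return null;
}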

The getEntry(Object key) method also makes full use of the properties of the sorted binary tree to search for the target Entry. The program again starts from the root node of the binary tree. If the searched key is greater than the current node, the program searches the "right subtree", and if it is smaller, it searches the "left subtree". If they are equal, the node has been found.

When the TreeMap uses custom ordering, that is, when a Comparator was supplied, it calls the getEntryUsingComparator(key) method to find the Entry for the key.

 

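final Entry<K,V> getEntryUsingComparator(Object key) {
    K k = (K) key;
    // Get the comparator
    Comparator<? super K> cpr = comparator;
    // The caller, getEntry(Object key), has already checked that the comparator is not null; this is a defensive check
    if (cpr != null) {
        // Start at the root
        Entry<K,V> p = root;
        // Walk down the tree
        while (p != null) {
            // Compare the key with the key of the current node
            int cmp = cpr.compare(k, p.key);
            // The searched key is smaller
            if (cmp < 0)
                // Move p to the left child
                p = p.left;
            // The searched key is greater
            else if (cmp > 0)
                // Move p to the right child
                p = p.right;
            // The keys are equal
            else
                // Return the node that was found
                return p;
        }
    }
    // No node matches the key: return null
    return null;
}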

In fact, getEntry() and getEntryUsingComparator() are almost identical. The former handles a naturally ordered TreeMap, and the latter handles a TreeMap ordered by a Comparator.
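
One visible consequence of the two paths is that key equality is decided by whichever comparison is in use, not by equals(). The sketch below uses String.CASE_INSENSITIVE_ORDER from the JDK as the Comparator; LookupPaths and its method names are illustrative.

```java
import java.util.TreeMap;

class LookupPaths {
    // Natural ordering: "APPLE" and "apple" compare as different keys.
    static Integer naturalLookup() {
        TreeMap<String, Integer> m = new TreeMap<>();
        m.put("apple", 1);
        return m.get("APPLE");
    }

    // Comparator ordering: the comparator reports the two keys as equal.
    static Integer comparatorLookup() {
        TreeMap<String, Integer> m = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        m.put("apple", 1);
        return m.get("APPLE");
    }

    public static void main(String[] args) {
        System.out.println(naturalLookup());    // null
        System.out.println(comparatorLookup()); // 1
    }
}
```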

As the source code above shows, the core of the TreeMap implementation is straightforward. In other words, TreeMap is essentially a "red-black tree", and each Entry is a node.

Summary

1. No duplicate elements;
2. Elements are kept sorted;
3. Elements in a TreeSet must implement the Comparable interface and override the compareTo() method, unless a Comparator is supplied to the TreeSet. The TreeSet uses this comparison both to decide whether an element is a duplicate and to determine the order of the elements;
① Classes from the Java class library such as String and Integer can be stored directly, because they already implement the Comparable interface;
② For a custom class that neither implements Comparable nor is paired with a Comparator, the TreeSet cannot compare instances and therefore cannot order them or detect duplicates; at most a single instance can be stored before add() throws ClassCastException.
4. Relies on TreeMap.
5. Compared with HashSet, TreeSet has the advantage of keeping its elements sorted, but the disadvantage that adding and looking up elements is slower (O(log n) instead of O(1) on average). Choose between them according to the scenario.
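
Point 3 has a practical consequence: a TreeSet decides duplicates by compareTo() (or the Comparator), while a HashSet uses equals() and hashCode(). The sketch below illustrates the difference with BigDecimal, whose equals() takes the scale into account while compareTo() does not; DuplicateRules is a name chosen for this example.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class DuplicateRules {
    // 1.0 and 1.00 are numerically equal but have different scales.
    static final List<BigDecimal> VALUES =
            Arrays.asList(new BigDecimal("1.0"), new BigDecimal("1.00"));

    // HashSet: equals() says the two values differ, so both are kept.
    static int hashSetSize() {
        return new HashSet<BigDecimal>(VALUES).size();
    }

    // TreeSet: compareTo() returns 0, so the second value is a duplicate.
    static int treeSetSize() {
        return new TreeSet<BigDecimal>(VALUES).size();
    }

    public static void main(String[] args) {
        System.out.println(hashSetSize()); // 2
        System.out.println(treeSetSize()); // 1
    }
}
```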



Author: SnowDragonYY
Link: https://www.jianshu.com/p/12f4dbdbc652
Source: Jianshu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization, and for non-commercial reprints, please indicate the source.


Origin blog.csdn.net/xiaokanfuchen86/article/details/113428007