How TreeSet and HashSet Prevent Duplicate Elements (Source Code Analysis)

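First, let's compare List and Set. A List is like a wardrobe: the clothes hang in a fixed order, and because the owner likes shopping there may well be several identical pieces. So a List is ordered, meaning that elements come back in the order they were inserted, and it may contain duplicates. A Set is more like a basket of eggs: no two eggs in the basket are the same egg, and they are not laid out in any particular arrangement. So a Set contains no duplicate elements and, in general, makes no promise about insertion order. (HashSet iterates in no specified order; TreeSet iterates in sorted order rather than insertion order.)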
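The difference is easy to see by feeding the same input to each collection. The following is a minimal sketch; the class name ListVsSetDemo and its helper methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ListVsSetDemo {
    // A List keeps insertion order and every duplicate.
    static List<Integer> toList(Integer... values) {
        return new ArrayList<>(Arrays.asList(values));
    }

    // A HashSet drops duplicates; its iteration order is unspecified.
    static Set<Integer> toHashSet(Integer... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // A TreeSet drops duplicates and iterates in ascending order.
    static Set<Integer> toTreeSet(Integer... values) {
        return new TreeSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Integer[] input = {20, 18, 23, 18, 20};
        System.out.println(toList(input));           // [20, 18, 23, 18, 20]
        System.out.println(toHashSet(input).size()); // 3
        System.out.println(toTreeSet(input));        // [18, 20, 23]
    }
}
```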

Now let's look at how a Set guarantees that it holds no duplicate elements, taking TreeSet as the example.

1 As background, start with the TreeSet constructor

    public TreeSet() {
        this(new TreeMap<E,Object>());
    }

As you can see, the set is backed by a TreeMap.

2 Next, look at TreeSet's add method

    public boolean add(E e) {
        return m.put(e, PRESENT)==null;
    }
Looking up PRESENT, we find this:

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

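The constructor and the add method above confirm that the Set is really a Map underneath.
A Map stores key-value pairs, and add puts the element in the key position.
The value is not a boolean and plays no part in detecting duplicates: it is always the same shared placeholder object, PRESENT.
The duplicate check comes from the return value of put, which is null when the key was not in the map before and the previous value (PRESENT) when it was. Hence add returns true for a new element and false for a duplicate.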
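The role of put's return value can be reproduced outside the JDK. This is a small sketch, not the JDK source: PresentDemo and its own PRESENT field are stand-ins written for illustration, since the real field is private to TreeSet.

```java
import java.util.TreeMap;

class PresentDemo {
    // Stand-in for TreeSet's private PRESENT placeholder (our own object).
    static final Object PRESENT = new Object();

    // Mirrors the shape of TreeSet.add: true only if the key had no previous mapping.
    static <E> boolean add(TreeMap<E, Object> m, E e) {
        return m.put(e, PRESENT) == null;
    }

    public static void main(String[] args) {
        TreeMap<String, Object> m = new TreeMap<>();
        System.out.println(add(m, "a")); // true: put returned null
        System.out.println(add(m, "a")); // false: put returned the old PRESENT
        System.out.println(m.size());    // 1
    }
}
```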

Now that we know the element is stored as a TreeMap key, the question becomes how TreeMap keeps its keys unique. Here is TreeMap's put method:

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     *
     * @return the previous value associated with {@code key}, or
     *         {@code null} if there was no mapping for {@code key}.
     *         (A {@code null} return can also indicate that the map
     *         previously associated {@code null} with {@code key}.)
     * @throws ClassCastException if the specified key cannot be compared
     *         with the keys currently in the map
     * @throws NullPointerException if the specified key is null
     *         and this map uses natural ordering, or its comparator
     *         does not permit null keys
     */


    public V put(K key, V value) {
        Entry<K,V> t = root;
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)
                throw new NullPointerException();
            Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }

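Don't panic; let me walk through this method. TreeMap is backed by a red-black tree, which is a self-balancing binary search tree. Start with the first if block (t == null), which creates the root node. Here is how root is declared: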

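    private transient Entry<K,V> root = null;

root is the root node of the tree. When the map is empty, that is, before any element has been added, there is nothing to compare against, so the new entry simply becomes the root. (The call compare(key, key) in that branch is there only as a type and null check.) Otherwise the key has to be compared with existing keys, and there are two ways to do that: (1) with a custom comparator, passed in when the TreeSet is created, or (2) with natural ordering, which requires the elements to implement Comparable. Assume we are using natural ordering, which is the else branch that casts the key to Comparable and calls k.compareTo(t.key).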
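To isolate the search-and-insert logic from the rest of put, here is a stripped-down sketch of the natural-ordering path. SimpleTreeSet is a hypothetical class written for this article; it is a plain, unbalanced binary search tree and deliberately omits the red-black rebalancing that fixAfterInsertion performs in the real TreeMap.

```java
class SimpleTreeSet<E extends Comparable<E>> {
    private static final class Node<E> {
        final E key;
        Node<E> left, right;
        Node(E key) { this.key = key; }
    }

    private Node<E> root;
    private int size;

    // Returns true if the key was inserted, false if an equal key already exists.
    boolean add(E key) {
        if (key == null) throw new NullPointerException();
        if (root == null) {            // empty tree: the key becomes the root
            root = new Node<>(key);
            size = 1;
            return true;
        }
        Node<E> t = root;
        Node<E> parent;
        int cmp;
        do {
            parent = t;
            cmp = key.compareTo(t.key);
            if (cmp < 0) t = t.left;        // smaller: go left
            else if (cmp > 0) t = t.right;  // larger: go right
            else return false;              // equal: duplicate, insert nothing
        } while (t != null);
        Node<E> e = new Node<>(key);        // attach at the empty slot we reached
        if (cmp < 0) parent.left = e;
        else parent.right = e;
        size++;
        return true;
    }

    int size() { return size; }

    public static void main(String[] args) {
        SimpleTreeSet<Integer> s = new SimpleTreeSet<>();
        System.out.println(s.add(20)); // true
        System.out.println(s.add(18)); // true
        System.out.println(s.add(20)); // false
        System.out.println(s.size());  // 2
    }
}
```

Without rebalancing, inserting already-sorted input degrades this tree into a linked list, which is exactly the case the red-black fix-up in TreeMap exists to prevent.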

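Suppose we add some numbers. When the first number, 20, is added there is no root yet, so it is stored directly as the root. Next we add 18. To see what happens then, look at Integer's compareTo method: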

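compareTo

public int compareTo(Integer anotherInteger)

Compares two Integer objects numerically.

Specified by:
    compareTo in interface Comparable<Integer>

Parameters:
    anotherInteger - the Integer to be compared.

Returns:
    the value 0 if this Integer is equal to the argument Integer; a value less than 0 if this Integer is numerically less than the argument Integer; and a value greater than 0 if this Integer is numerically greater than the argument Integer (signed comparison).

Since:
    1.2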

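So if the new element is smaller than the current node, compareTo returns a negative number and the search moves into the left subtree; if it is larger, the result is positive and the search moves right. This repeats until an empty slot is reached, where the new entry is attached as the left or right child of the last node visited. If compareTo returns 0, the two Integer objects are equal, and TreeMap.put executes return t.setValue(value); instead. No node is inserted: the existing node stays in place, its value is overwritten with the same PRESENT object, and the old value is returned. Because that return value is not null, TreeSet.add returns false. Iteration then visits the nodes in order (left subtree, node, right subtree), which is why a TreeSet yields its elements in ascending order. For more detail on how entries are stored and retrieved, see http://shmilyaw-hotmail-com.iteye.com/blog/1836431 and https://www.ibm.com/developerworks/cn/java/j-lo-tree/index.html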
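The behaviour described above can be observed directly on a real TreeSet. In this short sketch, TreeSetIntegerDemo and its addAll helper are illustrative names, not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TreeSetIntegerDemo {
    // Adds each value in turn and records what add(...) returned for it.
    static List<Boolean> addAll(TreeSet<Integer> set, int... values) {
        List<Boolean> results = new ArrayList<>();
        for (int v : values) {
            results.add(set.add(v));
        }
        return results;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        // 20 becomes the root, 18 goes left, 23 goes right, the second 18 is rejected.
        System.out.println(addAll(set, 20, 18, 23, 18)); // [true, true, true, false]
        System.out.println(set);                         // [18, 20, 23]
    }
}
```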

Next, let's see how a custom comparator is used to decide whether two elements are duplicates.

TreeSet<Student> ts = new TreeSet<Student>(new Comparator<Student>() {
            @Override
            public int compare(Student s1, Student s2) {
                // Primary key: length of the name
                int num = s1.getName().length() - s2.getName().length();
                // Secondary key: the name itself
                int num2 = num == 0 ? s1.getName().compareTo(s2.getName())
                        : num;
                // Tertiary key: age (Integer.compare avoids overflow from subtraction)
                int num3 = num2 == 0 ? Integer.compare(s1.getAge(), s2.getAge()) : num2;
                return num3;
            }
        });

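The sign of the returned value decides the ordering, and a return value of 0 means the TreeSet treats the two students as the same element and keeps only the first one.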
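To make the "0 means duplicate" rule concrete, here is a self-contained sketch. The article does not show the Student class, so the one below is a hypothetical minimal version; the comparator applies the same three keys (name length, name, age) using the Comparator factory methods available since Java 8.

```java
import java.util.Comparator;
import java.util.TreeSet;

class StudentComparatorDemo {
    // Hypothetical Student: it does not override equals or hashCode.
    static final class Student {
        private final String name;
        private final int age;
        Student(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Same ordering as the anonymous comparator: name length, then name, then age.
    static final Comparator<Student> BY_LENGTH_NAME_AGE =
            Comparator.comparingInt((Student s) -> s.getName().length())
                      .thenComparing(Student::getName)
                      .thenComparingInt(Student::getAge);

    // Number of students the TreeSet considers distinct under that comparator.
    static int distinctCount(Student... students) {
        TreeSet<Student> ts = new TreeSet<>(BY_LENGTH_NAME_AGE);
        for (Student s : students) {
            ts.add(s);
        }
        return ts.size();
    }

    public static void main(String[] args) {
        // Two separate objects with equal name and age compare to 0, so one is dropped.
        System.out.println(distinctCount(
                new Student("Tom", 20), new Student("Tom", 20), new Student("Tim", 20))); // 2
    }
}
```

Note that TreeSet never calls equals here: uniqueness is decided entirely by the comparator, so a comparator that looked only at name length would also collapse "Tom" and "Tim" into one element.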

Next, let's see how HashSet decides whether two objects are the same, again by reading the source.

1 First, the HashSet constructor

    public HashSet() {
        map = new HashMap<>();
    }

2 Then look at HashSet's add method

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

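3 In the same way, look at HashMap's put method. The listing below is the JDK 7 version; JDK 8 and later restructure this code (put delegates to putVal), but the duplicate check follows the same idea.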

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

Now look at the hash method:

    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    transient int hashSeed = 0;

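put first computes a hash for the key with hash(key), then uses that hash to pick a bucket in the table. The arguments to indexFor are the computed hash and the length of the table.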

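int i = indexFor(hash, table.length);
This line does not search the table for the hash value; it maps the hash to the index of the bucket in which the key belongs.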
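The article does not list indexFor itself. In JDK 7 it is a single bit mask, which works because the table length is always a power of two. The sketch below reproduces that computation in a standalone class (IndexForDemo is an illustrative name) to show that different hashes can land in the same bucket.

```java
class IndexForDemo {
    // Same computation as JDK 7's HashMap.indexFor; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(17, 16)); // 1
        System.out.println(indexFor(33, 16)); // 1, the same bucket as 17
        System.out.println(indexFor(18, 16)); // 2
    }
}
```

Since several keys can share one bucket, landing in an occupied bucket is not enough to call a key a duplicate; the entries in that bucket still have to be compared one by one.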

Then:

        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

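The for loop walks the chain of entries stored in bucket i, not the whole table, and compares each entry with the new key. If the hash values differ, the && short-circuits and the entry is skipped. Otherwise the keys are compared by reference (==) and then with equals. If the hashes match and either comparison succeeds, the two keys are treated as the same element: the existing entry's value is overwritten, the old value is returned, and HashSet.add returns false. This is why a class stored in a HashSet must override hashCode and equals consistently.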
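The effect of the hash-then-equals check shows up as soon as a class does or does not override those two methods. In this sketch, Point and RawPoint are hypothetical classes written for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashSetEqualityDemo {
    // Overrides equals and hashCode, so equal coordinates mean equal elements.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Inherits Object's identity-based equals and hashCode.
    static final class RawPoint {
        final int x, y;
        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    static int sizeWithOverrides() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // same hash and equals(...) is true: rejected
        return set.size();
    }

    static int sizeWithoutOverrides() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2)); // a different object, so equals(...) is false: kept
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithOverrides());    // 1
        System.out.println(sizeWithoutOverrides()); // 2
    }
}
```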
