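Java Core: Chapter 1, Collections

Collection classes are used constantly in Java. This chapter covers their concepts, usage, implementations, and the differences between them.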

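I. Overview

The Java collections framework has two main parts: Collection and Map.

(1) Collection

Concept: a Collection represents a group of objects, which are called the elements of the Collection. Its two main sub-interfaces are Set and List.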

[Figure: class relationship diagram]

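Set: elements cannot be duplicated and, in general, are not kept in insertion order (in a HashSet, an element's position is determined by its hash code rather than by when it was added). The three main implementations are HashSet, LinkedHashSet, and TreeSet.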

[Figure: class diagram of the Set-related classes]

List: elements are kept in insertion order and may be duplicated. The main implementations are ArrayList, Vector, and LinkedList.

[Figure: class diagram of the List-related classes]

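(2) Map

Concept: the Map interface stores mappings from keys to values, and a value is retrieved through its key. The main implementations include HashMap, LinkedHashMap, and Hashtable.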

[Figure: class diagram of the Map-related classes]

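II. Implementation Classes: Details and Usage

Map:

(1) HashMap

Features:

1. HashMap is the most commonly used Map. It stores data as key-value pairs and places each entry according to the hash code of its key.

2. Iteration does not follow insertion order; entries are returned in the order of the bucket indexes derived from their hash codes.

3. HashMap allows at most one null key and any number of null values.

4. HashMap is not thread-safe. If synchronization is needed, consider Hashtable, ConcurrentHashMap, or Collections.synchronizedMap.

5. HashMap extends AbstractMap and implements the Map, Cloneable, and java.io.Serializable interfaces.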
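The null-handling rule in point 3 can be checked with a small sketch. The class name HashMapNullDemo and its build helper are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapNullDemo {
    // Writes the null key twice and stores two null values under distinct keys.
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");
        map.put(null, "second"); // same key: overwrites the value, adds no entry
        map.put("a", null);
        map.put("b", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.size());           // 3
        System.out.println(map.get(null));        // second
        System.out.println(map.containsKey("a")); // true, even though the value is null
    }
}
```

Because get returns null both for a missing key and for a key mapped to null, containsKey is the reliable way to tell the two cases apart.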

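HashMap data structure:

A HashMap consists mainly of an array and linked lists. The array is indexed by a value derived from the key's hash code, and each array slot (bucket) holds a linked list of the entries that map to that index.

The entry structure is essentially as follows (this is the layout of older JDKs such as JDK 6/7; since JDK 8 the class is called Node and long buckets may be converted to red-black trees):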

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;
}

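An important HashMap parameter: the load factor

Concept: the load factor (loadFactor) measures how full the hash table is allowed to become.

Formula: threshold = (int)(capacity * loadFactor);

The default in HashMap is 0.75. A larger value means better space utilization; with a value of 2, for example, each bucket would ideally hold 2 elements. The cost is slower lookups, because a lookup has to walk the linked list in that bucket. A smaller value means lower space utilization but faster lookups.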
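As a quick worked example of the formula above, the sketch below computes the threshold for the default capacity of 16. The class LoadFactorDemo and its method are illustrative only and are not part of the JDK.

```java
class LoadFactorDemo {
    // Mirrors the formula in the text: threshold = (int)(capacity * loadFactor).
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12: the 13th entry triggers a resize
        System.out.println(threshold(16, 2.0f));  // 32: about two entries per bucket before resizing
    }
}
```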

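HashMap resizing

As more data is added, the chance of collisions at each bucket index grows, so once the number of entries exceeds threshold the table is resized: a new table twice the size of the old one is created, and the existing entries are redistributed into it by recomputing their bucket indexes. This is expensive, so when the amount of data is known in advance, give the HashMap a suitable initial capacity.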
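One way to follow that advice is sketched below, assuming the default load factor of 0.75. The helper capacityFor is hypothetical; it just inverts the threshold formula so that the expected number of entries fits without a resize.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    // Hypothetical helper: smallest capacity whose threshold covers expectedSize entries.
    static int capacityFor(int expectedSize, float loadFactor) {
        return (int) Math.ceil(expectedSize / (double) loadFactor);
    }

    // Fills a pre-sized map; HashMap rounds the requested capacity up to a power of two.
    static Map<Integer, Integer> fill(int n) {
        Map<Integer, Integer> map = new HashMap<>(capacityFor(n, 0.75f));
        for (int i = 0; i < n; i++) {
            map.put(i, i * i);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(1000, 0.75f)); // 1334
        System.out.println(fill(1000).size());        // 1000
    }
}
```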

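Fail-Fast mechanism

Concept: java.util.HashMap is not thread-safe. If the map is structurally modified while an iterator is in use, whether by another thread or by the same thread calling put or remove on the map instead of the iterator's own remove, the iterator throws ConcurrentModificationException. The check is best-effort, so it should be used to detect bugs rather than relied on for correctness.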
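The behaviour can be reproduced in a single thread. The following is a minimal sketch with made-up names (FailFastDemo and its methods): the first method removes through the map during iteration, the second removes through the iterator.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<Integer, String> sample() {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 1; i <= 4; i++) {
            map.put(i, "v" + i);
        }
        return map;
    }

    // Removes through the map while iterating; the next iterator step detects the change.
    static boolean unsafeRemoveThrows() {
        Map<Integer, String> map = sample();
        try {
            for (Integer key : map.keySet()) {
                map.remove(key); // structural modification behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even keys through the iterator, which keeps its modification count in sync.
    static Map<Integer, String> safeRemoveEven() {
        Map<Integer, String> map = sample();
        Iterator<Map.Entry<Integer, String>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getKey() % 2 == 0) {
                it.remove();
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(unsafeRemoveThrows());      // true
        System.out.println(safeRemoveEven().keySet()); // [1, 3]
    }
}
```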

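Traversal

(1) Iterating over entries

Map<Integer, Integer> map = new HashMap<Integer, Integer>();
Iterator<Map.Entry<Integer, Integer>> iter = map.entrySet().iterator();
while (iter.hasNext()) {
    // Each entry carries both key and value, so no extra lookup is needed
    Map.Entry<Integer, Integer> entry = iter.next();
    Integer key = entry.getKey();
    Integer val = entry.getValue();
}

(2) Iterating over keys

Map<Integer, Integer> map = new HashMap<Integer, Integer>();
Iterator<Integer> iter = map.keySet().iterator();
while (iter.hasNext()) {
    Integer key = iter.next();
    // One additional hash lookup per key
    Integer val = map.get(key);
}

Prefer the entry-based traversal: it is more efficient because it avoids the extra get call for every key.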

Sorting:

Build the sample data with the following code:

Map<Integer, Integer> map = new HashMap<Integer, Integer>();
map.put(1, 3);
map.put(10, 7);
map.put(4, 9);
map.put(20, 6);
Set<Map.Entry<Integer, Integer>> set = map.entrySet();
System.out.println("Default order:");
for (Map.Entry<Integer, Integer> entry : set) {
    System.out.println("key:" + entry.getKey() + ",value:" + entry.getValue());
}

1. Sorting by key, using TreeMap

// Sort by key, using TreeMap
Map<Integer, Integer> keyMap = new TreeMap<Integer, Integer>(new Comparator<Integer>() {
    public int compare(Integer o1, Integer o2) {
        // Descending order
        return o2.compareTo(o1);
    }
});
keyMap.putAll(map);
System.out.println("TreeMap, sorted by key in descending order:");
for (Map.Entry<Integer, Integer> entry : keyMap.entrySet()) {
    System.out.println("key:" + entry.getKey() + ",value:" + entry.getValue());
}

TreeMap, sorted by key in descending order:
key:20,value:6
key:10,value:7
key:4,value:9
key:1,value:3

2. Sorting by key, using ArrayList

// Sort by key, using ArrayList
List<Map.Entry<Integer, Integer>> list = new ArrayList<Map.Entry<Integer, Integer>>(map.entrySet());
Collections.sort(list, new Comparator<Map.Entry<Integer, Integer>>() {
    public int compare(Map.Entry<Integer, Integer> o1, Map.Entry<Integer, Integer> o2) {
        return o1.getKey().compareTo(o2.getKey());
    }
});
System.out.println("ArrayList, sorted by key in ascending order:");
for (Map.Entry<Integer, Integer> entry : list) {
    System.out.println("key:" + entry.getKey() + ",value:" + entry.getValue());
}

ArrayList, sorted by key in ascending order:
key:1,value:3
key:4,value:9
key:10,value:7
key:20,value:6

3. Sorting by value:

// Sort by value, descending
List<Map.Entry<Integer, Integer>> listValue = new ArrayList<Map.Entry<Integer, Integer>>(map.entrySet());
Collections.sort(listValue, new Comparator<Map.Entry<Integer, Integer>>() {
    public int compare(Map.Entry<Integer, Integer> o1, Map.Entry<Integer, Integer> o2) {
        return o2.getValue().compareTo(o1.getValue());
    }
});
System.out.println("ArrayList, sorted by value in descending order:");
for (Map.Entry<Integer, Integer> entry : listValue) {
    System.out.println("key:" + entry.getKey() + ",value:" + entry.getValue());
}

ArrayList, sorted by value in descending order:
key:4,value:9
key:10,value:7
key:20,value:6
key:1,value:3

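(2) LinkedHashMap

Features:

1. Iteration follows either insertion order or access order.

2. LinkedHashMap allows at most one null key and any number of null values.

3. LinkedHashMap is not thread-safe.

4. LinkedHashMap extends HashMap, so it has all of HashMap's characteristics.

LinkedHashMap data structure:

On top of HashMap, LinkedHashMap adds a doubly linked list that records the order of the entries.

Its entry structure is as follows: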

private static class Entry<K,V> extends HashMap.Entry<K,V> {
    // These fields comprise the doubly linked list used for iteration.
    Entry<K,V> before, after;
}

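Understanding accessOrder:

The accessOrder parameter is a boolean passed to the LinkedHashMap constructor. It controls what happens to an existing entry's position when the entry is accessed, for example when put is called with a key that is already present, or when get is called. With false, the entry keeps its original position (insertion order). With true, the entry is unlinked from its old position and relinked at the end of the doubly linked list, so iteration runs from the least recently accessed entry to the most recently accessed one.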

With false, iteration follows insertion order, for example:

Map<Integer, Integer> linkedHashMap = new LinkedHashMap<Integer, Integer>(16, 0.75f, false);
linkedHashMap.put(1, 10);
linkedHashMap.put(2, 10);
linkedHashMap.put(3, 10);
linkedHashMap.put(4, 10);
linkedHashMap.put(1, 100);
Iterator<Map.Entry<Integer, Integer>> iterator = linkedHashMap.entrySet().iterator();
System.out.println("accessOrder=false, insertion order:");
while (iterator.hasNext()) {
    Map.Entry<Integer, Integer> next = iterator.next();
    System.out.println("key:" + next.getKey() + ",value:" + next.getValue());
}

accessOrder=false, insertion order:
key:1,value:100
key:2,value:10
key:3,value:10
key:4,value:10

With true, iteration follows access order:

Map<Integer, Integer> linkedHashMapTrue = new LinkedHashMap<Integer, Integer>(16, 0.75f, true);
linkedHashMapTrue.put(1, 10);
linkedHashMapTrue.put(2, 10);
linkedHashMapTrue.put(3, 10);
linkedHashMapTrue.put(4, 10);
linkedHashMapTrue.put(1, 100);
Iterator<Map.Entry<Integer, Integer>> iteratorTrue = linkedHashMapTrue.entrySet().iterator();
System.out.println("accessOrder=true, access order:");
while (iteratorTrue.hasNext()) {
    Map.Entry<Integer, Integer> next = iteratorTrue.next();
    System.out.println("key:" + next.getKey() + ",value:" + next.getValue());
}

accessOrder=true, access order:
key:2,value:10
key:3,value:10
key:4,value:10
key:1,value:100
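A common use of access order is a small LRU cache. The sketch below is one possible way to build it, not something taken from the text above: the class LruCache and its maxEntries field are made up for illustration, while removeEldestEntry is the hook that LinkedHashMap itself provides for evicting the oldest entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // illustrative capacity limit

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

Like LinkedHashMap itself, this cache is not thread-safe; note that in access-order mode even get modifies the internal list.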

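(3) ConcurrentHashMap

Features:

1. Neither keys nor values may be null.

2. Thread-safe. In JDK 7 and earlier it uses segment locks (lock striping), roughly as if it contained several Hashtables internally, which improves concurrent access performance. Since JDK 8 the segments have been replaced by CAS operations and per-bucket synchronization.

3. Iteration does not follow insertion order; entries are returned in the order of the bucket indexes derived from their hash codes.

ConcurrentHashMap data structure:

It consists mainly of arrays and linked lists. The entry structure of the segment-based implementation is as follows: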

static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;
    volatile HashEntry<K,V> next;
}
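To illustrate the thread-safety claim, here is a small sketch in which several threads increment one counter. It assumes Java 8 or later (for merge and lambdas); the class ConcurrentCountDemo and the key "hits" are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentCountDemo {
    // Each worker increments the same key; merge performs the read-modify-write atomically.
    static int countWithThreads(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000)); // 40000
    }
}
```

A separate get followed by put would not be atomic even on a ConcurrentHashMap, which is why the sketch uses merge for the increment.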

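(4) Hashtable

1. Thread-safe. Synchronization is implemented with the synchronized keyword, so only one thread can operate on a Hashtable at any given time.

2. Neither keys nor values may be null.

3. Iteration does not follow insertion order; entries are returned in the order of the bucket indexes derived from their hash codes.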

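(5) WeakHashMap

1. Not thread-safe.

2. Both keys and values may be null.

3. Iteration does not follow insertion order; entries are returned in the order of the bucket indexes derived from their hash codes.

4. WeakHashMap holds its keys through weak references, implemented with WeakReference and ReferenceQueue. Once a key is no longer strongly referenced anywhere else, its entry becomes eligible for removal at the next garbage collection (not only when memory is low), so it can be used as a simple cache.

An example of entries being reclaimed:

WeakHashMap<Integer, Integer> weakHashMap = new WeakHashMap<Integer, Integer>();
// new Integer(...) (deprecated since Java 9) is deliberate: each key is a fresh object
// that nothing else references strongly
weakHashMap.put(new Integer(1000), 1);
weakHashMap.put(new Integer(10001), 2);
System.out.println(weakHashMap.size()); // normally 2
System.gc(); // only a request; a collection is not guaranteed
System.runFinalization();
System.out.println(weakHashMap.size()); // typically 0 once the keys have been collected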

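(6) IdentityHashMap

1. Not thread-safe.

2. Both keys and values may be null.

3. Iteration does not follow insertion order; entries are placed according to the identity hash code of the key (System.identityHashCode).

4. Key1 and Key2 are considered equal only when they are strictly identical (Key1 == Key2). In ordinary maps, two keys are considered equal when equals returns true and their hash codes match.

Strict-equality example (the two new Integer(1) keys are distinct objects, so the map ends up with two entries):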

Map<Integer, Integer> map = new IdentityHashMap<Integer, Integer>();
map.put(new Integer(1), 1000);
map.put(new Integer(1), 1000);
Iterator<Map.Entry<Integer, Integer>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
    Map.Entry<Integer, Integer> next = iterator.next();
    System.out.println("key:" + next.getKey() + ",value:" + next.getValue());
}

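(7) TreeMap

1. Null keys are not allowed when natural ordering is used (a custom Comparator may choose to accept them); null values are allowed.

2. Not thread-safe.

3. Entries are not kept in insertion order; they are sorted by the keys' natural ordering or by a custom Comparator.

TreeMap data structure:

Internally TreeMap uses a red-black tree, a kind of balanced binary search tree. Insertions and deletions rebalance the tree through left rotations, right rotations, and color flips, and lookups run in O(log n) time. The main properties of a red-black tree are:

1. Every node is either red or black.

2. The root node is black.

3. Every leaf (NIL node, i.e. empty node) is black.

4. If a node is red, both of its children are black. In other words, no path contains two adjacent red nodes.

5. Every path from a given node to any of its descendant leaves contains the same number of black nodes.

6. Every node's key is greater than or equal to the keys in its left subtree and less than or equal to the keys in its right subtree.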

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left = null;
    Entry<K,V> right = null;
    Entry<K,V> parent;
    boolean color = BLACK;
}
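Because the keys are kept sorted, TreeMap also supports range and nearest-key queries. The following sketch shows a few of them; the class TreeMapDemo and the sample data are illustrative only.

```java
import java.util.TreeMap;

class TreeMapDemo {
    // Keys are inserted out of order on purpose; the tree keeps them sorted.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(20, "twenty");
        map.put(1, "one");
        map.put(10, "ten");
        map.put(4, "four");
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.keySet());       // [1, 4, 10, 20]
        System.out.println(map.firstKey());     // 1
        System.out.println(map.floorKey(9));    // 4: greatest key <= 9
        System.out.println(map.ceilingKey(11)); // 20: smallest key >= 11
        System.out.println(map.headMap(10));    // {1=one, 4=four}: keys strictly below 10
    }
}
```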

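Set:

(1) HashSet

1. Elements may be null.

2. Not thread-safe.

3. Unordered.

4. Elements cannot be duplicated.

HashSet data structure:

HashSet is built on HashMap: its constructor creates a HashMap, only the keys are used, and every value is the same internal placeholder Object.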
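The duplicate and null rules can be seen in a short sketch. The class HashSetDemo and its helper methods are hypothetical names used only for this example.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetDemo {
    // Adds "a" twice and null twice; duplicates are silently ignored.
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("a");
        set.add(null);
        set.add(null);
        return set;
    }

    // add returns false when the element is already present.
    static boolean secondAddSucceeds() {
        Set<String> set = new HashSet<>();
        set.add("a");
        return set.add("a");
    }

    public static void main(String[] args) {
        System.out.println(build().size());      // 2
        System.out.println(secondAddSucceeds()); // false
    }
}
```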

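(2) LinkedHashSet

1. Ordered by insertion order (unlike LinkedHashMap, it has no access-order mode).

2. Not thread-safe.

3. Elements may be null.

4. Elements cannot be duplicated.

LinkedHashSet data structure:

LinkedHashSet is built on LinkedHashMap: it keeps a LinkedHashMap internally and therefore has similar properties.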

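(3) TreeSet

1. Null elements are not allowed when natural ordering is used.

2. Not thread-safe.

3. Elements are not kept in insertion order; they are sorted by natural ordering or by a custom Comparator.

4. Elements cannot be duplicated.

TreeSet data structure:

TreeSet is built on TreeMap: it keeps a TreeMap internally and therefore has similar properties.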
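The two orderings can be compared side by side. This is a minimal sketch with made-up names (TreeSetDemo, ascending, descending); the Comparator-based constructor and Comparator.reverseOrder are standard JDK APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetDemo {
    // Natural ordering: duplicates are dropped and elements come out ascending.
    static List<Integer> ascending(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    // Custom ordering supplied through a Comparator.
    static List<Integer> descending(List<Integer> input) {
        TreeSet<Integer> set = new TreeSet<>(Comparator.<Integer>reverseOrder());
        set.addAll(input);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(10, 1, 20, 4, 10);
        System.out.println(ascending(input));  // [1, 4, 10, 20]
        System.out.println(descending(input)); // [20, 10, 4, 1]
    }
}
```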

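List:

(1) ArrayList

1. Elements may be null.

2. Not thread-safe.

3. Ordered by insertion order.

4. Elements may be duplicated.

5. get, set, and size run in constant time. Appending with add is amortized constant time, while inserting or removing at an arbitrary index is O(n) because the following elements must be shifted. In other words, reads and writes by index are fast but insertions and removals are slower, especially when a resize is triggered, since the backing array has to be copied. Each resize grows the array by 50% of its current length.

ArrayList internal data structure: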

private transient Object[] elementData;

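(2) LinkedList

1. Elements may be null.

2. Not thread-safe.

3. Ordered by insertion order.

4. Elements may be duplicated.

5. get and set by index are O(n). add and remove at either end are O(1); at an arbitrary index the list must first be traversed to reach the position. Internally it is a doubly linked list, so insertion and removal at the two ends are fast while reads are slow. It can be used to implement queues and stacks.

LinkedList internal data structure: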

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;
}
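Since LinkedList implements the Deque interface, the queue and stack usage mentioned above can be sketched as follows. The class LinkedListDemo and its two methods are illustrative names.

```java
import java.util.Deque;
import java.util.LinkedList;

class LinkedListDemo {
    // Queue (FIFO): add at the tail, remove from the head.
    static String fifo() {
        Deque<String> queue = new LinkedList<>();
        queue.addLast("a");
        queue.addLast("b");
        queue.addLast("c");
        return queue.removeFirst() + queue.removeFirst() + queue.removeFirst();
    }

    // Stack (LIFO): push and pop both work on the head of the list.
    static String lifo() {
        Deque<String> stack = new LinkedList<>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        return stack.pop() + stack.pop() + stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(fifo()); // abc
        System.out.println(lifo()); // cba
    }
}
```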

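(3) Vector

1. Elements may be null.

2. Thread-safe: its methods are synchronized, so only one thread can operate on it at a time, which adds synchronization overhead.

3. Ordered by insertion order.

4. Elements may be duplicated.

5. get and set are O(1); add at the end is amortized O(1), while inserting or removing at an arbitrary index is O(n). Internally it uses an array, so reads are fast and insertions are slower. Unlike ArrayList, it doubles its capacity on each resize by default.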

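(4) Stack

1. Elements may be null.

2. Thread-safe.

3. Ordered by insertion order.

4. Elements may be duplicated.

5. Built on Vector (it extends Vector), adding the push, pop, peek, and search methods on top of it.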
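The four methods named in point 5 can be tried with a short sketch. The class StackDemo and its sample helper are made up for illustration.

```java
import java.util.Stack;

class StackDemo {
    // Builds a stack with "c" on top.
    static Stack<String> sample() {
        Stack<String> stack = new Stack<>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        return stack;
    }

    public static void main(String[] args) {
        Stack<String> stack = sample();
        System.out.println(stack.peek());      // c: looks at the top without removing it
        System.out.println(stack.search("a")); // 3: 1-based distance from the top
        System.out.println(stack.pop());       // c: removes and returns the top
        System.out.println(stack.size());      // 2
    }
}
```

Because Stack inherits Vector's synchronization, single-threaded code can get the same push and pop operations from a Deque implementation without that overhead.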