13 collection operations and optimizations that Java programmers must master

This article first introduces the main Java collection interfaces, then describes their implementation classes in detail, including LinkedList, ArrayList, Vector, Stack, Hashtable, HashMap, and WeakHashMap, and finally explains some practical experience with their implementation and use, with a focus on WeakHashMap. I hope it gives readers a working understanding of how the collections operate and what to watch out for.

Real projects deal with large numbers of objects, and managing them efficiently and conveniently has a significant effect on a program's performance and maintainability. Java provides the collections framework to address this. Linear lists, linked lists, and hash tables are common data structures, and the JDK provides a set of classes that implement them for Java development. All of these classes are in the java.util package. Listing 1 shows the relationships between the collection classes.

Listing 1. Relationships between collection classes

Collection
├List
│├LinkedList
│├ArrayList
│└Vector
│ └Stack
└Set
Map
├Hashtable
├HashMap
└WeakHashMap

This article summarizes practical experience with the collections framework. Note that all code in this article is based on JDK 7.

Ⅰ. Collection interfaces

Collection Interface

Collection is the most basic collection interface. A Collection represents a group of Objects, known as its elements. Some Collections allow duplicate elements and others do not; some are ordered and others are not. The JDK does not provide any class that implements Collection directly; the classes it provides implement sub-interfaces of Collection, such as List and Set. By convention, every class that implements Collection should provide two standard constructors: a no-argument constructor that creates an empty Collection, and a constructor that takes a Collection argument and creates a new Collection containing the same elements as the one passed in. The second constructor lets the user copy a Collection.

How to traverse each element in the Collection?

Whatever the actual type of a Collection, it supports an iterator() method that returns an iterator, which can be used to visit each element of the Collection in turn. Typical usage is as follows:

Iterator it = collection.iterator();
// Obtain an iterator
while(it.hasNext()){
    Object obj = it.next();
    // Get the next element
}

The two interfaces derived from Collection are List and Set.

The main methods provided by the Collection interface:

  1. boolean add(Object o): adds an object to the collection;
  2. boolean remove(Object o): removes the specified object;
  3. int size(): returns the number of elements currently in the collection;
  4. boolean contains(Object o): checks whether the collection contains the specified object;
  5. boolean isEmpty(): checks whether the collection is empty;
  6. Iterator iterator(): returns an iterator;
  7. boolean containsAll(Collection c): checks whether the collection contains all elements of collection c;
  8. boolean addAll(Collection c): adds all elements of collection c to the collection;
  9. void clear(): removes all elements from the collection;
  10. boolean removeAll(Collection c): removes from the collection all elements that are also in collection c;
  11. boolean retainAll(Collection c): removes from the collection all elements that are not in collection c.
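
As a brief illustration of the bulk operations listed above, the following sketch uses the copy constructor together with retainAll, removeAll, and containsAll. The class name BulkOperationsDemo, its helper methods, and the sample values are illustrative assumptions and do not come from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class BulkOperationsDemo {
    // Returns a new list holding the elements of a that also occur in b.
    public static List<Integer> intersection(Collection<Integer> a, Collection<Integer> b) {
        List<Integer> result = new ArrayList<Integer>(a); // copy constructor
        result.retainAll(b);                              // keep only elements also in b
        return result;
    }

    // Returns a new list holding the elements of a that do not occur in b.
    public static List<Integer> difference(Collection<Integer> a, Collection<Integer> b) {
        List<Integer> result = new ArrayList<Integer>(a);
        result.removeAll(b);                              // drop every element found in b
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(3, 4, 5);
        System.out.println(intersection(a, b));                  // [3, 4]
        System.out.println(difference(a, b));                    // [1, 2]
        System.out.println(a.containsAll(Arrays.asList(1, 2)));  // true
    }
}
```

Both helpers copy their first argument before modifying it, so the caller's collections are left unchanged.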

List Interface

List is an ordered Collection. This interface gives precise control over where each element is inserted. Users can access elements by index (the position of the element in the List, similar to an array subscript), much as with a Java array. Unlike Set, discussed below, List allows duplicate elements.

In addition to the iterator() method required by the Collection interface, List provides a listIterator() method that returns a ListIterator. Compared with the standard Iterator interface, ListIterator has extra methods such as add() that allow elements to be added, removed, and set, and allow traversal forward or backward. Common classes that implement List include LinkedList, ArrayList, Vector, and Stack.

The main methods provided by the List interface:

  1. void add(int index, Object element): inserts an object at the specified position;
  2. boolean addAll(int index, Collection c): inserts the elements of collection c at the specified position;
  3. Object get(int index): returns the element at the specified position in the List;
  4. int indexOf(Object o): returns the position of the first occurrence of element o;
  5. Object remove(int index): removes the element at the specified position;
  6. Object set(int index, Object element): replaces the element at position index with element and returns the replaced element.
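
The ListIterator capabilities described above (set, add, and backward traversal) can be sketched as follows. This is a minimal example under stated assumptions: the class name ListIteratorDemo, the two helper methods, and the sample data are hypothetical and are not taken from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorDemo {
    // Doubles every element in place and inserts a 0 after each negative value.
    public static List<Integer> transform(List<Integer> input) {
        List<Integer> list = new ArrayList<Integer>(input);
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            int value = it.next();
            it.set(value * 2);   // replace the element just returned by next()
            if (value < 0) {
                it.add(0);       // insert right after it; iteration continues past the 0
            }
        }
        return list;
    }

    // Walks the list from the end to the start using hasPrevious()/previous().
    public static List<Integer> reversed(List<Integer> input) {
        List<Integer> result = new ArrayList<Integer>();
        ListIterator<Integer> it = input.listIterator(input.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(transform(Arrays.asList(1, -2, 3))); // [2, -4, 0, 6]
        System.out.println(reversed(Arrays.asList(1, 2, 3)));   // [3, 2, 1]
    }
}
```

Note the order of the calls: set() is only valid directly after next() or previous(), so the sketch calls set() before add().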

Map Interface

Map does not extend the Collection interface. A Map provides a mapping from keys to values. A Map cannot contain duplicate keys, and each key maps to at most one value. The Map interface provides three collection views: the contents of a Map can be viewed as a set of keys, a collection of values, or a set of key-value mappings.

The main methods provided by Map:

  1. boolean equals(Object o): compares objects for equality;
  2. Object remove(Object key): removes the mapping for the specified key;
  3. Object put(Object key, Object value): adds a key and its value.
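
The three collection views of a Map mentioned above can be sketched as follows. The class name MapViewsDemo and the helper describe are illustrative assumptions; LinkedHashMap is used here only so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapViewsDemo {
    // Builds "key=value" strings by walking the entry view of the map.
    public static List<String> describe(Map<String, Integer> map) {
        List<String> lines = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("one", 10); // same key: the old value 1 is replaced

        System.out.println(map.keySet());  // view of the keys:   [one, two]
        System.out.println(map.values());  // view of the values: [10, 2]
        System.out.println(describe(map)); // view of the entries: [one=10, two=2]
    }
}
```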

RandomAccess Interface

RandomAccess is a marker interface; it declares no methods. Any object that implements RandomAccess can be regarded as supporting fast random access. The main purpose of the interface is to identify List implementations that support fast random access. The array-based List implementations in the JDK implement RandomAccess, whereas the linked-list-based implementation does not, because only an array can be randomly accessed quickly; random access to a linked list requires traversing it. The benefit of this interface is that code can tell whether the List being processed supports fast random access and take a different approach for each kind of List to improve performance.
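
A minimal sketch of the idea described above, choosing a traversal strategy based on the RandomAccess marker. The class name RandomAccessDemo and the sum helper are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class RandomAccessDemo {
    // Sums a list, choosing the traversal strategy from the RandomAccess marker.
    public static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed access is cheap for array-backed lists.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // For linked lists, get(i) walks the list, so use an iterator instead.
            for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
                total += it.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(new ArrayList<Integer>(data) instanceof RandomAccess);  // true
        System.out.println(new LinkedList<Integer>(data) instanceof RandomAccess); // false
        System.out.println(sum(new ArrayList<Integer>(data)));  // 10
        System.out.println(sum(new LinkedList<Integer>(data))); // 10
    }
}
```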

Ⅱ. Collections Introduction

LinkedList class

LinkedList implements the List interface and allows null elements. It also provides additional get, remove, and insert methods that operate on the head or tail of the list. These operations allow a LinkedList to be used as a stack, a queue, or a double-ended queue (deque). Note that LinkedList has no synchronized methods and is not thread-safe: if multiple threads access a List at the same time, you must synchronize access yourself. One solution is to create a synchronized List when the List is constructed:

List list = Collections.synchronizedList(new LinkedList(...));
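
The head and tail operations described above can be sketched as follows, using one LinkedList as a queue, a stack, and a deque in turn. The class name LinkedListDequeDemo and the run helper are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class LinkedListDequeDemo {
    // Exercises queue, stack, and deque style access and records what each call returns.
    public static List<String> run() {
        List<String> seen = new ArrayList<String>();
        LinkedList<String> list = new LinkedList<String>();

        // Queue style: add at the tail, remove from the head (FIFO).
        list.addLast("a");
        list.addLast("b");
        list.addLast("c");
        seen.add(list.removeFirst());   // "a"

        // Stack style: push and pop both operate on the head (LIFO).
        list.push("x");
        seen.add(list.pop());           // "x"

        // Deque style: look at both ends without removing anything.
        seen.add(list.getFirst());      // "b"
        seen.add(list.getLast());       // "c"
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [a, x, b, c]
    }
}
```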

ArrayList class

ArrayList implements a resizable array. It allows all elements, including null. The size, isEmpty, get, and set methods run in constant time. The add method runs in amortized constant time, so adding n elements takes O(n) time. The other methods run in linear time.

Each ArrayList instance has a capacity, which is the size of the array used to store its elements. The capacity grows automatically as new elements are added. Before inserting a large number of elements, you can call the ensureCapacity method to increase the capacity of the ArrayList in advance and make insertion more efficient. Like LinkedList, ArrayList is unsynchronized.

The main methods provided by ArrayList:

  1. boolean add(Object o): appends the specified element to the end of the list;
  2. void add(int index, Object element): inserts the specified element at the specified position in the list;
  3. boolean addAll(Collection c): appends the specified collection to the end of the list;
  4. boolean addAll(int index, Collection c): inserts the specified collection at the specified position in the list;
  5. void clear(): removes all elements from the list;
  6. Object clone(): returns a copy of the list instance;
  7. boolean contains(Object o): checks whether the list contains the element;
  8. void ensureCapacity(int m): increases the capacity of the list, if necessary, so that it can hold m elements;
  9. Object get(int index): returns the element at the specified position in the list;
  10. int indexOf(Object elem): returns the index of the specified element in the list;
  11. int size(): returns the number of elements currently in the list.
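
As a small sketch of the ensureCapacity advice above, the following code reserves the backing array before a bulk insert. The class name EnsureCapacityDemo, the fill helper, and the element count are assumptions chosen for illustration; no timing figures are claimed.

```java
import java.util.ArrayList;

public class EnsureCapacityDemo {
    // Fills a list with n elements, reserving the backing array up front.
    public static ArrayList<Integer> fill(int n) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.ensureCapacity(n);   // reserve space once instead of growing repeatedly
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fill(1000000);
        System.out.println(list.size());   // 1000000
        System.out.println(list.get(42));  // 42
    }
}
```

When the final size is known at construction time, passing it to the ArrayList(int initialCapacity) constructor achieves the same effect.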

Vector class

Vector is very similar to ArrayList; the difference is that Vector is synchronized. An Iterator created by a Vector has the same interface as one created by an ArrayList. Although the methods of Vector are synchronized, iteration as a whole is not: if one thread has created an Iterator and is using it while another thread changes the state of the Vector (for example, by adding or removing elements), the next call to a method of the Iterator throws a ConcurrentModificationException, so the exception must be caught.
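
The exception described above can be reproduced in a small sketch. To keep the result deterministic, this example modifies the Vector from the same thread that is iterating, which triggers the same fail-fast check as a modification from another thread. The class name FailFastDemo and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Vector;

public class FailFastDemo {
    // Returns true if modifying the vector during iteration triggers the fail-fast check.
    public static boolean modifyWhileIterating() {
        Vector<String> v = new Vector<String>(Arrays.asList("a", "b", "c"));
        try {
            for (Iterator<String> it = v.iterator(); it.hasNext();) {
                String s = it.next();
                if (s.equals("a")) {
                    v.add("d"); // structural change made behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator itself is the supported way to delete while iterating.
    public static Vector<String> removeThroughIterator() {
        Vector<String> v = new Vector<String>(Arrays.asList("a", "b", "c"));
        for (Iterator<String> it = v.iterator(); it.hasNext();) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating());  // true
        System.out.println(removeThroughIterator()); // [a, c]
    }
}
```

In multithreaded code, a common alternative to catching the exception is to hold the Vector's lock (synchronized (v) { ... }) for the whole iteration.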

Stack class

Stack extends Vector and implements a last-in, first-out (LIFO) stack. It provides five additional methods that allow a Vector to be used as a stack: the basic push and pop methods; peek, which returns the top element of the stack without removing it; empty, which tests whether the stack is empty; and search, which finds the position of an element in the stack. Note that a newly created Stack is empty.
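
The five Stack methods named above can be sketched as follows. The class name StackDemo and the run helper are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

public class StackDemo {
    // Calls each of the five stack methods and records the results as strings.
    public static List<String> run() {
        List<String> seen = new ArrayList<String>();
        Stack<String> stack = new Stack<String>();
        seen.add(String.valueOf(stack.empty()));     // "true": a new Stack has no elements
        stack.push("a");
        stack.push("b");
        stack.push("c");
        seen.add(stack.peek());                      // "c": top element, not removed
        seen.add(String.valueOf(stack.search("a"))); // "3": 1-based distance from the top
        seen.add(stack.pop());                       // "c": top element, removed
        seen.add(String.valueOf(stack.search("z"))); // "-1": not on the stack
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [true, c, 3, c, -1]
    }
}
```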

Set Interface

A Set is a Collection that contains no duplicate elements; that is, for any two elements e1 and e2, e1.equals(e2) is false. A Set contains at most one null element. Set constructors therefore carry a constraint: a Set built from a Collection argument never contains duplicate elements. Be careful with mutable objects: if a mutable element of a Set changes its own state in a way that affects equals comparisons, problems can result.
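
The warning about mutable elements can be illustrated with a short sketch. It uses a mutable ArrayList as the element of a HashSet; the class name MutableElementDemo and its method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MutableElementDemo {
    // Shows that a HashSet can lose track of an element mutated after insertion.
    public static boolean foundAfterMutation() {
        Set<List<String>> set = new HashSet<List<String>>();
        List<String> element = new ArrayList<String>();
        element.add("x");
        set.add(element);   // stored under the hash code of ["x"]

        element.add("y");   // changes what equals() and hashCode() report
        // The lookup now uses the hash code of ["x", "y"] and misses the stored entry.
        return set.contains(element);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

The element is still physically inside the set, but it can no longer be found or removed by value, which is why immutable elements are the safer choice.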

Hashtable class

Hashtable implements the Map interface and provides a hash table that maps keys to values. Any non-null object can be used as a key or a value. Use put(key, value) to add data and get(key) to retrieve it; both basic operations take constant time.

Hashtable performance is tuned through two parameters, initial capacity and load factor. The default load factor of 0.75 usually gives a good balance between time and space. Increasing the load factor saves space but increases lookup time, which affects operations such as get and put. As a simple example, the code in Listing 2 puts the numbers 1, 2, and 3 into a Hashtable under the keys "one", "two", and "three".

Listing 2. Hashtable example

Hashtable numbers = new Hashtable();
numbers.put("one", new Integer(1));
numbers.put("two", new Integer(2));
numbers.put("three", new Integer(3));

To retrieve a number, such as 2, use the corresponding key, as shown in Listing 3.

Listing 3. Reading data from a Hashtable

Integer n = (Integer)numbers.get("two");
System.out.println("two = " + n);

Because the position of a value is determined by applying a hash function to its key, any object used as a key must implement the hashCode and equals methods. Both are inherited from the root class Object. If you use a class of your own as a key, be very careful to respect the hash function contract: if two objects are equal, that is, obj1.equals(obj2) is true, their hash codes must be the same; if two objects are different, their hash codes need not be different. When two different objects have the same hash code, the situation is called a collision. Collisions increase the time cost of hash table operations, so a well-defined hashCode() method speeds up the hash table.

If equal objects have different hash codes, operations on the hash table give unexpected results (for example, get returns null when a value was expected). To avoid this problem, override the equals and hashCode methods together rather than just one of them.
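
A minimal sketch of a key class that overrides equals and hashCode together, as recommended above. The Point class, the PointKeyDemo wrapper, and the lookup helper are hypothetical and are not part of the original article.

```java
import java.util.HashMap;
import java.util.Map;

public class PointKeyDemo {
    // A hypothetical key class that overrides equals() and hashCode() together.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y; // equal points always produce the same hash code
        }
    }

    // Stores a value under Point(1, 2) and looks it up with a fresh Point(x, y).
    public static String lookup(int x, int y) {
        Map<Point, String> map = new HashMap<Point, String>();
        map.put(new Point(1, 2), "first");
        return map.get(new Point(x, y));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1, 2)); // first (a different but equal instance finds the entry)
        System.out.println(lookup(2, 1)); // null  (not equal, so no entry is found)
    }
}
```

If hashCode were not overridden here, the two equal Point(1, 2) instances would normally report different identity hash codes and the first lookup would return null.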

HashMap class

HashMap is similar to Hashtable, except that HashMap is unsynchronized and allows null, that is, null values and a null key. However, when a HashMap is viewed as a Collection (the values() method returns a Collection), the time cost of iterating over it is proportional to the capacity of the HashMap. Therefore, if iteration performance matters, do not set the initial capacity too high or the load factor too low.

WeakHashMap class

WeakHashMap is a variant of HashMap that holds its keys through weak references: once a key is no longer referenced from outside the map, it can be reclaimed by the garbage collector.

Ⅲ. Collections Practice

ArrayList, Vector, and LinkedList all descend from AbstractList (LinkedList through AbstractSequentialList). AbstractList implements the List interface directly and extends AbstractCollection. ArrayList and Vector are implemented with arrays. ArrayList provides no thread synchronization for any of its methods and is therefore not thread-safe, while most Vector methods are synchronized, which makes Vector a thread-safe implementation. LinkedList uses a doubly linked list made up of a series of entries; each entry always has three parts: the element, a reference to the previous entry, and a reference to the next entry.

When an ArrayList needs more capacity than its current array provides, the array must be expanded. Expansion involves a large amount of array copying, which ultimately calls System.arraycopy(). Because LinkedList uses a linked structure, it does not need to maintain a capacity, but every added element requires a new Entry object and extra assignments. When add is called frequently this affects performance, and continually creating new objects consumes resources. An array, by contrast, is contiguous, so when elements are always appended at the end, expansion and copying happen only when space runs out.

ArrayList is array-based, and an array is a contiguous block of memory. Inserting an element at an arbitrary position forces every element after that position to be shifted, so efficiency is poor; insert data at the tail whenever possible. Inserting into a LinkedList does not degrade performance in this way.

Every removal of an element from an ArrayList requires the array to be reorganized. The closer the removed element is to the front, the greater the overhead; the closer it is to the end, the smaller the overhead. To remove an element in the middle, a LinkedList has to traverse half of the list.

Listing 4. ArrayList and LinkedList usage code

import java.util.ArrayList;
import java.util.LinkedList;
public class ArrayListandLinkedList {
    public static void main(String[] args){
        long start = System.currentTimeMillis();
        ArrayList list = new ArrayList();
        Object obj = new Object();
        for (int i=0;i<5000000;i++){
            list.add(obj);
        }
        long end = System.currentTimeMillis();
        System.out.println(end-start);
        start = System.currentTimeMillis();
        LinkedList list1 = new LinkedList();
        Object obj1 = new Object();
        for (int i=0;i<5000000;i++){
            list1.add(obj1);
        }
        end = System.currentTimeMillis();
        System.out.println(end-start);
        start = System.currentTimeMillis();
        Object obj2 = new Object();
        for (int i=0;i<1000;i++){
            list.add(0,obj2);
        }
        end = System.currentTimeMillis();
        System.out.println(end-start);
        start = System.currentTimeMillis();
        Object obj3 = new Object();
        for (int i=0;i<1000;i++){
            list1.add(0,obj3);
        }
        end = System.currentTimeMillis();
        System.out.println(end-start);
        start = System.currentTimeMillis();
        list.remove(0);
        end = System.currentTimeMillis();
        System.out.println(end-start);
        start = System.currentTimeMillis();
        list1.remove(250000);
        end = System.currentTimeMillis();
        System.out.println(end-start);
    }
}

Listing 5. Run output

639
1296
6969
0
0
15

HashMap applies a hash algorithm to the key and maps the hash value to a memory address, so that the data for a key can be fetched directly. The underlying data structure of HashMap is an array, and the so-called memory address is the array index. For HashMap to perform well, the following must hold:

  • The hash algorithm must be efficient;
  • The algorithm that maps a hash value to a memory address (array index) must be fast;
  • The value must be directly obtainable from the memory address (array index).

HashMap is in fact an array of linked lists. With this linked-list-based mechanism, as long as hashCode() and hash() are implemented well enough to keep collisions to a minimum, operations on a HashMap are almost equivalent to random access on an array and perform well. If hashCode() or hash() is implemented poorly and there are many collisions, however, the HashMap effectively degenerates into a few linked lists, operations on it become list traversals, and performance is poor.

A functional drawback of HashMap is that it is unordered: when you iterate over a HashMap, the elements do not come out in any particular order. If you want to preserve the order in which elements were added, use LinkedHashMap instead.

LinkedHashMap extends HashMap, so it is efficient, and on top of HashMap it maintains an internal linked list that records the order of the elements.
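
The ordering behavior of LinkedHashMap can be sketched as follows, covering both insertion order and access order. The class name LinkedHashMapOrderDemo, the keyOrder helper, and the sample keys are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LinkedHashMapOrderDemo {
    // Returns the key order after inserting c, a, b and then reading "c".
    public static List<String> keyOrder(boolean accessOrder) {
        // The three-argument constructor selects access order (true) or insertion order (false).
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("c", 3);
        map.put("a", 1);
        map.put("b", 2);
        map.get("c"); // in access order, this moves "c" to the end
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keyOrder(false)); // [c, a, b]  insertion order
        System.out.println(keyOrder(true));  // [a, b, c]  least recently accessed first
    }
}
```

Access order is the basis for simple least-recently-used caches built on LinkedHashMap.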

HashMap uses its hash algorithm to make put() and get() as fast as possible. TreeMap provides a completely different Map implementation. Functionally, TreeMap is more powerful than HashMap: it implements the SortedMap interface, which means it can sort its elements. TreeMap is slightly slower than HashMap. If you need sorted elements, HashMap cannot provide them, whereas iterating over a TreeMap yields the elements in key order. LinkedHashMap orders elements by the order in which they were inserted or accessed, whereas TreeMap orders them by the inherent order of their keys (determined by a Comparator or by Comparable).

The code in Listing 6 demonstrates using TreeMap to sort data in business logic.

Listing 6. Sorting with TreeMap

import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
public class Student implements Comparable<Student>{
    public String name;
    public int score;
    public Student(String name,int score){
        this.name = name;
        this.score = score;
    }
    @Override
    // Tell TreeMap how to sort
    public int compareTo(Student o) {
        // TODO Auto-generated method stub
        if(o.score<this.score){
            return 1;
        } else if(o.score>this.score){
            return -1;
        }
        return 0;
    }
    @Override
    public String toString(){
        StringBuffer sb = new StringBuffer();
        sb.append("name:");
        sb.append(name);
        sb.append(" ");
        sb.append("score:");
        sb.append(score);
        return sb.toString();
    }
    public static void main(String[] args){
        TreeMap map = new TreeMap();
        Student s1 = new Student("1",100);
        Student s2 = new Student("2",99);
        Student s3 = new Student("3",97);
        Student s4 = new Student("4",91);
        map.put(s1, new StudentDetailInfo(s1));
        map.put(s2, new StudentDetailInfo(s2));
        map.put(s3, new StudentDetailInfo(s3));
        map.put(s4, new StudentDetailInfo(s4));
        // Print the students whose scores lie between s4 (inclusive) and s2 (exclusive)
        Map map1=((TreeMap)map).subMap(s4, s2);
        for (Iterator iterator=map1.keySet().iterator();iterator.hasNext();){
            Student key = (Student)iterator.next();
            System.out.println(key+"->"+map.get(key));
        }
        System.out.println("subMap end");
        // Print the students whose scores are lower than that of s1
        map1=((TreeMap)map).headMap(s1);
        for (Iterator iterator=map1.keySet().iterator();iterator.hasNext();){
            Student key = (Student)iterator.next();
            System.out.println(key+"->"+map.get(key));
        }
        System.out.println("subMap end");
        // Print the students whose scores are at or above that of s1
        map1=((TreeMap)map).tailMap(s1);
        for (Iterator iterator=map1.keySet().iterator();iterator.hasNext();){
            Student key = (Student)iterator.next();
            System.out.println(key+"->"+map.get(key));
        }
        System.out.println("subMap end");
    }
}
class StudentDetailInfo{
    Student s;
    public StudentDetailInfo(Student s){
        this.s = s;
    }
    @Override
    public String toString(){
        return s.name + "'s detail information";
    }
}

Listing 7. Run output

name:4 score:91->4's detail information
name:3 score:97->3's detail information
subMap end
name:4 score:91->4's detail information
name:3 score:97->3's detail information
name:2 score:99->2's detail information
subMap end
name:1 score:100->1's detail information
subMap end

A distinguishing feature of WeakHashMap is that when nothing other than the map itself refers to a key, the map automatically discards the entry. The code in Listing 8 declares two Map objects, one a HashMap and one a WeakHashMap, and puts the two objects a and b into both maps. After a is removed from the HashMap and both a and b are set to null, the entry for a in the WeakHashMap is automatically reclaimed. This happens because, once a has been removed from the HashMap and set to null, nothing other than the WeakHashMap refers to the object, so the WeakHashMap discards it. Although b is also set to null, the HashMap still holds a reference to that object, so the WeakHashMap keeps its entry.

Listing 8. WeakHashMap sample code

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.WeakHashMap;
public class WeakHashMapTest {
    public static void main(String[] args) throws Exception {
        String a = new String("a");
        String b = new String("b");
        Map weakmap = new WeakHashMap();
        Map map = new HashMap();
        map.put(a, "aaa");
        map.put(b, "bbb");
        weakmap.put(a, "aaa");
        weakmap.put(b, "bbb");
        map.remove(a);
        a=null;
        b=null;
        System.gc();
        Iterator i = map.entrySet().iterator();
        while (i.hasNext()) {
            Map.Entry en = (Map.Entry)i.next();
            System.out.println("map:"+en.getKey()+":"+en.getValue());
        }
        Iterator j = weakmap.entrySet().iterator();
        while (j.hasNext()) {
            Map.Entry en = (Map.Entry)j.next();
            System.out.println("weakmap:"+en.getKey()+":"+en.getValue());
        }
    }
}

Listing 9. Run output

map:b:bbb
weakmap:b:bbb

WeakHashMap removes entries that are no longer in use mainly through its internal expungeStaleEntries method, which is how it releases memory automatically. Almost any access to the contents of a WeakHashMap calls this method and thereby clears internal entries that are no longer referenced externally. But what if a WeakHashMap is created in advance and never accessed before the GC runs: is its memory then never released?

Listing 10. WeakHashMapTest1

import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;
public class WeakHashMapTest1 {
    public static void main(String[] args) throws Exception {
        List<WeakHashMap<byte[][], byte[][]>> maps = new ArrayList<WeakHashMap<byte[][], byte[][]>>();
        for (int i = 0; i < 1000; i++) {
            WeakHashMap<byte[][], byte[][]> d = new WeakHashMap<byte[][], byte[][]>();
            d.put(new byte[1000][1000], new byte[1000][1000]);
            maps.add(d);
            System.gc();
            System.err.println(i);
        }
    }
}

When the code in Listing 10 is run without changing any JVM parameters, it throws an out-of-memory error, because the default Java heap here is 64 MB.

Listing 11. Run output

241
242
243
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at WeakHashMapTest1.main(WeakHashMapTest1.java:10)

Sure enough, this time WeakHashMap does not automatically release the unused memory for us. The code shown in Listing 12 does not run out of memory.

Listing 12. WeakHashMapTest2

import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;
public class WeakHashMapTest2 {
    public static void main(String[] args) throws Exception {
        List<WeakHashMap<byte[][], byte[][]>> maps = new ArrayList<WeakHashMap<byte[][], byte[][]>>();
        for (int i = 0; i < 1000; i++) {
            WeakHashMap<byte[][], byte[][]> d = new WeakHashMap<byte[][], byte[][]>();
            d.put(new byte[1000][1000], new byte[1000][1000]);
            maps.add(d);
            System.gc();
            System.err.println(i);
            for (int j = 0; j < i; j++) {
                System.err.println(j + " size" + maps.get(j).size());
            }
        }
    }
}

Running this test produces normal output, and the out-of-memory problem no longer occurs.

In short, WeakHashMap does not release its unused internal objects automatically when you do nothing with it; it releases them only when you access its contents.

WeakHashMap implements weak references because its Entry<K,V> extends WeakReference<K>. The class definition and constructor of WeakHashMap$Entry<K,V> are shown in Listing 13.

Listing 13. WeakHashMap class definition

private static class Entry<K,V> extends WeakReference<K>
        implements Map.Entry<K,V> {
    // (fields and other methods omitted)
    Entry(K key, V value, ReferenceQueue<K> queue,
          int hash, Entry<K,V> next) {
        super(key, queue);
        this.value = value;
        this.hash = hash;
        this.next = next;
    }
}

Note the call to the superclass constructor, super(key, queue): only the key is passed in, so the key is weakly referenced, while the value is held by a strong reference in this.value. On System.gc(), the byte arrays used as keys are reclaimed, but the values are retained (each value is strongly referenced by its Entry, the Entry by its Map, and the Map by the ArrayList).

Each iteration of the for loop creates a new WeakHashMap. After the put, the GC reclaims the byte array key held in the WeakReference and posts a notification to the ReferenceQueue, but nothing afterward triggers the WeakHashMap to process the ReferenceQueue. The WeakReference that wrapped the key therefore stays in the WeakHashMap, and so does the corresponding value.

So when is the value cleared? Comparing the programs in Listing 10 and Listing 12 shows that the call maps.get(j).size() in Listing 12 triggers the reclamation of the values. How does it do that? The WeakHashMap source shows that the size method calls expungeStaleEntries, which walks the entries that the JVM has reclaimed (those in the queue) and sets each entry's value to null so that its memory can be reclaimed. The net effect is that a key is cleared when the GC runs, and its value is cleared when the WeakHashMap is next accessed after the key has been cleared.

WeakHashMap is not synchronized; a synchronized WeakHashMap can be constructed with the Collections.synchronizedMap method. Each key object is stored indirectly as the referent of a weak reference, so a key is removed automatically only after the garbage collector has cleared the weak references to it, both inside and outside the map. Note that the value objects in a WeakHashMap are held by ordinary strong references. Take care, therefore, that value objects do not strongly refer to their own keys, directly or indirectly, because that prevents the keys from being discarded. A value object may refer to its key indirectly through the WeakHashMap itself: a value object may strongly refer to some other key object whose associated value object, in turn, strongly refers to the key of the first value object.

One way to deal with this is to wrap the values themselves in WeakReferences before inserting them, as in m.put(key, new WeakReference(value)), and then to unwrap them on each get. The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Note that the fail-fast behavior of an iterator cannot be guaranteed: generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.
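
Returning to the value-wrapping approach described above, the following is a minimal sketch of a map wrapper that stores values in WeakReferences on put and unwraps them on get. The class name WeakValueDemo and its put and get methods are hypothetical. The example deliberately keeps strong references to the key and the value so that its output does not depend on when the garbage collector runs.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakValueDemo {
    private final Map<Object, WeakReference<Object>> map =
            new WeakHashMap<Object, WeakReference<Object>>();

    // Wraps the value in a WeakReference before inserting it.
    public void put(Object key, Object value) {
        map.put(key, new WeakReference<Object>(value));
    }

    // Unwraps on get; returns null if the key is absent or the value has been collected.
    public Object get(Object key) {
        WeakReference<Object> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    public static void main(String[] args) {
        WeakValueDemo cache = new WeakValueDemo();
        Object key = new Object();
        Object value = new Object();   // strong reference kept on purpose
        cache.put(key, value);
        System.out.println(cache.get(key) == value);         // true while value is strongly reachable
        System.out.println(cache.get(new Object()) == null); // true: unknown key
    }
}
```

The trade-off of this design is that a value can disappear as soon as nothing else refers to it, so callers must always handle a null result from get.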

Summary

From the descriptions and sample code above: if you need stack or queue operations, consider using a List. If you need fast insertion and removal of elements, use LinkedList. If you need fast random access to elements, use ArrayList. If the program is single-threaded, or the collection is accessed from only one thread, consider the unsynchronized classes, which are more efficient. If multiple threads may operate on a collection at the same time, use the synchronized classes. Pay special attention to hash table operations: an object used as a key must override the equals and hashCode methods correctly. Try to return interfaces rather than concrete types, for example List rather than ArrayList, so that if you later need to replace ArrayList with LinkedList the client code does not have to change. This is programming to an abstraction.

This article covers the collections only at the application level. Follow-up articles will look in depth at the source code and describe the specific algorithms behind the implementations; interested readers are encouraged to watch for them.


Origin blog.51cto.com/14409778/2420847