Understand Java collections in one article and stop worrying about how to work with collection data types

1. Introduction

The definition and purpose of collections

A Java collection is a data structure used to store and manipulate a set of objects. It provides a set of interfaces and classes for working with different types of collection data, such as lists, sets, maps, etc.

The main functions of Java collections are:

  1. Storing objects: Java collections can store different types of objects and provide convenient methods to add, delete and access elements in the collection.

  2. Manage data: collections provide a variety of data structures, such as lists, sets, maps, etc., and you can choose an appropriate data structure to manage data according to different needs. For example, lists can store data in the order of insertion, sets can guarantee the uniqueness of elements, maps can store and access data through key-value pairs, etc.

  3. Provide algorithms and operations: Java collections provide a series of algorithms and operations, such as sorting, searching, filtering, etc., which can easily process and operate the elements in the collection.

  4. Improve efficiency: The data structure and algorithm in the collection class are optimized to improve the execution efficiency of the program. For example, when using HashSet to store a large amount of data, you can quickly find elements through the hash algorithm without traversing the entire collection.
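The points above can be sketched in a few lines. This is only an illustration: the class name CollectionBasics and the sample values are assumptions made for the example, not part of the framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CollectionBasics {
    // A list keeps elements in insertion order and allows duplicates.
    static List<String> buildList() {
        List<String> list = new ArrayList<>();
        list.add("apple");
        list.add("banana");
        list.add("apple");
        return list;
    }

    // A set ignores an element it already contains.
    static Set<String> buildSet() {
        Set<String> set = new HashSet<>();
        set.add("apple");
        set.add("banana");
        set.add("apple"); // returns false, the set is unchanged
        return set;
    }

    // A map associates each unique key with one value.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apple", 3);
        stock.put("banana", 5);
        stock.put("apple", 4); // overwrites the previous value for "apple"
        return stock;
    }

    public static void main(String[] args) {
        System.out.println(buildList().size());      // 3
        System.out.println(buildSet().size());       // 2
        System.out.println(buildMap().get("apple")); // 4
    }
}
```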

Basic concepts of the collection framework

The Java Collections Framework is a set of classes and interfaces provided by the Java programming language for storing, manipulating, and processing collections of data. It provides various types of data structures such as lists, sets, maps, etc., along with algorithms and tools for querying and manipulating these data structures. The Java Collections Framework is designed to provide high-performance, reliable, and scalable data structures and algorithms.

The basic concepts of the Java collection framework are as follows:

  1. Interface (Interface): Many interfaces are defined in the Java collection framework, such as List, Set, Map, etc. The interface defines a set of methods for operating collections, which is the core part of the collection framework.

  2. Class (Class): The Java collection framework provides concrete classes that implement interfaces, such as ArrayList, HashSet, HashMap, etc. These classes implement the methods defined by the interface and provide specific data structures and algorithms.

  3. List (List): A list is an ordered collection that can contain repeated elements. The Java collection framework provides multiple list implementation classes, such as ArrayList, LinkedList, and so on.

  4. Set (Set): A set is a collection that contains no duplicate elements and, in general, does not guarantee any particular order. The Java collection framework provides multiple set implementation classes, such as HashSet, TreeSet, and so on.

  5. Map (Map): A map is a collection of key-value pairs, each key corresponds to a value. The Java collection framework provides multiple mapping implementation classes, such as HashMap, TreeMap, and so on.

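  6. Queue (Queue): A queue holds elements waiting to be processed, typically in first-in-first-out (FIFO) order. The Java collection framework provides multiple queue implementation classes, such as LinkedList (FIFO) and PriorityQueue (which orders elements by priority rather than by arrival), and so on.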

  7. Iterator (Iterator): Iterators are used to traverse the elements in the collection, providing a unified way to access the elements of the collection. Through an iterator, each element in the collection can be accessed in turn.

  8. Generics: The Java Collections Framework uses generics to increase type safety and code readability. With generics, you can specify the type of elements stored in the collection.

  9. Algorithm: The Java collection framework provides many algorithms and tool classes for sorting, searching, filtering, and other operations on collections. These algorithms and utility classes make it easy to manipulate the elements in a collection.

The Java collection framework is widely used, and it provides a wealth of data structures and algorithms to meet various programming needs.
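Several of these concepts can be seen together in a short sketch. The class and method names (FrameworkConcepts, sortedCopy, sum) are hypothetical and chosen only for this example; the library calls themselves are standard java.util methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class FrameworkConcepts {
    // Declared against the List interface; ArrayList is one possible implementation class.
    static List<Integer> sortedCopy(List<Integer> input) {
        List<Integer> copy = new ArrayList<>(input); // generics: only Integer elements are accepted
        Collections.sort(copy);                      // algorithm from the Collections utility class
        return copy;
    }

    // Sum the elements with an explicit Iterator.
    static int sum(List<Integer> numbers) {
        int total = 0;
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        Collections.addAll(numbers, 5, 1, 4);
        List<Integer> sorted = sortedCopy(numbers);
        System.out.println(sorted);                              // [1, 4, 5]
        System.out.println(Collections.binarySearch(sorted, 4)); // 1
        System.out.println(sum(numbers));                        // 10
    }
}
```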

Interface and implementation classes of the collection framework

The Java collection framework provides some core interfaces and related implementation classes for storing and manipulating different types of collection data.

  1. Collection interface:

    • List interface: an ordered, repeatable collection whose elements can be accessed by index. Common implementation classes are ArrayList, LinkedList, and Vector.
    • Set interface: An unordered, non-repeatable collection that does not allow storing the same elements. Common implementation classes are HashSet, TreeSet, LinkedHashSet.
  2. Map interface:

    • HashMap class: An unordered collection of key-value pairs, stored and accessed according to the key. Commonly used to quickly find and store related data.
    • TreeMap class: An ordered collection of key-value pairs, stored and accessed according to the natural order of the keys or the order defined by a specific comparator.
    • LinkedHashMap class: An ordered collection of key-value pairs, stored and accessed in insertion order or access order.
  3. Queue interface:

    • LinkedList class: implements the data structure of the queue and supports the first-in-first-out (FIFO) element access method.
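  4. Stack (a legacy class, not an interface):

    • Stack class: Inherited from the Vector class, it implements the last-in-first-out (LIFO) element access method. For new code, a Deque implementation such as ArrayDeque is generally preferred for stack behavior.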
  5. Utility class:

    • Collections class: Provides a set of static methods for common operations on collection objects, such as sorting, searching, reversing, etc.

In addition to the above interfaces and implementation classes, the Java collection framework also provides other interfaces and implementation classes for specific collection requirements. For example:

  • Deque interface: Double-ended queue, which can insert and delete elements at both ends.
  • PriorityQueue class: a priority queue whose head is always the smallest element according to the natural ordering or a supplied Comparator, so the element with the highest priority is retrieved first.
  • BitSet class: a collection of bits for storing and manipulating bit information. It lives in java.util but does not implement the Collection interface.
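A minimal sketch of the first two items, assuming the illustrative class name DequeAndPriorityQueue and sample integers: a Deque used as a LIFO stack, and a PriorityQueue that returns its elements in priority order regardless of insertion order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

class DequeAndPriorityQueue {
    // Use a Deque as a LIFO stack: push and pop both work on the head.
    static List<Integer> stackOrder() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        List<Integer> popped = new ArrayList<>();
        while (!stack.isEmpty()) {
            popped.add(stack.pop());
        }
        return popped;
    }

    // With natural ordering, poll() always returns the smallest remaining element.
    static List<Integer> priorityOrder() {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        queue.add(30);
        queue.add(10);
        queue.add(20);
        List<Integer> polled = new ArrayList<>();
        while (!queue.isEmpty()) {
            polled.add(queue.poll());
        }
        return polled;
    }

    public static void main(String[] args) {
        System.out.println(stackOrder());    // [3, 2, 1]
        System.out.println(priorityOrder()); // [10, 20, 30]
    }
}
```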

The inheritance structure of the collection framework

  1. Collection interface: It is the basic interface of all collections, which defines the basic operations on collection elements. It inherits from the Iterable interface, indicating that it supports iterative access.

  2. List interface: Inherited from the Collection interface, it represents an ordered and repeatable collection. The List interface provides methods for accessing and manipulating collection elements by index. Common implementation classes are ArrayList, LinkedList, and Vector.

  3. Set interface: Inherited from the Collection interface, it represents an unordered and non-repeatable collection. The Set interface does not allow storing identical elements. Common implementation classes are HashSet, TreeSet, LinkedHashSet.

  4. Queue interface: Inherited from the Collection interface, it represents a special collection used to implement the data structure of the queue (first in, first out). The Queue interface provides methods for adding, removing and inspecting elements. Common implementation classes are LinkedList and PriorityQueue.

  5. Deque interface: Inherited from the Queue interface, it represents a double-ended queue, which can insert and delete elements at both ends. The Deque interface provides methods for inserting and deleting operations at the head and tail of the queue. Common implementation classes are LinkedList and ArrayDeque.

  6. Map interface: represents a collection of key-value pairs, each key is unique. The Map interface defines methods that operate on key-value pairs. Common implementation classes are HashMap, TreeMap, and LinkedHashMap.

In addition to the above-mentioned basic interfaces, there are some sub-interfaces and implementation classes inherited from the above-mentioned interfaces, which are used to extend the functions of the collection framework.

2. List collection

The interface and implementation class of the List collection

List is one of the interfaces in the Java collection framework, which inherits from the Collection interface. List represents an ordered and repeatable collection, each element has a corresponding index, and elements can be accessed and manipulated through the index.

  1. List interface:

    • Method: The List interface inherits from the Collection interface. In addition to the methods of the Collection interface, it also provides some unique methods, such as operating elements according to the index, obtaining sublists, and so on.
    • Implementation class: The main implementation classes of the List interface include ArrayList, LinkedList and Vector.
  2. ArrayList class:

    • The List interface is implemented, and the bottom layer uses arrays to store elements, supporting dynamic expansion.
    • Advantages: Random access to elements is fast and efficient; it is suitable for frequent random access and traversal operations.
    • Disadvantages: The efficiency of inserting and deleting elements is relatively low, and a large number of elements need to be moved.
    • Not thread-safe: ArrayList is not thread safe. If several threads modify the list, wrap it with the synchronizedList method of the Collections tool class, or use the thread-safe implementation class Vector.
  3. LinkedList class:

    • The List and Deque interfaces are implemented, and the bottom layer uses a doubly linked list to store elements.
    • Advantages: Inserting and deleting elements does not require moving other elements, and operations at the head and tail are efficient. Note that reaching a position in the middle by index still requires traversing the list.
    • Disadvantages: Random access to elements is relatively inefficient.
    • It is better to use LinkedList in scenarios that require frequent insertion and deletion of elements.
  4. Vector class:

    • The List interface is implemented, and the bottom layer also uses an array to store elements, similar to ArrayList.
    • Advantages: thread-safe, can be used in a multi-threaded environment; supports dynamic expansion.
    • Disadvantages: Compared with ArrayList, the performance is slightly worse.
    • Note: Vector is an earlier collection implementation class. In modern development, it is generally recommended to use ArrayList or LinkedList.

The difference and advantages and disadvantages of ArrayList and LinkedList

ArrayList and LinkedList are two common implementation classes of the List interface, which differ in underlying data structure and performance characteristics.

ArrayList:

  • The underlying data structure: ArrayList is implemented using an array and maintains an array of Object type internally, with a default capacity of 10 (in recent JDK versions the array is allocated lazily, when the first element is added).
  • Advantages:
    • Fast random access to elements: Because it is implemented based on an array, elements can be directly accessed through indexes, and the search and read efficiency is high.
    • High space efficiency: Compared with a linked list, the array stores elements contiguously and needs no per-element node objects or pointers.
    • Efficient iteration: traversal over the contiguous array is fast, which suits frequent random access and traversal operations.
  • Disadvantages:
    • Inserting and deleting elements is less efficient: Inserting or deleting an element requires shifting all elements after it, which is most expensive in the middle or at the beginning of the array.
    • Dynamic expansion overhead: When the number of elements exceeds the current capacity, a larger array is allocated and the existing elements are copied into it, which can temporarily hurt performance.

LinkedList:

  • The underlying data structure: LinkedList is implemented using a doubly linked list, and each node holds the element value plus pointers to the previous and next nodes.
  • Advantages:
    • Efficient insertion and deletion at a known position: once the node has been reached (for example through an iterator), inserting or deleting only adjusts the pointers of adjacent nodes and no elements are shifted. Reaching a position by index still takes O(n).
    • Support efficient head and tail operations: insertion and deletion at the head and tail take constant time.
  • Disadvantages:
    • Random access to elements is relatively inefficient: finding the element at a specific index requires traversing the linked list from the beginning or the end, and the time complexity is O(n).
    • Requires more storage space: Compared with arrays, linked lists require additional pointers to maintain the connection relationship between nodes.

Choosing ArrayList or LinkedList depends on specific application scenarios and operational requirements:

  • If frequent random access, read, and traversal operations are required, but fewer insertion and deletion operations, you can choose ArrayList, which has higher access and read performance.
  • If you need to perform frequent insertion and deletion operations, especially at the head or tail of the linked list, it is more appropriate to choose LinkedList, which has higher insertion and deletion performance.
  • If a multi-threaded environment is involved, it is recommended to use a thread-safe Vector or use the synchronizedList method of the Collections tool class to wrap ArrayList/LinkedList.
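The following sketch illustrates these choices under a few assumptions: the class name ListChoice and the helper methods fill and headAndTail are made up for the example. It shows that code written against the List interface works with either implementation, that LinkedList offers head and tail operations, and how Collections.synchronizedList wraps an existing list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class ListChoice {
    // Works with any List implementation; the caller decides which one fits.
    static List<Integer> fill(List<Integer> target, int count) {
        for (int i = 0; i < count; i++) {
            target.add(i);
        }
        return target;
    }

    // LinkedList also implements Deque, so head and tail updates need no shifting.
    static List<Integer> headAndTail() {
        LinkedList<Integer> linked = new LinkedList<>();
        fill(linked, 5);     // [0, 1, 2, 3, 4]
        linked.addFirst(-1); // [-1, 0, 1, 2, 3, 4]
        linked.removeLast(); // [-1, 0, 1, 2, 3]
        return linked;
    }

    public static void main(String[] args) {
        // Index-based reads are constant time on an ArrayList.
        List<Integer> arrayList = fill(new ArrayList<Integer>(), 5);
        System.out.println(arrayList.get(3)); // 3

        System.out.println(headAndTail());    // [-1, 0, 1, 2, 3]

        // The wrapper synchronizes each individual method call on the list.
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>(arrayList));
        shared.add(5);
        System.out.println(shared.size());    // 6
    }
}
```

Note that the synchronized wrapper only protects single calls; iterating over it from several threads still requires synchronizing on the list manually.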

Common operations and methods of List collection

  1. Add element operation:

    • boolean add(E element): Add the specified element to the end of the list.
    • void add(int index, E element): Insert the specified element at the specified index position.
    • boolean addAll(Collection<? extends E> c): Adds all elements in the specified collection to the end of the list.
    • boolean addAll(int index, Collection<? extends E> c): Insert all elements in the specified collection at the specified index position.
  2. Delete element operation:

    • boolean remove(Object o): Removes the first occurrence of the specified element from the list.
    • E remove(int index): Delete the element at the specified index position.
    • boolean removeAll(Collection<?> c): Removes from the list all elements that are the same as those in the specified collection.
    • void clear(): Clear all elements in the list.
  3. Get element operations:

    • E get(int index): Returns the element at the specified index position.
    • int indexOf(Object o): Returns the index of the first occurrence of the specified element.
    • int lastIndexOf(Object o): Returns the index of the last occurrence of the specified element.
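    • List<E> subList(int fromIndex, int toIndex): Returns a view of the portion of the list from fromIndex (inclusive) to toIndex (exclusive). The view is backed by the original list, so changes to one are visible in the other.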
  4. Modify element operations:

    • E set(int index, E element): Replace the element at the specified index position with the specified element.
  5. Check if an element exists:

    • boolean contains(Object o): Determine whether the list contains the specified element.
    • boolean containsAll(Collection<?> c): Determine whether the list contains all elements in the specified collection.
  6. Check size and emptiness:

    • int size(): Returns the number of elements in the list.
    • boolean isEmpty(): Check if the list is empty.
  7. Traverse operation:

    • Use a for-each loop to iterate over the elements in the list.
    • Use iterator (Iterator) to traverse the elements in the list.
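A short sketch of several of the methods listed above. The class name ListOperations and the sample data are illustrative. It also shows one pitfall worth knowing: on a List<Integer>, remove(int) removes by index, so removing by value needs an Integer object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListOperations {
    static List<String> demo() {
        List<String> fruits = new ArrayList<>(Arrays.asList("apple", "banana", "cherry"));
        fruits.add("date");       // [apple, banana, cherry, date]
        fruits.add(1, "avocado"); // [apple, avocado, banana, cherry, date]
        fruits.set(0, "apricot"); // [apricot, avocado, banana, cherry, date]
        fruits.remove("banana");  // remove by value: [apricot, avocado, cherry, date]
        fruits.remove(0);         // remove by index: [avocado, cherry, date]
        return fruits;
    }

    // Wrap the value in an Integer so that remove(Object) is chosen instead of remove(int).
    static List<Integer> removeValue(List<Integer> numbers, int value) {
        List<Integer> copy = new ArrayList<>(numbers);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<String> fruits = demo();
        System.out.println(fruits);                   // [avocado, cherry, date]
        System.out.println(fruits.indexOf("cherry")); // 1
        System.out.println(fruits.contains("apple")); // false
        System.out.println(fruits.subList(1, 3));     // [cherry, date]
        System.out.println(removeValue(Arrays.asList(10, 20, 30), 20)); // [10, 30]
    }
}
```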

How to traverse the List collection

  1. Use for loop to traverse:

    for (int i = 0; i < list.size(); i++) {
        E element = list.get(i);
        // operate on the element
    }
    

    The elements in the list can be accessed one by one through the index, and corresponding operations can be performed.

  2. Traversing with an enhanced for loop (foreach loop):

    for (E element : list) {
        // operate on the element
    }
    

    The enhanced for loop can directly traverse the elements in the list without using the index.

  3. Use iterator (Iterator) to traverse:

    Iterator<E> iterator = list.iterator();
    while (iterator.hasNext()) {
        E element = iterator.next();
        // operate on the element
    }
    

    Get the iterator object by calling the iterator() method, use the hasNext() method to check whether there is a next element, and use the next() method to get the next element for operation.

  4. Traversing using the Stream API (Java 8 and above):

    list.stream().forEach(element -> {
        // operate on the element
    });
    

    Use the list's stream() method to convert the list to a stream, and then use the forEach() method to operate on each element in the stream. If no other stream operations are needed, list.forEach(...) does the same without creating a stream.

3. Set collection

The interface and implementation class of the Set collection

The Set interface is part of the Java collections framework, which extends the Collection interface. Set represents a collection that does not allow duplicate elements.

The features of the Set interface mainly include:

  1. Duplicates are not allowed: elements in a Set are unique. If you try to add an element that is already present, add() returns false and the Set is left unchanged.
  2. Unordered: The Set interface itself does not guarantee any order. Different implementation classes store and iterate elements in different ways, so unless the implementation documents an order (as LinkedHashSet and TreeSet do), the iteration order may differ from the insertion order.
  3. Null elements: HashSet and LinkedHashSet can store a single null element; some implementations, such as a TreeSet that uses natural ordering, reject null.

Common Set implementation classes include:

  • HashSet : Implemented based on a hash table. HashSet uses a HashMap internally and locates elements by their hash code, so adding, deleting, and finding elements take O(1) time on average. It is the most commonly used Set implementation class, but it does not guarantee the order of elements.
  • LinkedHashSet : Inherited from HashSet, based on a hash table and a linked list. It maintains the insertion order of elements through the linked list, so elements are traversed in the order they were added. Performance is slightly lower than HashSet because of the extra links, but the basic operations are still O(1) on average.
  • TreeSet : Based on a red-black tree. The elements in a TreeSet are kept sorted according to their natural order (or the order defined by a Comparator), so TreeSet can provide ordered operations such as range views, at a cost of O(log n) per basic operation.
  • EnumSet : A Set specially used to store values of a single enumeration type. Since the values of an enumeration type are limited and fixed, EnumSet is implemented internally as a bit vector, which makes it very compact and efficient.

It should be noted that the Set interface itself cannot be instantiated directly, and a specific implementation class must be used to create an object. The sample code is as follows:

Set<String> set = new HashSet<>();  // create a HashSet object

According to requirements and specific scenarios, select the appropriate Set implementation class. HashSet is the most commonly used Set implementation class, suitable for most situations. If you need to maintain insertion order or sort according to natural order, you can choose LinkedHashSet or TreeSet. EnumSet is the best choice when you need to store elements of enumeration type because it is efficient.
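The ordering guarantees of these implementations can be compared with a small sketch. The class name SetOrdering, the Day enum, and the input words are assumptions made for the example. HashSet is left out of the printed results because its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrdering {
    enum Day { MON, TUE, WED } // hypothetical enum for the EnumSet example

    static final List<String> INPUT = Arrays.asList("pear", "apple", "pear", "fig");

    // LinkedHashSet drops the duplicate and keeps first-insertion order.
    static List<String> insertionOrder() {
        Set<String> set = new LinkedHashSet<String>(INPUT);
        return new ArrayList<String>(set);
    }

    // TreeSet drops the duplicate and keeps elements sorted.
    static List<String> sortedOrder() {
        Set<String> set = new TreeSet<String>(INPUT);
        return new ArrayList<String>(set);
    }

    // EnumSet iterates in the declaration order of the enum constants.
    static Set<Day> selectedDays() {
        return EnumSet.of(Day.WED, Day.MON);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // [pear, apple, fig]
        System.out.println(sortedOrder());    // [apple, fig, pear]
        System.out.println(selectedDays());   // [MON, WED]
    }
}
```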

The difference and advantages and disadvantages of HashSet and TreeSet

  1. The internal implementation mechanism is different :

    • HashSet uses a hash table (implemented by HashMap) to store elements, indexed according to the hash code of the element. Therefore, HashSet has fast insertion, deletion, and lookup operations with an average time complexity of O(1).
    • TreeSet stores elements based on a red-black tree (implemented by TreeMap), and it sorts elements according to their natural order (or the order defined by the Comparator interface). Therefore, TreeSet provides some ordered collection operations, such as range search and sorting.
  2. Element order :

    • The elements in HashSet have no specific order, and the access order and insertion order of elements are not guaranteed to be the same. Because HashSet uses a hash table to store data, the position of an element in the hash table is determined by the hash code of the element, not the order of insertion.
    • The elements in a TreeSet are ordered: they are sorted according to the natural order of the elements, or according to a Comparator passed to the constructor.
  3. Performance :

    • The average time complexity of insertion, deletion and lookup operations of HashSet is O(1). However, when the hash table collides, the performance will suffer, and in extreme cases, the worst time complexity may reach O(n).
    • The time complexity of insertion, deletion, and lookup operations of a TreeSet is O(log n), where n is the number of elements in the TreeSet. Due to the self-balancing nature of the red-black tree, it can maintain relatively stable performance.
  4. Cases where null elements are allowed to be stored :

    • HashSet allows storing a null element (at most one, since elements are unique); the underlying hash table treats null as a special case.
    • TreeSet with natural ordering does not allow null elements: sorting requires comparing elements, and comparing null throws a NullPointerException.

To sum up, HashSet is suitable for scenarios that require fast insertion, deletion, and search and do not care about the order of elements. TreeSet is suitable for scenarios that require ordered collection operations, such as range search and sorted traversal, but its O(log n) operations are slower than the average O(1) operations of HashSet. Also note that neither HashSet nor TreeSet is thread-safe. If they are used in a multi-threaded environment, they need to be wrapped with the synchronizedSet method (or synchronizedSortedSet for a TreeSet) provided by the Collections tool class.
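The ordered operations that only TreeSet offers can be sketched as follows. The class name TreeSetRanges and the score values are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetRanges {
    static TreeSet<Integer> scores() {
        return new TreeSet<Integer>(Arrays.asList(70, 85, 92, 60, 78));
    }

    // Same elements, sorted by a Comparator instead of natural order.
    static TreeSet<Integer> descendingScores() {
        TreeSet<Integer> descending = new TreeSet<Integer>(Comparator.<Integer>reverseOrder());
        descending.addAll(scores());
        return descending;
    }

    public static void main(String[] args) {
        TreeSet<Integer> scores = scores();
        System.out.println(scores.first());        // 60
        System.out.println(scores.last());         // 92
        System.out.println(scores.ceiling(80));    // 85 (smallest element >= 80)
        System.out.println(scores.floor(80));      // 78 (largest element <= 80)
        System.out.println(scores.headSet(78));    // [60, 70] (strictly less than 78)
        System.out.println(scores.subSet(70, 90)); // [70, 78, 85] (70 inclusive, 90 exclusive)
        System.out.println(descendingScores());    // [92, 85, 78, 70, 60]
    }
}
```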

Common operations and methods of Set collection

  1. Add element :

    • boolean add(E element): Add the specified element to the Set. If the element already exists in the Set, the add operation fails and false is returned.
    • Sample code:
      Set<String> set = new HashSet<>();
      set.add("apple");
      set.add("banana");
      
  2. Remove elements :

    • boolean remove(Object o): Remove the specified element from the Set. The remove operation succeeds and returns true if the element exists in the Set.
    • Sample code:
      set.remove("apple");
      
  3. Check if element exists :

    • boolean contains(Object o): Determine whether the Set contains the specified element. Returns true if the element exists in the Set.
    • Sample code:
      boolean contains = set.contains("banana");
      
  4. Get the number of elements :

    • int size(): Returns the number of elements in the Set.
    • Sample code:
      int size = set.size();
      
  5. Determine whether it is an empty set :

    • boolean isEmpty(): Determine whether the Set is an empty set. Returns true if the Set contains no elements.
    • Sample code:
      boolean isEmpty = set.isEmpty();
      
  6. Empty the collection :

    • void clear(): Empty all elements in the Set, making it an empty set.
    • Sample code:
      set.clear();
      
  7. Iterate over elements :

    • Iterator<E> iterator(): Returns an iterator for traversing the elements in the Set.
    • Sample code:
      Iterator<String> iterator = set.iterator();
      while (iterator.hasNext()) {
          String element = iterator.next();
          // process the element
      }
      

It should be noted that the Set collection does not provide a method to access elements by index, because a Set has no positional index.

In addition, the Set interface also inherits methods defined in the Collection interface, such as addAll(), removeAll(), retainAll(), and containsAll(), which perform operations between sets (union, difference, intersection, and subset test, respectively).
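As a sketch of these bulk methods used as set algebra (the class name SetAlgebra and its helper methods are hypothetical), each helper copies its first argument so the inputs are not modified. TreeSet is used only to make the printed order predictable.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetAlgebra {
    // Union: every element that is in a or in b.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.addAll(b);
        return result;
    }

    // Intersection: only the elements present in both sets.
    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.retainAll(b);
        return result;
    }

    // Difference: the elements of a that are not in b.
    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<Integer>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<Integer>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new TreeSet<Integer>(Arrays.asList(2, 3, 4));
        System.out.println(union(a, b));        // [1, 2, 3, 4]
        System.out.println(intersection(a, b)); // [2, 3]
        System.out.println(difference(a, b));   // [1]
        System.out.println(a.containsAll(intersection(a, b))); // true
    }
}
```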

How to traverse the Set collection

  1. Iterator traversal :
    An iterator is a general way to traverse the elements of a collection, and it allows the current element to be removed during the traversal through its remove() method. The iterator of the Set collection is obtained by calling the iterator() method, and then the elements are traversed using the hasNext() and next() methods.

    Sample code:

    Set<String> set = new HashSet<>();
    // add elements...

    Iterator<String> iterator = set.iterator();
    while (iterator.hasNext()) {
        String element = iterator.next();
        // process the element
    }
    
  2. Enhanced for loop traversal :
    The enhanced for loop is a simplified way of traversing the elements of a collection, suitable for traversing the entire collection. It doesn't require explicit use of iterators, just specify a variable to receive each element.

    Sample code:

    Set<String> set = new HashSet<>();
    // add elements...

    for (String element : set) {
        // process the element
    }
    
  3. forEach() method traversal (Java 8+) :
    The forEach() method introduced in Java 8 provides a more concise way of traversal, which can use Lambda expressions to process each element.

    Sample code:

    Set<String> set = new HashSet<>();
    // add elements...

    set.forEach(element -> {
        // process the element
    });
    

Any of these methods can be used to traverse all elements in the Set collection. Note that you should not add or remove elements through the Set itself during traversal; otherwise a ConcurrentModificationException may be thrown. If elements must be removed while traversing, use the iterator's remove() method. In addition, a HashSet has no fixed order, so there is no guarantee that the traversal order matches the order in which elements were added.
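Two safe ways to remove elements while walking a Set can be sketched like this. The class name SafeRemoval and the length-based filter are assumptions for the example; Iterator.remove() and Collection.removeIf() (Java 8+) are standard methods.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

class SafeRemoval {
    // Remove short words through the iterator itself, which keeps the traversal consistent.
    static Set<String> removeWithIterator(Set<String> words) {
        Set<String> copy = new TreeSet<String>(words);
        Iterator<String> it = copy.iterator();
        while (it.hasNext()) {
            if (it.next().length() < 4) {
                it.remove();
            }
        }
        return copy;
    }

    // Java 8+ alternative: let the collection apply the condition.
    static Set<String> removeWithRemoveIf(Set<String> words) {
        Set<String> copy = new TreeSet<String>(words);
        copy.removeIf(word -> word.length() < 4);
        return copy;
    }

    public static void main(String[] args) {
        Set<String> words = new TreeSet<String>(Arrays.asList("fig", "pear", "kiwi", "apple"));
        System.out.println(removeWithIterator(words)); // [apple, kiwi, pear]
        System.out.println(removeWithRemoveIf(words)); // [apple, kiwi, pear]
    }
}
```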

4. Map collection

The interface and implementation class of the Map collection

Map is an interface in the Java collection framework for storing key-value pairs, which provides a way to map keys to values. In Map, keys are unique, but values can be repeated.

Commonly used Map implementation classes:

  1. HashMap :
    HashMap is the most commonly used Map implementation class. It is implemented based on a hash table and allows null to be used as keys and values. HashMap does not guarantee the traversal order and has high performance.

  2. TreeMap :
    TreeMap is implemented based on a red-black tree, and the keys are sorted in their natural order or in the order of a custom comparator. TreeMap does not allow null as a key, but allows null as a value.

  3. LinkedHashMap :
    LinkedHashMap inherits from HashMap, and the underlying layer is implemented using a hash table and a doubly linked list. By default it preserves the order in which elements were inserted, and it can optionally be configured to iterate in access order.

  4. Hashtable :
    Hashtable is an early Map implementation class based on a hash table. It does not allow null as a key or value. It is thread-safe but comparatively slow, and it is generally not recommended for new code.
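The iteration order of these implementations can be compared in a short sketch. The class name MapOrdering and its helper methods are illustrative. HashMap is omitted from the printed results because its order is unspecified.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrdering {
    // Insert the same three entries into whichever Map is passed in, then list its keys.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("pear", 1);
        map.put("apple", 2);
        map.put("fig", 3);
        return new ArrayList<String>(map.keySet());
    }

    static List<String> accessOrderKeys() {
        // The third constructor argument (true) switches from insertion order to access order.
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, true);
        keysAfterInserts(map);
        map.get("pear"); // reading an entry moves it to the end of the order
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserts(new LinkedHashMap<String, Integer>())); // [pear, apple, fig]
        System.out.println(keysAfterInserts(new TreeMap<String, Integer>()));       // [apple, fig, pear]
        System.out.println(accessOrderKeys());                                      // [apple, fig, pear]
    }
}
```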

The difference and advantages and disadvantages of HashMap and TreeMap

HashMap:

  • Features:

    • Based on the hash table implementation, the hash value of the key is used for fast lookup and storage.
    • The traversal order is not guaranteed, i.e. the storage order of the elements has nothing to do with the insertion order.
    • Allows one null key and any number of null values.
    • Not thread safe.
  • Advantages:

    • Lookup, insertion, and deletion take O(1) time on average.
    • It is suitable for scenarios that do not need to care about the order of elements, such as quickly finding, storing, and deleting elements.
    • The memory footprint is relatively small.
  • Disadvantages:

    • The traversal order of elements is not guaranteed, so it is not possible to iterate in insertion order or key order.
    • HashMap is not thread-safe. If it needs to be used in a multi-threaded environment, external synchronization is required.

TreeMap:

  • Features:

    • Based on red-black tree implementation, sort by the natural order of the keys or a custom comparator.
    • Keeps the keys in sorted order at all times, which supports range queries and ordered traversal by key.
    • Null is not allowed as a key when natural ordering is used (a NullPointerException is thrown), but null is allowed as a value.
    • Not thread safe.
  • Advantages:

    • TreeMap guarantees the order of elements, and can iterate according to the natural order of keys or the order of custom comparators.
    • Applicable to scenarios that require traversal and range query in the order of keys.
    • Some additional methods are provided, such as firstKey(), lastKey(), etc.
  • Disadvantages:

    • The time complexity of inserting, deleting, and searching operations is O(logN), and the performance is slightly lower than that of HashMap.
    • The memory usage is relatively large.

Choosing to use HashMap or TreeMap depends on the actual application scenario:

  • If you require high performance for element insertion, deletion, and search operations, and you don't care about the order of elements, you can choose to use HashMap.
  • If you need to perform operations such as traversal and range query according to the order of the keys, or need to sort according to the natural order of the keys, you can choose to use TreeMap.
  • In a multi-threaded environment, thread safety needs to be considered. ConcurrentHashMap can be used instead of HashMap or ConcurrentSkipListMap can be used instead of TreeMap.
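A sketch of the key-order operations that distinguish TreeMap from HashMap. The class name TreeMapNavigation and the grading thresholds are made up for the example; firstKey, lastKey, headMap, and floorEntry are standard TreeMap methods.

```java
import java.util.Map;
import java.util.TreeMap;

class TreeMapNavigation {
    // Minimum score (key) required for each grade (value); sample data only.
    static TreeMap<Integer, String> grades() {
        TreeMap<Integer, String> grades = new TreeMap<Integer, String>();
        grades.put(90, "A");
        grades.put(80, "B");
        grades.put(70, "C");
        grades.put(60, "D");
        return grades;
    }

    // floorEntry finds the entry with the greatest key that is <= score.
    static String gradeFor(int score) {
        Map.Entry<Integer, String> entry = grades().floorEntry(score);
        return entry == null ? "F" : entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> grades = grades();
        System.out.println(grades.firstKey());  // 60
        System.out.println(grades.lastKey());   // 90
        System.out.println(grades.headMap(80)); // {60=D, 70=C} (keys strictly below 80)
        System.out.println(gradeFor(85));       // B
        System.out.println(gradeFor(42));       // F
    }
}
```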

Common operations and methods of Map collection

  1. Add and modify elements:

    • V put(K key, V value): Add the specified key-value pair to the Map. If the key already exists, the original value will be overwritten and the old value will be returned.
    • void putAll(Map<? extends K, ? extends V> map): Add all key-value pairs in the specified Map to the current Map.
  2. Get element:

    • V get(Object key): Returns the value corresponding to the specified key, or null if the key does not exist (or if the key is explicitly mapped to null).
    • V getOrDefault(Object key, V defaultValue): Returns the value corresponding to the specified key, or the default value if the key does not exist.
    • boolean containsKey(Object key): Determine whether the specified key is contained in the Map.
    • boolean containsValue(Object value): Determine whether the specified value is contained in the Map.
  3. Remove elements:

    • V remove(Object key): Remove the key-value pair in the Map according to the key, and return the removed value (or null if the key was not present).
    • void clear(): Clear all key-value pairs in the Map.
  4. Get key, value and number of key-value pairs:

    • Set<K> keySet(): Returns the Set collection composed of all keys in the Map.
    • Collection<V> values(): Returns a collection of all values in the Map.
    • Set<Map.Entry<K, V>> entrySet(): Returns a Set collection containing all key-value pairs.
    • int size(): Returns the number of key-value pairs in the Map.
    • boolean isEmpty(): Determine whether the Map is empty.
  5. Other common methods:

    • boolean equals(Object obj): Determine whether the current Map is equal to the specified object.
    • int hashCode(): Returns the hash code value of the current Map.
    • void replaceAll(BiFunction<? super K, ? super V, ? extends V> function): Replace each entry's value with the result of applying the specified function to that entry's key and value.
    • void forEach(BiConsumer<? super K, ? super V> action): Perform the specified action on each key-value pair.
    • V compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction): Computes a new value for the given key from the key and its current value (null if absent) and stores the result in the Map; if the function returns null, the mapping is removed.
    • boolean replace(K key, V oldValue, V newValue): Replace the value for the given key, only if the old value is equal to the specified value.
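Several of these methods are commonly combined to count occurrences. The sketch below assumes the illustrative class name WordCount and sample words; TreeMap is used only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCount {
    // Count with getOrDefault + put.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String word : words) {
            counts.put(word, counts.getOrDefault(word, 0) + 1);
        }
        return counts;
    }

    // Same result with compute: the current value is null the first time a word is seen.
    static Map<String, Integer> countWithCompute(List<String> words) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String word : words) {
            counts.compute(word, (key, current) -> current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("a", "b", "a"));
        System.out.println(counts);                      // {a=2, b=1}
        counts.replaceAll((key, value) -> value * 10);
        System.out.println(counts);                      // {a=20, b=10}
        System.out.println(counts.replace("a", 20, 99)); // true
        System.out.println(counts.get("a"));             // 99
    }
}
```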

How to traverse the Map collection

  1. Use keySet() to iterate over the keys and look up each value:

    Map<K, V> map = ...;
    for (K key : map.keySet()) {
        V value = map.get(key);
        // operate on key and value
    }
    

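    Obtain the Set of all keys in the Map through the keySet() method, traverse the keys with an enhanced for loop, and look up the corresponding value for each key. Because every iteration performs an extra get() lookup, entrySet() is usually the better choice when both keys and values are needed.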

  2. Use entrySet() to iterate over key-value pairs:

    Map<K, V> map = ...;
    for (Map.Entry<K, V> entry : map.entrySet()) {
        K key = entry.getKey();
        V value = entry.getValue();
        // Work with key and value here
    }

    The entrySet() method returns a Set containing all key-value pairs, each represented by a Map.Entry<K, V>, from which you can read the key and the value in turn.

  3. Use the forEach() method to iterate over key-value pairs:

    Map<K, V> map = ...;
    map.forEach((key, value) -> {
        // Work with key and value here
    });

    In Java 8 and above, the Map interface provides a forEach() method that accepts a lambda expression and applies it to each key-value pair.

  4. Use an iterator to iterate over key-value pairs:

    Map<K, V> map = ...;
    Iterator<Map.Entry<K, V>> iterator = map.entrySet().iterator();
    while (iterator.hasNext()) {
        Map.Entry<K, V> entry = iterator.next();
        K key = entry.getKey();
        V value = entry.getValue();
        // Work with key and value here
    }

    Obtain the entry Set through entrySet(), call iterator() on it, and then read the key and value of each entry in turn.

When traversing a Map, choose the traversal method that fits your needs. Do not structurally modify the Map (add or remove entries) during traversal other than through the iterator; otherwise a ConcurrentModificationException may be thrown. If entries must be removed while traversing, use the iterator's remove() method to delete the entry most recently returned by next().
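
Removing entries during traversal can be sketched as below. The class name SafeRemovalDemo, the dropBelow helper, and the score data are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class SafeRemovalDemo {
    // Removes every entry whose value is below the threshold, using the iterator's own remove().
    static Map<String, Integer> dropBelow(Map<String, Integer> scores, int threshold) {
        Iterator<Map.Entry<String, Integer>> it = scores.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (entry.getValue() < threshold) {
                it.remove(); // safe: the iterator is aware of this structural change
                // calling scores.remove(entry.getKey()) here instead would risk
                // a ConcurrentModificationException on the next it.next()
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("alice", 90);
        scores.put("bob", 55);
        scores.put("carol", 72);
        System.out.println(dropBelow(scores, 60).size()); // 2

        // Since Java 8 the same effect is available as a one-liner on the entry set
        scores.entrySet().removeIf(e -> e.getValue() < 80);
        System.out.println(scores); // {alice=90}
    }
}
```

For simple conditions, removeIf() on entrySet(), keySet(), or values() is usually the shorter choice; the explicit iterator is useful when the loop body does more than decide whether to remove.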

5. Queue collection

The interface and implementation class of the Queue collection

Queue is the interface in the Java collection framework that represents a queue, typically first in, first out (FIFO): elements are added at the tail and removed from the head.

In Java, the commonly used implementation classes of the Queue interface are:

  1. LinkedList : LinkedList implements the Queue interface and provides a wealth of queue operation methods. It is based on a linked list structure, which can efficiently add and delete elements, and is suitable for frequent insertion and deletion operations.
  2. ArrayDeque : ArrayDeque implements the Deque interface (which extends Queue) and is backed internally by a resizable array. It adds and removes elements efficiently at both ends of the queue, and it does not permit null elements.
  3. PriorityQueue : PriorityQueue is a priority queue class that implements the Queue interface in the Java collection framework. It is an ordered queue, sorted according to the priority of the elements.

These implementation classes are not thread-safe. If you need a queue in a multi-threaded environment, use a thread-safe implementation such as ConcurrentLinkedQueue or ArrayBlockingQueue.
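
The difference between the three implementations shows up in the order in which elements come back out. The following is a minimal sketch; the class name QueueOrderDemo and the drain helper are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

class QueueOrderDemo {
    // Offers the given values to the queue, then polls until it is empty.
    static List<Integer> drain(Queue<Integer> queue, int... values) {
        for (int v : values) {
            queue.offer(v);
        }
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new LinkedList<>(), 5, 1, 3));    // [5, 1, 3]  (FIFO)
        System.out.println(drain(new ArrayDeque<>(), 5, 1, 3));    // [5, 1, 3]  (FIFO)
        System.out.println(drain(new PriorityQueue<>(), 5, 1, 3)); // [1, 3, 5]  (priority order)
    }
}
```

Because all three are used through the Queue interface, switching implementation only requires changing the constructor call.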

Common operations and methods of Queue

  1. Add element:

    • boolean add(E e): Add the specified element to the tail of the queue; throws IllegalStateException if a capacity-restricted queue is full.
    • boolean offer(E e): Add the specified element to the tail of the queue, or return false if the queue is full.
  2. Remove element:

    • E remove(): Remove and return the element at the head of the queue; throws NoSuchElementException if the queue is empty.
    • E poll(): Remove and return the element at the head of the queue, or return null if the queue is empty.
  3. Get the head element:

    • E element(): Return the element at the head of the queue without removing it; throws NoSuchElementException if the queue is empty.
    • E peek(): Return the element at the head of the queue without removing it, or return null if the queue is empty.
  4. Check if the queue is empty:

    • boolean isEmpty(): Determine whether the queue is empty.
  5. Get the queue size:

    • int size(): Returns the number of elements in the queue.

The Queue interface inherits the Collection interface, and has special definitions for adding, deleting, and getting elements to meet the characteristics of the queue.
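
Each queue operation comes as a pair: one method that throws on failure and one that returns a special value. The sketch below makes the difference visible with a queue of capacity one; the class name QueueMethodPairsDemo and the describe helper are hypothetical, and ArrayBlockingQueue is used only because it is a bounded queue from the JDK.

```java
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class QueueMethodPairsDemo {
    // Records what each method does on a full and then on an empty queue.
    static String describe() {
        Queue<String> queue = new ArrayBlockingQueue<>(1); // capacity of one element
        StringBuilder log = new StringBuilder();

        log.append(queue.offer("first")).append(' ');  // true: there was room
        log.append(queue.offer("second")).append(' '); // false: the queue is full
        try {
            queue.add("second");                       // same situation, but add() throws
        } catch (IllegalStateException e) {
            log.append("add-threw ");
        }

        log.append(queue.peek()).append(' ');          // first (still in the queue)
        log.append(queue.poll()).append(' ');          // first (now removed)
        log.append(queue.poll()).append(' ');          // null: the queue is empty
        try {
            queue.remove();                            // same situation, but remove() throws
        } catch (NoSuchElementException e) {
            log.append("remove-threw");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
        // true false add-threw first first null remove-threw
    }
}
```

In code that treats a full or empty queue as a normal condition, the offer()/poll()/peek() family avoids exception handling; the add()/remove()/element() family suits cases where such a state indicates a bug.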

Commonly used Queue implementation classes such as LinkedList and ArrayDeque also provide some other methods:

  • boolean contains(Object o): Determine whether the queue contains the specified element.
  • void clear(): Clear all elements in the queue.
  • Iterator<E> iterator(): Returns the iterator of the queue elements, which can be used to traverse the elements in the queue.

Note that the Queue interface provides no method for directly accessing elements in the middle of the queue: queue operations act on the head and the tail, following the first-in, first-out principle (or priority order, in the case of PriorityQueue).

Features and usage of PriorityQueue

Features
  1. Priority sorting: PriorityQueue determines the priority of elements by their natural ordering or by a custom comparator. By default, elements must implement Comparable and the smallest element has the highest priority. You can specify a different ordering by passing a Comparator to the constructor.

  2. Underlying data structure: PriorityQueue is backed by an array that stores a binary heap (a complete binary tree laid out in an array), so inserting an element and removing the head each take O(log n) time.

  3. Duplicates allowed, null not allowed: PriorityQueue accepts duplicate elements, and elements that compare as equal are returned in no guaranteed order. It does not permit null elements; adding null throws NullPointerException.

  4. Provide queue operation methods: PriorityQueue implements the Queue interface and provides common queue operation methods, including adding, deleting, and obtaining elements.

  5. Does not guarantee iteration order: PriorityQueue does not guarantee that elements will be returned in priority order when iterating. Even when insertions are prioritized, iterations may return elements stored in a different order.

  6. Fast access to the smallest element: PriorityQueue provides the peek() method, which returns the highest-priority element (the smallest one under the queue's ordering) in constant time without removing it.

  7. Applicable to scenarios such as task scheduling and event sequencing: The characteristics of PriorityQueue make it especially suitable for scenarios such as task scheduling and event sequencing. By adding tasks or events to PriorityQueue according to priority and taking them out of the queue for execution, tasks or events with high priority can be processed first.

  • If the elements do not implement the Comparable interface, you must pass a custom comparator when creating the PriorityQueue to specify their ordering; otherwise adding elements can fail with ClassCastException.
  • When using a custom comparator, make sure its logic is consistent; otherwise the PriorityQueue will return elements in the wrong order.
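
A task-scheduling use of PriorityQueue with a custom comparator might look like the sketch below. The Task class, its fields, and the convention that a smaller number means higher urgency are assumptions made for this example. The last lines also print the size of a queue after a repeated value has been added.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TaskQueueDemo {
    // Hypothetical task type; it deliberately does not implement Comparable.
    static final class Task {
        final String name;
        final int priority; // smaller number = more urgent (an assumption of this sketch)

        Task(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Polls tasks in priority order and returns their names.
    static List<String> runOrder(List<Task> tasks) {
        PriorityQueue<Task> queue =
                new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
        queue.addAll(tasks);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(
                new Task("write report", 3),
                new Task("fix outage", 1),
                new Task("review PR", 2));
        System.out.println(runOrder(tasks)); // [fix outage, review PR, write report]

        // A repeated value is stored twice
        PriorityQueue<Integer> numbers = new PriorityQueue<>(List.of(7, 7, 2));
        System.out.println(numbers.size());  // 3
    }
}
```

Comparator.comparingInt keeps the ordering rule in one place and avoids the overflow risk of subtracting two int values by hand.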
Usage
  1. Create a PriorityQueue: The no-argument constructor creates an empty PriorityQueue whose elements are sorted in natural order. The constructor that takes a comparator creates a PriorityQueue with a custom ordering.

    // Create an empty PriorityQueue, sorted in natural (ascending) order by default
    PriorityQueue<Integer> pq1 = new PriorityQueue<>();

    // Create a PriorityQueue with a specified ordering: descending order
    PriorityQueue<Integer> pq2 = new PriorityQueue<>(Collections.reverseOrder());

    // Create a PriorityQueue with a custom comparator (Integer.compare avoids the overflow that b - a can cause)
    PriorityQueue<Integer> pq3 = new PriorityQueue<>((a, b) -> Integer.compare(b, a));
    
  2. Add elements: Use the add() or offer() method to add elements to the PriorityQueue.

    pq1.add(5);
    pq1.offer(10);
    pq1.add(3);
    
  3. Get the first element of the queue: Use the peek() method to get the element with the highest priority in the PriorityQueue without deleting it.

    Integer highestPriorityElement = pq1.peek();
    
  4. Delete the first element of the queue: Use the poll() method to delete and return the element with the highest priority in the PriorityQueue.

    Integer removedElement = pq1.poll();
    
  5. Check whether the queue is empty: Use the isEmpty() method to determine whether the PriorityQueue is empty.

    boolean isQueueEmpty = pq1.isEmpty();
    
  6. Get the queue size: Use the size() method to get the number of elements in the PriorityQueue.

    int queueSize = pq1.size();
    
  7. Iterate through elements: Neither the iterator nor toArray() returns the elements of a PriorityQueue in priority order. Use them when order does not matter; to process elements in priority order, sort the resulting array or call poll() repeatedly.

    // Traverse with an iterator (order is not guaranteed)
    Iterator<Integer> iterator = pq1.iterator();
    while (iterator.hasNext()) {
        Integer element = iterator.next();
        // Process the element
    }

    // Convert to an array and sort it (java.util.Arrays) to traverse in ascending order
    Integer[] arr = pq1.toArray(new Integer[0]);
    Arrays.sort(arr);
    for (Integer element : arr) {
        // Process the element
    }
    

Note that PriorityQueue has no operation for updating an element in place (for example, changing the element at a given position), because it is designed to process elements by priority. Changing a stored element's priority without re-inserting it leaves the heap in an inconsistent state. To modify an element, remove it first and then add the modified element.
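
The remove-then-add approach can be sketched as follows. The class name PriorityUpdateDemo and the replaceAndPeek helper are hypothetical names chosen for this example.

```java
import java.util.PriorityQueue;

class PriorityUpdateDemo {
    // Replaces oldValue with newValue by removing and re-adding; returns the new head.
    static Integer replaceAndPeek(PriorityQueue<Integer> queue, Integer oldValue, Integer newValue) {
        if (queue.remove(oldValue)) { // remove(Object) searches the queue linearly
            queue.add(newValue);      // re-insertion restores the heap order
        }
        return queue.peek();
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        queue.add(5);
        queue.add(10);
        queue.add(3);
        System.out.println(queue.peek());                 // 3
        System.out.println(replaceAndPeek(queue, 10, 1)); // 1
        System.out.println(replaceAndPeek(queue, 42, 0)); // 1 (42 was absent, so nothing changes)
    }
}
```

Because remove(Object) takes linear time, this pattern is fine for occasional updates but can become a bottleneck if priorities change very often.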

6. Collections utility class

Common methods of the Collections class

The Collections class is a utility class provided by the Java collection framework, which contains a large number of static methods for operating and processing collections.

  1. Sort operation:

    • sort(List<T> list): Sort the specified List in ascending order, requiring elements to implement the Comparable interface.
    • sort(List<T> list, Comparator<? super T> c): Sort the specified List and use a custom Comparator to define the comparison rules between elements.
    • reverse(List<?> list): Reverse the order of the elements in the specified List.
  2. Find and replace operations:

    • binarySearch(List<? extends Comparable<? super T>> list, T key): Use the binary search algorithm to find the specified element in a List sorted in ascending natural order, and return the index of the element.
    • binarySearch(List<? extends T> list, T key, Comparator<? super T> c): Perform a binary search in a List sorted according to the given comparator, and return the index of the element.
    • replaceAll(List<T> list, T oldVal, T newVal): Replace all elements equal to oldVal in List with newVal.
  3. Synchronized (thread-safe) wrappers:

    • synchronizedList(List<T> list): Returns a thread-safe List backed by the given list. Each individual method call is synchronized, but iterating over the list still requires manually synchronizing on the returned List.
    • synchronizedSet(Set<T> s): Returns a thread-safe Set with the same guarantees.
    • synchronizedMap(Map<K,V> m): Returns a thread-safe Map with the same guarantees.
  4. Immutable collection operations:

    • unmodifiableList(List<? extends T> list): Returns an unmodifiable view of the List; any modification through the view throws UnsupportedOperationException, while changes to the underlying list remain visible through it.
    • unmodifiableSet(Set<? extends T> s): Returns an unmodifiable Set.
    • unmodifiableMap(Map<? extends K,? extends V> m): Returns an unmodifiable Map.
  5. Other common methods:

    • max(Collection<? extends T> coll): Return the largest element in the specified Collection, and the elements must implement the Comparable interface.
    • min(Collection<? extends T> coll): Return the smallest element in the specified Collection, and the elements must implement the Comparable interface.
    • isEmpty(): java.util.Collections has no static isEmpty method. To check whether a Collection is empty, call coll.isEmpty() on the collection itself.
    • reverseOrder(): Returns a comparator for sorting in descending order.
    • shuffle(List<?> list): Randomly shuffle the order of the elements in the specified List.
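
The synchronized and unmodifiable wrappers listed above can be sketched as follows. The class name WrapperDemo, the sum helper, and the two-writer scenario are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperDemo {
    // Sums a synchronized list; iteration must hold the list's lock manually.
    static int sum(List<Integer> syncList) {
        int total = 0;
        synchronized (syncList) { // needed while iterating over a synchronizedList
            for (int n : syncList) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.add(1); // each individual call is synchronized
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(sum(shared)); // 2000

        List<Integer> backing = new ArrayList<>(List.of(1, 2));
        List<Integer> readOnly = Collections.unmodifiableList(backing);
        backing.add(3);               // the view reflects changes to the backing list
        System.out.println(readOnly); // [1, 2, 3]
        try {
            readOnly.add(4);
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view");
        }
    }
}
```

If the data must never change, copying it first (for example with List.copyOf in Java 10 and above) avoids the situation where the backing list is still modified elsewhere.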

Sorting method of Collections class

The Collections class provides several methods for ordering a collection. They operate on Lists only: to sort an array use Arrays.sort(), and to sort the elements of a Set copy them into a List first or use a TreeSet.

  1. sort(List<T> list):

    • Method signature: public static <T extends Comparable<? super T>> void sort(List<T> list)
    • Description: Sort the specified List in ascending order. The elements must implement the Comparable interface, that is, be mutually comparable.
    • Example:
      List<Integer> numbers = new ArrayList<>(List.of(5, 2, 8, 1, 9));
      Collections.sort(numbers);
      System.out.println(numbers);  // Output: [1, 2, 5, 8, 9]
      
  2. sort(List<T> list, Comparator<? super T> c):

    • Method signature: public static <T> void sort(List<T> list, Comparator<? super T> c)
    • Description: Sort the specified List, using a custom Comparator to define the ordering of elements. The sort is stable, so elements that compare as equal keep their relative order.
    • Example:
      List<String> names = new ArrayList<>(List.of("Alice", "Bob", "Charlie", "David"));
      Collections.sort(names, (a, b) -> a.length() - b.length());
      System.out.println(names);  // Output: [Bob, Alice, David, Charlie]
      
  3. reverse(List<?> list):

    • Method signature: public static void reverse(List<?> list)
    • Description: Reverses the order of the elements in the specified List.
    • Example:
      List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3, 4, 5));
      Collections.reverse(numbers);
      System.out.println(numbers);  // Output: [5, 4, 3, 2, 1]
      
  4. shuffle(List<?> list):

    • Method signature: public static void shuffle(List<?> list)
    • Description: Randomly shuffle the order of the elements in the specified List.
    • Example:
      List<String> names = new ArrayList<>(List.of("Alice", "Bob", "Charlie", "David"));
      Collections.shuffle(names);
      System.out.println(names);  // Possible output: [David, Alice, Bob, Charlie]
      

The sort(), reverse(), and shuffle() methods above all modify the original List in place, so create a copy of the List first if the original order must be preserved. The two sort() methods run in O(n log n) time, while reverse() and shuffle() run in O(n). Collections.sort() is not parallel; for very large data sets, consider copying the elements into an array and using Arrays.parallelSort(), or sorting through a parallel stream.
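
Sorting a copy so that the original data is preserved can be sketched as below. The class name SortCopyDemo and both helper methods are hypothetical; Arrays.parallelSort is shown as one JDK option for sorting large inputs in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortCopyDemo {
    // Returns a sorted copy and leaves the caller's list untouched.
    static List<Integer> sortedCopy(List<Integer> original) {
        List<Integer> copy = new ArrayList<>(original);
        Collections.sort(copy);
        return copy;
    }

    // Sorts through an array so that Arrays.parallelSort can be used for large inputs.
    static List<Integer> parallelSortedCopy(List<Integer> original) {
        Integer[] array = original.toArray(new Integer[0]);
        Arrays.parallelSort(array);
        return new ArrayList<>(Arrays.asList(array));
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(5, 2, 8, 1, 9);  // immutable, so in-place sorting would fail
        System.out.println(sortedCopy(numbers));         // [1, 2, 5, 8, 9]
        System.out.println(parallelSortedCopy(numbers)); // [1, 2, 5, 8, 9]
        System.out.println(numbers);                     // [5, 2, 8, 1, 9]
    }
}
```

For small lists the parallel variant brings no benefit; it is worth measuring before adopting it.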

Find and replace methods of the Collections class

  1. binarySearch(List<? extends Comparable<? super T>> list, T key):

    • Method signature: public static <T> int binarySearch(List<? extends Comparable<? super T>> list, T key)
    • Description: Use the binary search algorithm to find the specified element in a List that is already sorted in ascending order, and return its index. If the element is not present, the result is negative: (-(insertion point) - 1), where the insertion point is the index at which the element would be inserted.
    • Example:
      List<Integer> numbers = new ArrayList<>(List.of(1, 3, 5, 7, 9));
      int index = Collections.binarySearch(numbers, 5);
      System.out.println(index);  // Output: 2
      
  2. binarySearch(List<? extends T> list, T key, Comparator<? super T> c):

    • Method signature: public static <T> int binarySearch(List<? extends T> list, T key, Comparator<? super T> c)
    • Description: Perform a binary search with a custom Comparator in a List that is already sorted according to that same Comparator, and return the index of the element. If the List is not sorted by that Comparator, the result is undefined.
    • Example:
      // The list must already be sorted by the comparator, here by length
      List<String> names = new ArrayList<>(List.of("Bob", "Alice", "David", "Charlie"));
      int index = Collections.binarySearch(names, "Charlie", (a, b) -> a.length() - b.length());
      System.out.println(index);  // Output: 3
      
  3. replaceAll(List<T> list, T oldVal, T newVal):

    • Method signature: public static <T> boolean replaceAll(List<T> list, T oldVal, T newVal)
    • Description: Replace all elements in the List that are equal to oldVal with newVal. Returns true if at least one replacement occurred, false otherwise.
    • Example:
      List<Integer> numbers = new ArrayList<>(List.of(1, 2, 2, 3, 2));
      boolean replaced = Collections.replaceAll(numbers, 2, 7);
      System.out.println(numbers);    // Output: [1, 7, 7, 3, 7]
      System.out.println(replaced);   // Output: true
      

These find and replace methods are convenient for operating on collections. The binary search method is especially useful for sorted collections, and can quickly locate the position of an element. The replace method can be used to replace a specific element in a collection with a new element. It should be noted that the binary search method requires that the collection must be ordered, while the replacement method will directly modify the original collection.
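
When the key is missing, binarySearch() returns a negative value that encodes where the key would be inserted. The sketch below decodes it to keep a list sorted while inserting; the class name SortedInsertDemo and the insertSorted helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedInsertDemo {
    // Inserts value into an already sorted list at the position binarySearch reports.
    static int insertSorted(List<Integer> sorted, int value) {
        int index = Collections.binarySearch(sorted, value);
        if (index < 0) {
            index = -(index + 1); // decode the insertion point from the negative result
        }
        sorted.add(index, value);
        return index;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(List.of(1, 3, 5, 7, 9));
        System.out.println(Collections.binarySearch(numbers, 6)); // -4
        System.out.println(insertSorted(numbers, 6));             // 3
        System.out.println(numbers);                              // [1, 3, 5, 6, 7, 9]
    }
}
```

The search itself is O(log n), but inserting into an ArrayList still shifts the later elements, so the whole operation is O(n).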

7. Collection performance analysis

Time Complexity Analysis of Collections

The time complexity analysis of collections is a performance metric for evaluating collection operations, which describes the growth of computing resources required for operations as the collection size increases.

  1. Add element:

    • ArrayList: Adding an element at the end is amortized O(1); inserting at the beginning or in the middle is O(n), because later elements must be shifted.
    • LinkedList: Adding an element at either end is O(1); inserting in the middle is O(n), because the list must first be traversed to the position.
    • HashSet, LinkedHashSet: Adding an element is O(1) on average, assuming hash collisions are rare.
    • TreeSet: Adding an element is O(log n); the underlying red-black tree rebalances itself to keep the elements sorted.
    • HashMap, LinkedHashMap: Adding a key-value pair is O(1) on average, assuming hash collisions are rare.
  2. Remove element:

    • ArrayList: Deleting an element is O(n) in general, because later elements must be shifted to fill the gap; deleting the last element is O(1).
    • LinkedList: Deleting the first or last element is O(1); deleting by index or by value is O(n), because the node must first be located.
    • HashSet, LinkedHashSet: Deleting an element is O(1) on average, assuming hash collisions are rare.
    • TreeSet: Deleting an element is O(log n); the underlying red-black tree rebalances itself afterwards.
    • HashMap, LinkedHashMap: Deleting a key-value pair is O(1) on average, assuming hash collisions are rare.
  3. Find element:

    • ArrayList: Finding an element by index is O(1); finding an element by value is O(n).
    • LinkedList: Finding an element by index or by value is O(n), because the list must be traversed node by node.
    • HashSet, LinkedHashSet: Finding an element is O(1) on average, assuming hash collisions are rare.
    • TreeSet: Finding an element by value is O(log n).
    • HashMap, LinkedHashMap: Finding a value through its key is O(1) on average, assuming hash collisions are rare.
  4. Sorting:

    • ArrayList: Sorting with the Collections.sort() method is O(n log n).
    • LinkedList: Sorting with the Collections.sort() method is O(n log n).
    • TreeSet: Elements are kept sorted automatically; inserting or looking up an element is O(log n).
    • TreeMap: Keys are kept sorted automatically; inserting or looking up a key-value pair is O(log n).

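Note that the figures above are average-case (or amortized) estimates. In the worst case, for example when many keys collide in a hash table, hash-based operations degrade toward O(n); since Java 8, HashMap mitigates this by converting long collision chains into balanced trees.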

Space Complexity Analysis of Collections

The space complexity analysis of a collection is to evaluate the growth of the memory space occupied by the collection as the number of elements increases.

  1. ArrayList:

    • Space usage depends on the number of elements and on the capacity of the underlying array, which may include unused slots.
    • When the number of elements is within the capacity of the underlying array, no reallocation is needed; the space complexity is O(n), where n is the number of elements.
    • When the number of elements exceeds the capacity, a new array is allocated and the elements are copied over. The new array is about 1.5 times the length of the old one, so the space complexity remains O(n).
  2. LinkedList:

    • The space complexity depends on the number of elements, and each node needs to store additional references to the previous and next nodes.
    • Each node consumes additional space, so the space complexity is O(n), where n is the number of elements.
  3. HashSet, LinkedHashSet:

    • The space complexity depends on the number of elements and the capacity of the hash table.
    • The capacity of the hash table grows automatically: with the default load factor of 0.75, the table doubles in size once the number of elements exceeds 75% of its capacity.
    • The space complexity is O(n), where n is the number of elements; LinkedHashSet additionally stores two links per element to maintain insertion order.
  4. TreeSet:

    • Space complexity depends on the number of elements.
    • TreeSet is internally implemented using a red-black tree, and each node needs to additionally store references to the parent node, left child node, and right child node.
    • Each node consumes additional space, so the space complexity is O(n), where n is the number of elements.
  5. HashMap, LinkedHashMap:

    • The space complexity depends on the number of key-value pairs and the capacity of the hash table.
    • The capacity of the hash table grows automatically: with the default load factor of 0.75, the table doubles in size once the number of key-value pairs exceeds 75% of its capacity.
    • The space complexity is O(n), where n is the number of key-value pairs; LinkedHashMap additionally stores two links per entry to maintain its iteration order.

It should be noted that the above space complexity analysis is performed without considering the fixed overhead of the collection itself. For example, both ArrayList and LinkedList have additional pointer or index overhead, and data structures such as hash tables and red-black trees also require some space to store structural information.
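
The role of capacity described in this section can be sketched in code. The class name CapacityDemo and both helpers are hypothetical, and the formula used for the HashMap capacity is one common way to stay under the default 0.75 load factor, not the only one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CapacityDemo {
    // Builds a list of n squares, reserving the backing array once up front.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>(n); // initial capacity n: no regrowth while filling
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    // Initial capacity chosen so that n entries stay under the default 0.75 load factor.
    static Map<Integer, Integer> squareMap(int n) {
        Map<Integer, Integer> result = new HashMap<>((int) (n / 0.75f) + 1);
        for (int i = 0; i < n; i++) {
            result.put(i, i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(squares(1000));
        list.subList(10, list.size()).clear(); // keep only the first ten elements
        list.trimToSize();                     // release the now unused part of the array
        System.out.println(list.size());             // 10
        System.out.println(squareMap(1000).get(30)); // 900
    }
}
```

Presizing trades a little up-front memory for fewer reallocations and rehashes; trimToSize() goes the other way and is mainly useful for long-lived lists that have shrunk considerably.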

Collection Performance Optimization Methods

  1. Select the appropriate collection type: Select the most appropriate collection type according to specific needs, considering factors such as collection characteristics, operation frequency, and data volume. Different collection types have different performance characteristics for different operations; for example, ArrayList is suitable for frequent random access, while LinkedList is suitable for frequent insertion and deletion at the ends of the list or through an iterator.

  2. Initialize collection capacity: For collections that need to be expanded, such as ArrayList, HashSet, and HashMap, you can specify their initial capacity when creating the collection. Estimate the number of elements according to actual needs, and set a reasonable initial capacity to avoid frequent internal array or hash table expansion operations, thereby improving performance.

  3. Use iterators to traverse collections: When traversing a collection, prefer an iterator (or the enhanced for loop, which uses one) over index-based access. This matters most for sequential-access collections such as LinkedList, where each get(i) call is O(n) and an index loop therefore costs O(n²) overall.

  4. Note the equals() and hashCode() methods of collections: When using custom objects as collection elements, make sure to implement the equals() and hashCode() methods correctly. In this way, the performance advantages of the hash table can be maintained in operations such as search and deletion of the collection, and hash collisions and inefficient linear searches can be avoided.

  5. Use the methods provided by the collection API: the collection framework provides many efficient operation methods, such as addAll(), removeAll(), retainAll(), etc. According to specific needs, try to use these methods instead of manual iterative operations, which can improve the simplicity and performance of the code.

  6. Consider concurrency safety: If you use collections in a multi-threaded environment, you need to consider concurrency safety. You can choose to use thread-safe collection classes, such as ConcurrentHashMap, CopyOnWriteArrayList, etc., or ensure thread safety through an appropriate synchronization mechanism.

  7. Avoid frequent collection operations: Frequent collection operations may cause performance degradation, especially in the case of large-scale data. If possible, try to batch operations or one-time operation collections to reduce the number of individual operations, thereby improving performance.

  8. Avoid unnecessary boxing and unboxing: Generic collections store objects, so primitive values are boxed on insertion and unboxed on access; avoid doing this in hot code paths. The JDK itself has no primitive collection classes, so use primitive arrays where possible, or the primitive collection types offered by third-party libraries, such as IntList and LongList in Eclipse Collections or fastutil; Google Guava offers utilities such as ImmutableIntArray.
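
Three of the tips above (consistent equals() and hashCode(), iterator-based traversal, and bulk API methods) can be sketched together. The class name OptimizationTipsDemo, the Point type, and the sum helper are hypothetical and exist only for this illustration.

```java
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class OptimizationTipsDemo {
    // Hypothetical element type with consistent equals() and hashCode().
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // equal points always produce equal hash codes
        }
    }

    // Sums a list with the enhanced for loop, which uses the list's iterator.
    static long sum(List<Integer> values) {
        long total = 0;
        for (int v : values) { // O(n) overall, even for LinkedList
            total += v;        // values.get(i) in an index loop would be O(n) per call on LinkedList
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        System.out.println(points.contains(new Point(1, 2))); // true

        List<Integer> values = new LinkedList<>(List.of(1, 2, 3, 4, 5, 6));
        values.removeIf(v -> v % 2 == 0); // bulk removal instead of a hand-written loop
        System.out.println(values);       // [1, 3, 5]
        System.out.println(sum(values));  // 9
    }
}
```

Without the equals() and hashCode() overrides, the contains() call would return false, because two distinct Point objects would be compared by identity.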

In general, the performance optimization of collections needs to be combined with specific scenarios and requirements. The efficiency and response speed of collection operations can be improved by selecting appropriate collection types, optimizing operation methods, and considering factors such as concurrency safety. If you encounter specific performance problems, you can also use performance analysis tools to locate and optimize them.

Origin blog.csdn.net/u012581020/article/details/131643127