1. Overview
The `Set` interface has a very simple definition. It is essentially a `Collection` that cannot contain duplicate elements. In other words, if you try to add an element that already exists in the `Set`, the `add` method returns `false` and the `Set` itself is left unchanged.
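
The return value of `add` described above can be checked directly. The following is a minimal sketch; the class name `DuplicateAddDemo` and the helper `addTwice` are made up for this illustration and are not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateAddDemo {
    /** Adds the same value twice and returns both results of add(). */
    static boolean[] addTwice(Set<String> set, String value) {
        boolean first = set.add(value);   // value is not present yet
        boolean second = set.add(value);  // value is already present
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        Set<String> languages = new HashSet<>();
        boolean[] results = addTwice(languages, "Java");

        System.out.println(results[0]);       // true
        System.out.println(results[1]);       // false
        System.out.println(languages.size()); // 1
    }
}
```

The second call leaves the set untouched, which is why the size stays at 1.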
Java provides several main implementations of the `Set` interface:
- `HashSet`: a `Set` implementation based on a hash table. It does not guarantee the iteration order of the collection; in particular, it does not guarantee that the order stays constant over time.
- `LinkedHashSet`: a `HashSet` implemented with a hash table and a linked list, with predictable iteration order.
- `TreeSet`: a tree-based (red-black tree) `Set` implementation, sorted according to the natural order of its elements or according to the comparator provided when the set is created.
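
To make the difference in iteration order concrete, the sketch below fills each of the three implementations with the same elements. The class name `IterationOrderDemo` and the helper `fill` are hypothetical names chosen for this example. No expected output is given for `HashSet`, because its order is unspecified.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class IterationOrderDemo {
    /** Adds the same four strings, in the same order, to the given set. */
    static Set<String> fill(Set<String> set) {
        Collections.addAll(set, "Python", "Java", "C", "Go");
        return set;
    }

    public static void main(String[] args) {
        // Order is unspecified and may differ between JDK versions
        System.out.println(fill(new HashSet<>()));
        // Insertion order
        System.out.println(fill(new LinkedHashSet<>())); // [Python, Java, C, Go]
        // Natural (lexicographic) order of String
        System.out.println(fill(new TreeSet<>()));       // [C, Go, Java, Python]
    }
}
```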
2. HashSet
Java has several implementations of the `Set` interface, but the most commonly used is undoubtedly `HashSet`. We often use it by default.
`HashSet` is a basic implementation of the `Set` interface and is widely used in all kinds of programs.
Its class diagram is as follows:
Its source code shows that `HashSet` is actually backed by a `HashMap`, which is itself an implementation of a hash table. This design gives `HashSet` excellent access and lookup performance.
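
The idea of a set that delegates to a map can be sketched in a few lines. The class `MapBackedSet` below is a hypothetical, simplified illustration, not the JDK source: elements are stored as map keys, and every key maps to the same placeholder value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified class for illustration only; not the JDK implementation.
class MapBackedSet<E> {
    // Shared placeholder value: only the keys of the map matter.
    private static final Object PRESENT = new Object();
    private final Map<E, Object> map = new HashMap<>();

    /** Returns true if the element was not already present. */
    boolean add(E e) {
        // put() returns the previous value, or null if the key was absent
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("Java"));      // true
        System.out.println(set.add("Java"));      // false
        System.out.println(set.contains("Java")); // true
        System.out.println(set.size());           // 1
    }
}
```

Because the map already rejects duplicate keys and locates keys by hash, the set gets both properties without extra work.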
A hash table is a data structure that provides fast element insertion and lookup. In a `HashSet`, the position of an element in the hash table is determined from its hash code. This means that no matter how many elements the `HashSet` contains, the time needed to determine whether an element exists is roughly constant, O(1) on average, provided the hash codes are reasonably well distributed. This is the main source of the efficiency of `HashSet`.
`HashSet` considers two elements equal when the `hashCode()` methods of the two objects return equal values and the `equals()` method of the two objects also reports them as equal. The `hashCode()` method is used to determine the position of the element in the hash table, and the `equals()` method is used to compare the actual values of the elements in the event of a hash collision. This means that if you are going to store your own objects in a `HashSet`, you should override both methods so that they behave as `HashSet` requires.
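
To see why both methods matter, consider what happens when neither is overridden. In the sketch below, `PlainPoint` is a hypothetical class that inherits `equals()` and `hashCode()` from `Object`, so two instances with the same field values are still treated as different elements.

```java
import java.util.HashSet;
import java.util.Set;

class MissingOverrideDemo {
    // Hypothetical class that does NOT override equals() or hashCode().
    static class PlainPoint {
        final int x;
        final int y;

        PlainPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Adds two points with identical field values and returns the set size. */
    static int distinctPoints() {
        Set<PlainPoint> points = new HashSet<>();
        points.add(new PlainPoint(1, 2));
        points.add(new PlainPoint(1, 2)); // same values, different object
        return points.size();
    }

    public static void main(String[] args) {
        // Object.equals() compares references, so both points are kept
        System.out.println(distinctPoints()); // 2
    }
}
```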
Here's a simple example of overriding the `hashCode()` and `equals()` methods:

    import java.util.Objects;

    public class MyDate {
        private int year;
        private int month;
        private int day;

        public MyDate(int year, int month, int day) {
            this.year = year;
            this.month = month;
            this.day = day;
        }

        @Override
        public boolean equals(Object o) {
            System.out.println("equals() called");
            // Same reference: the objects are equal
            if (this == o) return true;
            // Null argument or different type: the objects are not equal
            if (!(o instanceof MyDate)) return false;
            // Cast to the current type
            MyDate myDate = (MyDate) o;
            // Compare primitives with ==; reference-type fields would use equals() (none here)
            return year == myDate.year && month == myDate.month && day == myDate.day;
        }

        @Override
        public int hashCode() {
            System.out.println("hashCode() called");
            // Objects.hash returns an int computed from the fields, used as the hash value
            return Objects.hash(year, month, day);
        }

        @Override
        public String toString() {
            return "MyDate{" + "year=" + year + ", month=" + month + ", day=" + day + '}';
        }

        // Getters and setters omitted
    }

Here's a basic test:

    import java.util.HashSet;

    public class TestHashSet {
        public static void main(String[] args) {
            // Create a HashSet of strings
            HashSet<String> set = new HashSet<>();
            // Add elements
            set.add("Java");
            set.add("Java"); // duplicate element
            set.add("Python");
            set.add("C");
            // Print the set (order is not guaranteed)
            System.out.println(set);

            // Create a HashSet of MyDate objects
            HashSet<MyDate> set1 = new HashSet<>();
            // Add elements
            set1.add(new MyDate(2020, 1, 1));
            set1.add(new MyDate(2020, 1, 1)); // duplicate element
            set1.add(new MyDate(2020, 1, 2));
            // Print the set (order is not guaranteed)
            System.out.println(set1);
        }
    }

Output analysis (only the lines produced for the `MyDate` set are shown; the line printed for the `String` set is omitted):

    hashCode() called
    hashCode() called
    equals() called
    hashCode() called
    [MyDate{year=2020, month=1, day=2}, MyDate{year=2020, month=1, day=1}]

From the results above, we can see that when an add operation is performed, the `hashCode()` method is called automatically to compute a hash value for the element, which determines its storage location in the hash table. When a duplicate element is added, `hashCode()` is also called first to compute the hash value. This time the hash value is found to already exist, so the `equals()` method is called automatically for a further comparison. If the two are determined to be the same element, the add operation is not performed.
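
The opposite case, where two different elements share a hash value, can be illustrated with strings. The sketch below relies on the documented `String.hashCode()` formula, under which `"Aa"` and `"BB"` both hash to 2112. The class name `HashCollisionDemo` and the helper `distinctCount` are hypothetical names used for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    /** Returns how many distinct elements a HashSet keeps for the given values. */
    static int distinctCount(String... values) {
        Set<String> set = new HashSet<>(Arrays.asList(values));
        return set.size();
    }

    public static void main(String[] args) {
        // Both strings have the same hash code: 'A'*31 + 'a' == 'B'*31 + 'B'
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112

        // equals() tells them apart, so both elements are stored
        System.out.println(distinctCount("Aa", "BB")); // 2
    }
}
```

Equal hash values alone are therefore not enough for an element to be rejected as a duplicate; `equals()` has the final say.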
By using `HashMap` as its internal structure, `HashSet` takes advantage of the performance of the hash table. It also applies a strict equality check based on both `hashCode()` and `equals()`. This makes `HashSet` an efficient and reliable option among the Java collections, in terms of both performance and semantics, and a natural default choice for implementing the `Set` interface.
3. LinkedHashSet
In Java's collections framework, `LinkedHashSet` is a special `Set` implementation that inherits from `HashSet` and provides some additional features.
`LinkedHashSet` is a subclass of `HashSet`, and its class diagram is as follows:
`HashSet` is implemented on top of a hash table, which provides excellent element insertion and lookup performance. However, it does not preserve the insertion order of elements, which can be a disadvantage in some scenarios. That is why `LinkedHashSet` was introduced. Building on `HashSet`, it is backed by a `LinkedHashMap`, whose entries carry two additional pointer fields, `before` and `after`. These fields link the element nodes together and thus record the order in which elements were added.
Therefore, `LinkedHashSet` is actually a combination of a linked list and a hash table. The linked list maintains the insertion order of elements, while the hash table ensures fast element insertion and lookup. With this structure, `LinkedHashSet` not only inherits the high performance of `HashSet`, but also provides a predictable iteration order.
In terms of insertion performance, because `LinkedHashSet` has to maintain an additional linked list, it is slightly slower than `HashSet`. However, this cost is usually acceptable, especially in scenarios where insertion order needs to be preserved.
In terms of iteration performance, `LinkedHashSet` performs very well. Because it maintains a linked list in insertion order, traversing all elements of the `Set` simply follows that list, so the cost of iteration is proportional to the number of elements rather than to the capacity of the hash table. This makes `LinkedHashSet` a very good choice for applications that iterate frequently.
Here's a simple use case:

    import java.util.LinkedHashSet;

    public class LinkedHashSetTest {
        public static void main(String[] args) {
            LinkedHashSet<String> set = new LinkedHashSet<>();
            // Add elements
            set.add("Java");
            set.add("Java"); // duplicate element
            set.add("Python");
            set.add("C");
            // Print the set (insertion order is preserved)
            System.out.println(set); // [Java, Python, C]
        }
    }

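
One detail of insertion order is worth a short sketch: re-adding an element that is already present does not change its position, whereas removing it and adding it again moves it to the end. The class name `ReinsertionOrderDemo` and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class ReinsertionOrderDemo {
    /** Re-adds an element that is already in the set. */
    static Set<String> reAdd() {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("Java", "Python", "C"));
        set.add("Java"); // already present: position is unchanged
        return set;
    }

    /** Removes an element and then adds it again. */
    static Set<String> removeThenAdd() {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("Java", "Python", "C"));
        set.remove("Java");
        set.add("Java"); // treated as a new insertion: goes to the end
        return set;
    }

    public static void main(String[] args) {
        System.out.println(reAdd());         // [Java, Python, C]
        System.out.println(removeThenAdd()); // [Python, C, Java]
    }
}
```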
4. TreeSet
`TreeSet` is an important member of the Java collections framework. It provides a collection whose core features are sorting and deduplication.
Under the hood, `TreeSet` is implemented on top of `TreeMap`, whose underlying data structure is a red-black tree, a self-balancing binary search tree. Thanks to the properties of the red-black tree, elements of a `TreeSet` can be inserted, deleted, and searched efficiently while the ordering of the elements is maintained.
Its class diagram is as follows:
The two core features of `TreeSet` are deduplication and sorting of elements.
The deduplication logic depends mainly on how elements are compared. `TreeSet` supports two comparison approaches: natural ordering and custom ordering.
- For natural ordering, `TreeSet` requires its elements to implement the `Comparable` interface and override the `compareTo` method. When a new element is added to the `TreeSet`, the element's `compareTo` method is called to compare it with existing elements. If the return value is 0, the two elements are considered equal and the new element is not added to the `TreeSet`.
- For custom ordering, you specify an object that implements the `Comparator` interface when creating the `TreeSet`. When a new element is added, the `compare` method of that `Comparator` is called to compare elements. Likewise, if `compare` returns 0, the new element is not added to the `TreeSet`.
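
The rule that a comparison result of 0 means "duplicate" can be shown with the JDK's built-in `String.CASE_INSENSITIVE_ORDER` comparator. This is a minimal sketch; the class name `ComparatorDedupDemo` and the helper `caseInsensitiveSet` are hypothetical names used for illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

class ComparatorDedupDemo {
    /** Builds a TreeSet that compares strings without regard to case. */
    static Set<String> caseInsensitiveSet(String... values) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        Collections.addAll(set, values);
        return set;
    }

    public static void main(String[] args) {
        // "Java".equals("JAVA") is false, but compare("Java", "JAVA") returns 0,
        // so the TreeSet treats the second string as a duplicate.
        System.out.println(caseInsensitiveSet("Java", "JAVA", "Go")); // [Go, Java]
    }
}
```

Note that `equals()` plays no role here: the comparator alone decides which elements count as the same.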
For sorting, `TreeSet` supports the same two approaches, natural ordering and custom ordering:
- Natural ordering: `TreeSet` requires its elements to implement the `Comparable` interface and override the `compareTo` method. The return value of `compareTo` determines the sort order of the elements.
- Custom ordering: when creating a `TreeSet` object, a `Comparator` object can be passed in through the constructor. The `compare` method of the `Comparator` interface is then used to sort the elements.
Here is a simple practical example:

    import java.util.TreeSet;

    public class TreeSetTest {
        public static void main(String[] args) {
            /*
             * Natural ordering is used by default; the compareTo method of the
             * Comparable interface is called for comparison.
             * 1. Strings: compared by Unicode code value
             * 2. Custom types: must implement Comparable and override compareTo
             * 3. Integers: compared by numeric value
             * 4. Floating-point numbers: compared by numeric value
             * 5. Booleans: false < true
             */
            TreeSet<String> set = new TreeSet<>();
            // Add elements
            set.add("Java");
            set.add("Java"); // duplicate element
            set.add("Python");
            set.add("C");
            set.add("C++");
            set.add("Go");
            set.add("C#");
            // Print the set
            System.out.println(set); // [C, C#, C++, Go, Java, Python]
        }
    }


    import java.util.TreeSet;

    public class TreeSetTest02 {
        public static void main(String[] args) {
            // For custom ordering, pass a Comparator implementation when creating the
            // TreeSet; its compare method defines the ordering rule.
            TreeSet<String> set = new TreeSet<>((o1, o2) -> {
                // Compare by string length; strings of equal length would count as duplicates
                return Integer.compare(o1.length(), o2.length());
            });
            // Add elements
            set.add("Java");
            set.add("Python");
            set.add("C");
            set.add("C++");
            set.add("Go");
            // Print the set
            System.out.println(set); // [C, Go, C++, Java, Python]
        }
    }

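
The examples above only store strings. For a user-defined element type under natural ordering, a sketch might look like the following. The `Version` class is hypothetical and introduced only for this example; a production class would normally also override `equals()` and `hashCode()` consistently with `compareTo`.

```java
import java.util.TreeSet;

class ComparableElementDemo {
    // Hypothetical element type, introduced only for this example.
    static class Version implements Comparable<Version> {
        final int major;
        final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public int compareTo(Version other) {
            // Order by major number first, then by minor number
            int byMajor = Integer.compare(major, other.major);
            return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
        }

        @Override
        public String toString() {
            return major + "." + minor;
        }
    }

    /** Builds a sample set that includes one duplicate version. */
    static TreeSet<Version> sample() {
        TreeSet<Version> versions = new TreeSet<>();
        versions.add(new Version(1, 10));
        versions.add(new Version(1, 2));
        versions.add(new Version(0, 9));
        versions.add(new Version(1, 2)); // compareTo returns 0: not added
        return versions;
    }

    public static void main(String[] args) {
        System.out.println(sample()); // [0.9, 1.2, 1.10]
    }
}
```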
5. Choosing the appropriate Set implementation
Which `Set` implementation to choose depends mainly on your specific needs:
- If you just need a collection with no duplicate elements and don't care about the order of the elements, then `HashSet` is a good choice. It provides constant-time basic operations (add, remove, and contains).
- If you care about the insertion order of elements, then `LinkedHashSet` is a better choice. Building on `HashSet`, it maintains the insertion order of elements using a linked list.
- If you need a sorted collection, then `TreeSet` is the best choice. It uses a red-black tree to store elements, providing an ordered view of the collection.
6. Summary
Here is a simple summary table:
characteristic | HashSet | LinkedHashSet | TreeSet |
---|---|---|---|
order of elements | unordered | insertion order | sorted |
allows null | yes | yes | no with natural ordering (only if the Comparator permits it) |
performance | high, O(1) on average | slightly lower than HashSet, O(1) on average | lower, O(log n) |
based on | HashMap | LinkedHashMap | TreeMap |
special feature | none | records insertion order | keeps elements sorted |
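
Null handling is easy to verify with a short experiment. The sketch below assumes default constructors, which means natural ordering for `TreeSet`. The class name `NullElementDemo` and the helper `acceptsNull` are hypothetical names used for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class NullElementDemo {
    /** Returns true if the set accepts a null element, false if it throws. */
    static boolean acceptsNull(Set<String> set) {
        try {
            set.add(null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNull(new HashSet<>()));       // true
        System.out.println(acceptsNull(new LinkedHashSet<>())); // true
        // With natural ordering, null cannot be compared to other elements
        System.out.println(acceptsNull(new TreeSet<>()));       // false
    }
}
```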