Quote of the day: A truly happy man is one who can enjoy what he creates. Those who, like sponges, only take and never give will only lose their happiness. - "38 Letters from Rockefeller to His Son"
1. Basic idea
The ordered symbol table is backed by a pair of parallel arrays, one holding the keys and one holding the values. put() keeps the keys array sorted, so array indices can be used with binary search to implement get() and the other operations efficiently.
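To make the parallel-array idea concrete, here is a minimal sketch. It is not the implementation from this post: the class name `ParallelArraysSketch` is hypothetical, keys are plain `int` and values are `String` for brevity, and the standard `java.util.Arrays.binarySearch` stands in for a hand-written search.

```java
import java.util.Arrays;

// Hypothetical illustration: fixed-capacity table with int keys and String values.
final class ParallelArraysSketch {
    private final int[] keys;       // kept in ascending order
    private final String[] values;  // values[i] belongs to keys[i]
    private int n;                  // number of pairs stored

    ParallelArraysSketch(int capacity) {
        keys = new int[capacity];
        values = new String[capacity];
    }

    void put(int key, String value) {
        // Arrays.binarySearch returns the index on a hit, or -(insertionPoint) - 1 on a miss.
        int i = Arrays.binarySearch(keys, 0, n, key);
        if (i >= 0) {               // key already present: overwrite its value
            values[i] = value;
            return;
        }
        if (n == keys.length) throw new IllegalStateException("table is full");
        int pos = -i - 1;           // position that keeps the keys sorted
        // Shift both arrays one slot to the right, in lockstep.
        System.arraycopy(keys, pos, keys, pos + 1, n - pos);
        System.arraycopy(values, pos, values, pos + 1, n - pos);
        keys[pos] = key;
        values[pos] = value;
        n++;
    }

    String get(int key) {
        int i = Arrays.binarySearch(keys, 0, n, key);
        return i >= 0 ? values[i] : null;
    }

    int size() { return n; }

    public static void main(String[] args) {
        ParallelArraysSketch st = new ParallelArraysSketch(8);
        st.put(30, "c");
        st.put(10, "a");
        st.put(20, "b");
        st.put(20, "B");            // same key: value replaced, size unchanged
        System.out.println(st.size() + " " + st.get(20) + " " + st.get(99)); // prints "3 B null"
    }
}
```

The point of the sketch is that the two arrays always move together: whatever index the search finds in the keys array is also the index of the matching value.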
2. API for ordered symbol table
Return type | Method | Description |
---|---|---|
void | put(Key key, Value value) | Store a key-value pair in the table |
Value | get(Key key) | The value associated with key (null if key is absent) |
boolean | contains(Key key) | Whether key exists in the table |
boolean | isEmpty() | Whether the table is empty |
int | size() | The number of key-value pairs in the table |
Key | min() | The smallest key |
Key | max() | The largest key |
Key | floor(Key key) | The largest key less than or equal to key |
Key | ceiling(Key key) | The smallest key greater than or equal to key |
int | rank(Key key) | The number of keys smaller than key |
Key | select(int k) | The key of rank k |
void | deleteMin() | Delete the smallest key |
void | deleteMax() | Delete the largest key |
int | size(Key lo, Key hi) | The number of keys in [lo..hi] |
Iterable | keys(Key lo, Key hi) | All keys in [lo..hi], sorted |
Iterable | keys() | All keys in the table, sorted |
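The ordered operations are easier to grasp with concrete values. The sketch below uses the standard `java.util.TreeMap` purely as a stand-in: it is a red-black tree, not the array-based table of this post, but its ordered queries follow the same definitions. The class `OrderedApiSketch` and its `rank` and `size` helpers are hypothetical names added for illustration.

```java
import java.util.TreeMap;

// Stand-in only: TreeMap is tree-based, but its ordered queries match the API table above.
final class OrderedApiSketch {
    // rank(key): number of keys strictly smaller than key.
    static <K, V> int rank(TreeMap<K, V> st, K key) {
        return st.headMap(key, false).size();
    }

    // size(lo, hi): number of keys in the closed interval [lo..hi].
    static <K extends Comparable<K>, V> int size(TreeMap<K, V> st, K lo, K hi) {
        if (hi.compareTo(lo) < 0) return 0;   // empty interval; subMap would throw here
        return st.subMap(lo, true, hi, true).size();
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> st = new TreeMap<>();
        for (int k : new int[] {10, 20, 30, 40}) st.put(k, "v" + k);

        System.out.println(st.firstKey());      // min()       -> 10
        System.out.println(st.lastKey());       // max()       -> 40
        System.out.println(st.floorKey(25));    // floor(25)   -> 20
        System.out.println(st.ceilingKey(25));  // ceiling(25) -> 30
        System.out.println(st.floorKey(5));     // floor(5)    -> null
        System.out.println(rank(st, 25));       // rank(25)    -> 2
        System.out.println(rank(st, 30));       // rank(30)    -> 2
        System.out.println(size(st, 15, 30));   // size(15,30) -> 2
    }
}
```

Note that rank(25) and rank(30) are both 2: rank counts the keys strictly smaller than its argument, whether or not the argument is in the table.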
3. Code implementation
package symboltable;

import edu.princeton.cs.algs4.Queue;

public class BinarySearchST<Key extends Comparable<Key>, Value> {
    private Key[] keys;     // two parallel arrays hold the keys and the values
    private Value[] values;
    private int N;          // number of key-value pairs in the table

    @SuppressWarnings("unchecked")
    public BinarySearchST(int capacity) {
        // Fixed capacity: this version never resizes, so put() fails once the arrays are full
        keys = (Key[]) new Comparable[capacity];
        values = (Value[]) new Object[capacity];
    }

    public void put(Key key, Value value) {
        int i = rank(key);
        if (i < N && keys[i].compareTo(key) == 0) {
            values[i] = value;  // key already present: update its value and stop
            return;
        }
        for (int j = N; j > i; j--) {  // shift all larger entries one slot to the right
            keys[j] = keys[j - 1];
            values[j] = values[j - 1];
        }
        keys[i] = key;
        values[i] = value;
        N++;
    }

    public Value get(Key key) {
        if (isEmpty())
            return null;
        int i = rank(key);  // number of keys smaller than key
        if (i < N && keys[i].compareTo(key) == 0)
            return values[i];
        else
            return null;
    }

    public Key delete(Key key) {
        int i = rank(key);
        if (i < N && keys[i].compareTo(key) == 0) {  // found: shift later entries one slot to the left
            Key deleted = keys[i];
            for (int j = i; j < N - 1; j++) {
                keys[j] = keys[j + 1];
                values[j] = values[j + 1];
            }
            N--;
            keys[N] = null;    // clear the vacated slot
            values[N] = null;
            return deleted;
        }
        return null;
    }

    public boolean contains(Key key) {
        int i = rank(key);
        return i < N && keys[i].compareTo(key) == 0;
    }

    public boolean isEmpty() {
        return N == 0;
    }

    public int size() {
        return N;
    }

    public Key min() {
        return isEmpty() ? null : keys[0];
    }

    public Key max() {
        return isEmpty() ? null : keys[N - 1];
    }

    public Key floor(Key key) {
        int i = rank(key);
        if (i < N && keys[i].compareTo(key) == 0)
            return keys[i];      // key itself is in the table
        if (i == 0)
            return null;         // every key in the table is larger than key
        return keys[i - 1];      // largest key strictly smaller than key
    }

    public Key ceiling(Key key) {
        int i = rank(key);
        if (i == N)
            return null;         // every key in the table is smaller than key
        return keys[i];
    }

    public int rank(Key key) {
        int lo = 0, hi = N - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = key.compareTo(keys[mid]);
            if (cmp < 0)
                hi = mid - 1;
            else if (cmp > 0)
                lo = mid + 1;
            else
                return mid;  // hit: the key's position equals the number of keys smaller than it
        }
        return lo;  // miss: lo is the number of keys smaller than key
    }

    public Key select(int k) {
        if (k < 0 || k >= N)
            return null;
        return keys[k];
    }

    public void deleteMin() {
        if (!isEmpty())
            delete(min());
    }

    public void deleteMax() {
        if (!isEmpty())
            delete(max());
    }

    public int size(Key lo, Key hi) {
        if (hi.compareTo(lo) < 0)
            return 0;
        else if (contains(hi))
            return rank(hi) - rank(lo) + 1;
        else
            return rank(hi) - rank(lo);
    }

    public Iterable<Key> keys(Key lo, Key hi) {
        Queue<Key> queue = new Queue<Key>();
        for (int i = rank(lo); i < rank(hi); i++) {  // enqueue the keys in [lo, hi)
            queue.enqueue(keys[i]);
        }
        if (contains(hi))  // hi itself is in the table
            queue.enqueue(keys[rank(hi)]);
        return queue;
    }

    public Iterable<Key> keys() {
        if (isEmpty())
            return new Queue<Key>();
        return keys(min(), max());
    }

    public static void main(String[] args) {
        BinarySearchST<Integer, String> binarySearchST = new BinarySearchST<>(10);
        for (int i = 0; i < 5; i++) {
            binarySearchST.put(i, "Timber" + i);
        }
        System.out.println("size = " + binarySearchST.size());
        for (int k : binarySearchST.keys()) {
            System.out.println("key: " + k + ", value: " + binarySearchST.get(k));
        }
        binarySearchST.delete(3);
        System.out.println("After deletion:");
        for (int k : binarySearchST.keys()) {
            System.out.println("key: " + k + ", value: " + binarySearchST.get(k));
        }
        System.out.println("Largest key <= 3: " + binarySearchST.floor(3));
        System.out.println("Smallest key >= 3: " + binarySearchST.ceiling(3));
    }
}
3. Results display
4. rank() method analysis
At the heart of this implementation is the rank() method, which returns the number of keys in the table that are smaller than a given key. It compares the key with the middle key of the current range: if they are equal it returns that index, if the key is smaller it continues in the left half, and if it is larger it continues in the right half.
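public int rank(Key key) {
    int lo = 0, hi = N - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = key.compareTo(keys[mid]);
        if (cmp < 0) {
            hi = mid - 1;       // continue in the left half
        } else if (cmp > 0) {
            lo = mid + 1;       // continue in the right half
        } else {
            return mid;         // hit: mid keys are smaller than key
        }
    }
    return lo;                  // miss: lo keys are smaller than key
}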
Properties of the non-recursive version of binary search:
- If the key exists in the table, rank() returns the position of the key, that is, the number of keys in the table that are smaller than this key;
- If the key does not exist in the table, rank() still returns the number of keys in the table that are smaller than it: when the loop ends, lo is exactly that number.
The trace of rank() using binary search in an ordered array is shown in the figure below. (Image from Algorithms, 4th Edition)
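As a text-only way to follow such a trace, here is a minimal sketch that prints lo, mid and hi at every step. The class `RankTrace` is a hypothetical helper; it uses the same loop as rank() but works on an `int[]` for brevity, and the sample keys are made up for the demonstration.

```java
// Hypothetical tracing helper: same loop as rank(), on an int[] for brevity.
final class RankTrace {
    static int rank(int[] keys, int n, int key) {
        int lo = 0, hi = n - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            System.out.println("lo=" + lo + " mid=" + mid + " hi=" + hi
                    + " keys[mid]=" + keys[mid]);
            if (key < keys[mid]) hi = mid - 1;       // continue in the left half
            else if (key > keys[mid]) lo = mid + 1;  // continue in the right half
            else return mid;                         // hit
        }
        return lo;                                   // miss: lo keys are smaller
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30, 40, 50, 60, 70};
        System.out.println("rank(50) = " + rank(keys, keys.length, 50)); // hit, rank 4
        System.out.println("rank(35) = " + rank(keys, keys.length, 35)); // miss, rank 3
    }
}
```

For the miss on 35 the loop probes 40, 20 and 30 and then stops with lo = 3, which is the number of keys (10, 20, 30) smaller than 35.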
5. Performance Analysis
A binary search in an ordered array of N keys needs at most lg N + 1 comparisons, whether the search succeeds or not. Inserting a new key into an ordered array of N keys takes ~2N array accesses in the worst case, so inserting N keys into an initially empty symbol table takes ~N² array accesses in the worst case.
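Both bounds can be derived in a line or two. The sketch below rests on two assumptions that are not spelled out in the text: C(N) denotes the worst-case number of comparisons for a table of N keys, and every entry that is shifted during an insert costs about two array accesses (one read, one write), with the worst case being keys inserted in descending order so that the k-th insert shifts all k existing entries.

```latex
\begin{align*}
C(N) &\le C(\lfloor N/2 \rfloor) + 1, \qquad C(0) = 0 \\
     &\Longrightarrow\; C(N) \le \lfloor \lg N \rfloor + 1 \le \lg N + 1 \quad (N \ge 1) \\[4pt]
\sum_{k=0}^{N-1} 2k &= N(N-1) \sim N^{2}
\end{align*}
```

The first line holds because each probe leaves at most half of the current range to search; the last line adds up the cost of N worst-case inserts.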
The cost of each method is as follows:
Method | Order of growth of the running time |
---|---|
put() | N |
get() | lg N |
delete() | N |
contains() | lg N |
size() | 1 |
min() | 1 |
max() | 1 |
floor() | lg N |
ceiling() | lg N |
rank() | lg N |
select() | 1 |
deleteMin() | N |
deleteMax() | 1 |
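The difference between the N and 1 entries for inserts and deletes comes down to how many array entries have to move. The following sketch counts those moves directly; `ShiftCount`, `insertAt` and `deleteAt` are hypothetical helpers that reuse the loop bounds of put() and delete() on a bare `int[]`.

```java
// Hypothetical helper: counts shifted entries; not a full symbol table.
final class ShiftCount {
    // Opens slot i in a sorted array holding n keys and returns how many entries moved.
    static int insertAt(int[] keys, int n, int i, int key) {
        int moved = 0;
        for (int j = n; j > i; j--) {      // same bounds as the loop in put()
            keys[j] = keys[j - 1];
            moved++;
        }
        keys[i] = key;
        return moved;
    }

    // Closes slot i in a sorted array holding n keys and returns how many entries moved.
    static int deleteAt(int[] keys, int n, int i) {
        int moved = 0;
        for (int j = i; j < n - 1; j++) {  // same bounds as the loop in delete()
            keys[j] = keys[j + 1];
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] keys = new int[n + 1];       // one spare slot for the insert
        for (int i = 0; i < n; i++) keys[i] = 2 * (i + 1);

        System.out.println("put new smallest key: " + insertAt(keys, n, 0, 1) + " shifts");
        n++;
        System.out.println("deleteMin: " + deleteAt(keys, n, 0) + " shifts");
        n--;
        System.out.println("deleteMax: " + deleteAt(keys, n, n - 1) + " shifts");
        n--;
        System.out.println("put new largest key: " + insertAt(keys, n, n, 5000) + " shifts");
    }
}
```

With 1000 keys, the operations at the front of the array move about 1000 entries each, while the operations at the end move none.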
6. Comparison of sequential search and binary search
In general, binary search is much faster than sequential search, but it is still unsuitable for many applications. For example, it cannot handle the Leipzig Corpora database, where lookups and inserts are interleaved and the symbol table is very large, because every insert may still shift a number of entries proportional to the table size.
The following table lists the performance characteristics of sequential search and binary search as the order of growth of the running time (for binary search the counts are array accesses; for the others they are comparisons):
Algorithm | Worst-case search (after N inserts) | Worst-case insert (after N inserts) | Average-case search hit (after N inserts) | Average-case insert (after N inserts) | Efficiently supports ordered operations? |
---|---|---|---|---|---|
Sequential search | N | N | N/2 | N | No |
Binary search | lg N | 2N | lg N | N | Yes |
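To get a feel for the search columns, the sketch below counts comparisons for an unsuccessful search in both approaches. `SearchCompare` and its two methods are hypothetical names, the arrays hold plain `int` keys, and sequential search is modeled as a left-to-right scan.

```java
// Hypothetical comparison counter for a search miss; int keys for brevity.
final class SearchCompare {
    // Sequential search: scan until the key is found or the data runs out.
    static int sequentialCompares(int[] keys, int key) {
        int compares = 0;
        for (int k : keys) {
            compares++;
            if (k == key) break;
        }
        return compares;
    }

    // Binary search on a sorted array: one three-way comparison per probe.
    static int binaryCompares(int[] sorted, int key) {
        int compares = 0, lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            compares++;
            if (key < sorted[mid]) hi = mid - 1;
            else if (key > sorted[mid]) lo = mid + 1;
            else break;
        }
        return compares;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] keys = new int[n];
        for (int i = 0; i < n; i++) keys[i] = i;   // already sorted

        int missing = n;                           // larger than every key
        System.out.println("sequential: " + sequentialCompares(keys, missing)); // 1000000
        System.out.println("binary:     " + binaryCompares(keys, missing));     // 20
    }
}
```

For a million keys the miss costs a million comparisons sequentially but only 20 with binary search, which matches the lg N + 1 bound.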
7. Closing notes
If you find any mistakes or have suggestions, corrections and feedback are welcome.