查找算法之——符号表（引入篇）

符号表的主要目的是用来存储键值对，也就是将一个键和一个值关联起来，它的主要操作为插入和查找。

这篇只是为下一篇文章作为抛砖引玉，为不熟悉符号表的朋友做了一个大体的介绍，在文章的结尾列出了符号表的基本操作，有一定了解的朋友可以跳的下一篇文章（二叉查找树）。

首先我们必须讨论几个基本问题，这在之后的思想中将会一直用到：

1、重复的键

符号表不允许出现重复的键，当向表中插入的键值对的键已经出现在标中，当前加入的键值对会覆盖原有的键值对，也就是进行了更新。

2、空键或空值

键不能为空，这在java机制中会产生异常。我们还规定，值也不能为空。（get（）方法能通过查找键返回其关联的值，当键不存在时get（）方法会返回null）

这个规定有两个作用：

第一、我们能判断一个键值对是否存在符号表中。

第二、我们可以将null作为put（）的第二个参数来实现删除操作，

3、删除操作

删除操作的实现有两种思路：即时删除和延时删除，延时删除就是上面提到的，先将其值置位为空，然后在某个时刻删除所有值为空的键。即时删除也就是立即从表中删除指定的键。（本文中将使用即时删除）

4、键的等价性

键可以是任意类型（我们将其设置为泛型），可以是integer，double和string等等，java已经为他们实现了equals（）方法来判断相等，如果是自定义的键，则需要重写equals（）方法。

下面我们进入正题：

一、无序链表中的顺序查找

顾名思义，无序链表就是通常意义上的链表，只是链表的节点被定义为了键值对的形式，无序链表每次插入操作会在头部插入一个新节点，而查找操作是从链表的头部开始一个个地遍历符号表并使用equals（）方法来进行匹配。这种方法的效率是非常低的（它的查找操作是线性级别的），查找第一个键需要1次比较，第二个键需要2次比较...因此平均比较次数为（1+2+...+n）/n =(n+1)/2~n/2。它无法适用于大型符号表。

 1 public class SequentialSearchST<Key, Value> {// 基于无序链表
 2     private int n;// number of key-value pairs
 3     private Node first;
 4 
 5     private class Node {// 链表节点的定义
 6         Key key;
 7         Value val;
 8         Node next;
 9 
10         public Node(Key key, Value val, Node next) {
11             this.key = key;
12             this.val = val;
13             this.next = next;
14         }
15     }
16 
17     public Value get(Key key) {// 查找给定的键，返回关联的值
18         for (Node x = first; x != null; x = x.next) {
19             if (key.equals(x.key)) {
20                 return x.val;
21             }
22         }
23         return null;
24     }
25 
26     public void put(Key key, Value val) {// 查找给定的键，找到就更新其值，没找到将其插入最前
27         // **********************
28         if (key == null) {
29             throw new IllegalArgumentException("first argument to put() is null");
30         }
31         if (val == null) {
32             delete(key);
33             return;
34         }
35         // *********************防御性代码，这保证了任何键的值都不为空
36         for (Node x = first; x != null; x = x.next) {
37             if (key.equals(x.key)) {
38                 x.val = val;
39                 return;
40             }
41         }
42         first = new Node(key, val, first);// 插入最前
43         n++;
44     }
45 }

二、有序数组中的二分查找

如果能用上二分查找的思想，我们就能把查找的效率提升到对数级别（当然前提是数组有序）

这时插入和查找算法都发生了改变，为了让数组前提有序，插入时我们会用rank（）方法来确定键的位置，再将此位置后的键后移一位，最后插入键。rank（）返回的是键在符号表中排名。在这里如果键存在rank（）将返回键在有序数组中的下标。

在这里我们用两个数组分别存储键值对的键和值，同一键值对在数组中的下标是一样的。

  1 public class BinarySearchST<Key extends Comparable<Key>, Value> {// 有序查找表（基于有序数组）
  2     private Key[] keys;
  3     private Value[] vals;
  4     private int n = 0;// 用于记录符号表中键值对的个数
  5 
  6     public BinarySearchST(int capacity) {// 动态调整大小
  7         keys = (Key[]) new Comparable[capacity];
  8         vals = (Value[]) new Object[capacity];
  9     }
 10 
 11     public boolean contains(Key key) {
 12         if (key == null) {
 13             throw new IllegalArgumentException("argument to contains is null");
 14         }
 15         return get(key) != null;
 16     }
 17 
 18     public int size() {
 19         return n;
 20     }
 21 
 22     public int size(Key lo, Key hi) {
 23         if (lo == null) {
 24             throw new IllegalArgumentException("first argument to size() is null");
 25         }
 26         if (hi == null) {
 27             throw new IllegalArgumentException("second argument to size() is null");
 28         }
 29         if (lo.compareTo(hi) > 0) {
 30             return 0;
 31         }
 32         if (contains(hi)) {
 33             return rank(hi) - rank(lo) + 1;
 34         } else {
 35             return rank(hi) - rank(lo);
 36         }
 37     }
 38 
 39     public Key min() {
 40         if (isEmpty()) {
 41             throw new NoSuchElementException("called min with empty symbol table");
 42         }
 43         return keys[0];
 44     }
 45 
 46     public Key max() {
 47         if (isEmpty()) {
 48             throw new NoSuchElementException("called max with empty symbol table");
 49         }
 50         return keys[n - 1];
 51     }
 52 
 53     public Value get(Key key) {
 54         if (key == null) {
 55             throw new IllegalArgumentException("argument to get is null");
 56         }
 57         if (isEmpty()) {
 58             return null;
 59         }
 60         int i = rank(key);
 61         if (i < n && keys[i].compareTo(key) == 0) {
 62             return vals[i];
 63         }
 64         return null;
 65     }
 66 
 67     private int rank(Key key) {// 基于有序数组的二分查找（迭代）
 68         if (key == null) {
 69             throw new IllegalArgumentException("argument to rank is null");
 70         }
 71         int lo = 0, hi = n - 1;
 72         while (lo <= hi) {
 73             int mid = lo + (hi - lo) / 2;
 74             int cmp = key.compareTo(keys[mid]);
 75             if (cmp > 0) {
 76                 lo = mid + 1;
 77             } else if (cmp < 0) {
 78                 hi = mid - 1;
 79             } else {
 80                 return mid;
 81             }
 82         }
 83         return lo;// 找不到的情况
 84     }
 85 
 86     public Iterable<Key> keys() {
 87         return keys(min(), max());
 88     }
 89 
 90     private Iterable<Key> keys(Key lo, Key hi) {
 91         if (lo == null) {
 92             throw new IllegalArgumentException("first argument to keys() is null");
 93         }
 94         if (hi == null) {
 95             throw new IllegalArgumentException("second argument to keys() is null");
 96         }
 97         Queue<Key> queue = new Queue<Key>();
 98         if (lo.compareTo(hi) > 0) {
 99             return queue;
100         }
101         for (int i = rank(lo); i < rank(hi); i++) {
102             queue.enqueue(keys[i]);
103         }
104         if (contains(hi)) {
105             queue.enqueue(keys[rank(hi)]);
106             //queue.enqueue(hi);
107         }
108         return queue;
109     }
110 
111     private boolean isEmpty() {
112         // TODO Auto-generated method stub
113         return n == 0;
114     }
115 
116     public void put(Key key, Value val) {
117         if (key == null) {
118             throw new IllegalArgumentException("first argument to put is null");
119         }
120         if (val == null) {
121             delete(key);
122             return;
123         }
124         int i = rank(key);
125         if (i < n && key.compareTo(keys[i]) == 0) {//已有元素进行更新
126             vals[i] = val;
127             return;
128         }
129         for (int j = n; j > i; j--)  {//键值对后移
130             keys[j] = keys[j-1];
131             vals[j] = vals[j-1];
132         }
133         keys[i] = key;
134         vals[i] = val;
135         n++;
136     }
137 
138     public void delete(Key key) {
139         if (key == null) {// 避免空指针错误
140             throw new IllegalArgumentException("argument to delete is null");
141         }
142         if (isEmpty()) {
143             return;
144         }
145         int i = rank(key);
146         if (i == n || key.compareTo(keys[i]) != 0) {
147             return;
148         }
149         for (int j = i; j < n - 1; j++) {//元素前移
150             keys[j] = keys[j + 1];
151             vals[j] = vals[j + 1];
152         }
153         n--;
154         keys[n] = null;
155         vals[n] = null;
156     }
157 }

我们默认使用的二分查找是迭代进行，下面给出递归的形式：

 1     public int rank(Key key,int lo,int hi) {
 2         if(hi<lo) {
 3             return lo;
 4         }
 5         int mid=lo+(hi-lo)/2;
 6         int cmp=key.compareTo(keys[mid]);
 7         if(cmp<0) {
 8             return rank(key,lo,mid-1);
 9         }else  if(cmp>0) {
10             return rank(key,mid+1,hi);
11         }else {
12             return mid;
13         }
14     }

如果理解了迭代的形式，就能很容易改写出递归了。

用对象的数组代替两个平行数组（求指点）：

定义一个以键和值为属性的对象，用对象数组来代替两个平行数组，由于本人学术不精，未能完成，主要问题是不知如何创建不同类型属性的泛型数组（key继承了comparable而value继承了object）。在此提出问题，希望高人指点！

 1 public class BSST<Key extends Comparable<Key>, Value> {
 2     class Item{//内部类
 3         Key key;
 4         Value val;
 5     }
 6     private Item[] item;
 7     private int n = 0;
 8     public BSST(int capacity) {// 动态调整大小
 9         item = (Item[]) new Object[capacity];//出现类型转换错误
10　　　　　//item = (Item[]) new Comparable[capacity];//出现类型转换错误
11
12     }
13 }

四、符号表的基本操作

符号表的基本操作远远不止本文提到的rank（）、put（）、get（）、delete()。下面将他们列出来，了解个大概，在下一篇文章中将会一一实现他们。

boolean contains(Key key):判断key是否存在于符号表中

int size(Key lo,Key hi)：返回lo到hi之间的键值对数量

Key min()：返回最小键

Key max()：返回最大键

Key floor(Key key)：返回小于等于key的最大键

Key ceiling(Key key)：返回大于等于key的最小键

Key select(int k)：返回排名为k的键

Iterable<Key> keys(Key lo,Key hi)：返回一个队列，包含lo-hi之间的所有键（已排序）

rank(select(k))==k ture

select(rank(key))==key ture

查找算法之——符号表（引入篇）

猜你喜欢