Pure Java implementation of data structures (5/11) (Set & Map)

Set and Map are abstract, higher-level data structures; whether the underlying implementation uses a tree or a hash table depends on what you need.

Think carefully about what separates TreeMap from HashMap and TreeSet from HashSet: the interface only defines the operations the user sees (which are the same), regardless of the particular implementation. Even a plain BST works as the underlying structure (just less efficiently).

(To put it bluntly: if ordering is not required, prefer a hash-based implementation.)

Set

A binary search tree does not store duplicate elements, so a BST is a good underlying structure for implementing a set.

Common applications

In fact, there is one major application: deduplication.

For example, loop over the elements of an ArrayList, put each one into a set, and see how many distinct elements there are.
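
The deduplication idea can be sketched as follows. This is a minimal illustration that assumes the standard library's java.util.Set and java.util.HashSet rather than the Set interface defined later in this article; the class and method names (DedupDemo, countDistinct) are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    // Counts how many distinct elements the list contains.
    static <E> int countDistinct(List<E> items) {
        Set<E> seen = new HashSet<>();
        for (E item : items) {
            seen.add(item); // adding an element that is already present changes nothing
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(countDistinct(words)); // 3
    }
}
```

Any set implementation would do here; only the cost of each add differs.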

BST-based implementation

For the concrete implementation, simply wrap the BST class:

// Define the interface first
public interface Set<E> {
    void add(E e);
    void remove(E e);

    boolean contains(E e);

    int getSize();
    boolean isEmpty();
}

// Then wrap the BST class
public class BSTSet<E extends Comparable<E>> implements Set<E> {

    private BST<E> bst;
  
    // Constructor
    public BSTSet() {
        bst = new BST<>();
    }

    @Override
    public void add(E e) {
        bst.add(e);
    }

    @Override
    public void remove(E e) {
        bst.remove(e);
    }

    @Override
    public boolean contains(E e) {
        return bst.contains(e);
    }

    @Override
    public int getSize() {
        return bst.getSize();
    }

    @Override
    public boolean isEmpty() {
        return bst.isEmpty();
    }
}

As you can see, it simply wraps the BST.

Linked-list-based implementation

Like the BST, a linked list is a dynamic data structure. Does implementing a set on top of it have any advantage?

A quick comparison:

A linked list does not require its elements to be ordered (comparable) when storing them
The linked list's internal Node class is simpler to define

Because a linked list does not itself support all of the set operations, the implementation needs some extra handling, such as confirming that an element is not already present before adding it to the container.

import linkedlist.LinkedList1;

public class LinkedListSet<E> implements Set<E> {

    private LinkedList1<E> list;

    public LinkedListSet() {
        list = new LinkedList1<>();
    }

    @Override
    public void add(E e) {
        // Add only if the element is not already present (this check is O(n))
        if (!list.contains(e)) {
            list.addFirst(e); // O(1), because the list keeps a head pointer
        }
    }

    @Override
    public void remove(E e) {
        list.removeElem(e);
    }

    @Override
    public boolean contains(E e) {
        return list.contains(e);
    }

    @Override
    public int getSize() {
        return list.getSize();
    }

    @Override
    public boolean isEmpty() {
        return list.isEmpty();
    }

    @Override
    public String toString() {
        StringBuilder res = new StringBuilder();
        res.append("{ ");
        res.append(list.toString());
        res.append("} ");
        return res.toString();
    }

    public static void main(String[] args) {
        LinkedListSet<Integer> set = new LinkedListSet<>();

        // Add some elements: 2, 3, 2, 5, 5
        set.add(2);
        set.add(3);
        set.add(2);
        set.add(5);
        set.add(5);
        System.out.println(set); //{ 5->3->2->null}
    }
}

Of course, there are also hash-based implementations, and their interfaces are similar.

Complexity Analysis

As a first pass, the main gap lies in checking whether an element already exists.

The linked-list version needs an O(n) search to confirm the element is absent before adding it, whereas the BST version does this in O(log N).

So for add, remove and search alike, the linked-list implementation is slower than the tree implementation.

In the worst case, however, the tree can degenerate into a linear chain, for example when the BST-based set is built from an already sorted sequence.

More precisely, the cost is O(height), and the height can be anywhere between log N and N. (Do not build a BST from a nearly sorted sequence.)
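
The degeneration is easy to observe. The sketch below is a stand-alone, simplified BST over int keys (not the BST class wrapped earlier in this article; BstHeightDemo and heightAfterInserting are names invented for the example). It builds a tree from sorted keys and again from the same keys shuffled, then compares the heights.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class BstHeightDemo {
    private static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain, unbalanced BST insertion; duplicate keys are ignored.
    private static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Height counted in nodes: an empty tree has height 0.
    private static int height(Node root) {
        return root == null ? 0 : 1 + Math.max(height(root.left), height(root.right));
    }

    // Builds a BST by inserting the keys in the given order and returns its height.
    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int key : keys) root = insert(root, key);
        return height(root);
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add(i);

        // Sorted input: every node only ever gets a right child.
        System.out.println(heightAfterInserting(keys)); // 1000

        // Same keys in shuffled order: the height is far smaller.
        Collections.shuffle(keys, new Random(42));
        System.out.println(heightAfterInserting(keys));
    }
}
```

With sorted input the tree is effectively a linked list, so every operation costs O(N) again.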

A better implementation uses a self-balancing tree such as a red-black tree or an AVL tree; java.util.TreeSet, for example, is implemented with a red-black tree. (These do not degenerate, because they keep themselves balanced.)

But every such capability needs a supporting mechanism, which comes with a corresponding maintenance cost.

More on ordering

A linked-list-based set is effectively unordered (the underlying storage maintains no ordering); the stored sequence simply follows insertion order.

A set built on a search tree such as a BST, AVL tree or red-black tree is an ordered set: it automatically keeps its elements in sorted order, regardless of the order in which they were inserted.

Does an unordered set have no advantage, then? It does: a hash table is an excellent way to implement an unordered set. (It supports random access, so it is very efficient.)

Search-tree-based: an ordered set, whose elements are kept in order
Hash-table-based: an unordered set, whose elements have no order

It is generally held that search-tree-based sets offer more capabilities, while hash tables are more efficient in terms of time.
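
To make "more capabilities" concrete, here is a small sketch using the standard library's java.util.TreeSet and java.util.HashSet (not the classes built in this article). The helper smallestAtLeast is a name invented for the example; first, headSet and ceiling are ordered-set operations that a hash-based set does not offer.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class OrderedSetDemo {
    // Smallest element >= bound, or null if there is none.
    static Integer smallestAtLeast(Collection<Integer> values, int bound) {
        return new TreeSet<Integer>(values).ceiling(bound);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(42, 7, 19, 3, 25);

        // Ordered set: iteration and range queries follow the sorted order.
        TreeSet<Integer> sorted = new TreeSet<>(input);
        System.out.println(sorted);                     // [3, 7, 19, 25, 42]
        System.out.println(sorted.first());             // 3
        System.out.println(sorted.headSet(19));         // [3, 7]
        System.out.println(smallestAtLeast(input, 20)); // 25

        // Unordered set: fast membership tests, but no guaranteed iteration order.
        Set<Integer> hashed = new HashSet<>(input);
        System.out.println(hashed.contains(19));        // true
    }
}
```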

Map

There are several kinds of mapping, but the focus here is on the one-to-one mapping. It is sometimes called a Map and sometimes a dictionary; put plainly, it is a structure that lets you quickly look up a value by its key.

(The name varies from one programming language to another.)

The underlying implementation: a map is a high-level data structure, so it can be implemented in several ways. A linked list can be used, or a BST; the node structures look like this:

// BST implementation
class Node {
    K key;
    V value;
    Node left;
    Node right;
}

// Linked-list implementation
class Node {
    K key;
    V value;
    Node next;
}

This closely mirrors the set implementations above: a set can be seen as a special map, and a map can also be seen as a special set. (The more common view is that a set is a special map, namely Map<K, null>.)

Interface definition

A Map typically supports the following basic operations:

public interface Map<K, V> {

    void add(K key, V value);
    V remove(K key);

    int getSize();
    boolean isEmpty();
    boolean contains(K key);

    V get(K key);
    void set(K key, V newValue);
}

Note in particular that this interface takes two generic type parameters.
(A typical data structure has 5 operations; here there are 7 in total.)

Linked-list-based implementation

When wrapping a linked list here, the Node has changed, so the existing LinkedList cannot be reused directly (Node is redefined).

A concrete implementation looks roughly like this:

package map;

public class LinkedListMap<K, V> implements Map<K, V>{
    // First re-implement the node as an inner class
    private class Node {
        public K key;
        public V value;

        public Node next;

        public Node(K key, V value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public Node(K key, V value) {
            this(key, value, null);
        }

        public Node() {
            this(null, null, null);
        }

        @Override
        public String toString() {
            return key.toString() + ":" + value.toString();
        }
    }

    // Fields (same as the singly linked list)
    private int size;
    private Node dummyHead;

    public LinkedListMap() {
        dummyHead = new Node(); // users are unaware that the dummy head node exists
        size = 0;
    }

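    // Private helper: returns the Node that holds key, or null if there is none.
    // Used by contains, and by get and set
    // to reach the value stored for a key
    private Node getNode(K key) {
        // Traverse the list and return the Node whose key matches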
        Node cur = dummyHead.next;
        while(cur != null) {
            if(cur.key.equals(key)) {
                return cur;
            } else {
                cur = cur.next;
            }
        }
        return null;
    }


    @Override
    public boolean contains(K key) {
        return getNode(key) != null;
    }

    @Override
    public V get(K key) {
        Node node = getNode(key);
        return node == null ? null : node.value;
    }

    @Override
    public int getSize() {
        return size;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }


    @Override
    public void add(K key, V value) {
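        // Add a new node (keys must be unique)
        if(!contains(key)) {
            // Insert directly at the head of the list
            dummyHead.next = new Node(key, value, dummyHead.next);

            // Do not forget: size++
            size++;

        } else {
            // The key already exists: throw (updating the value instead is another option)
            throw new IllegalArgumentException("The key to add already exists");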
        }
    }


    @Override
    public void set(K key, V newValue) {
        // Find the key, then update its value
        Node node = getNode(key);
        if(node != null) {
            node.value = newValue;
        } else {
            // The key to update does not exist: throw
            throw new IllegalArgumentException("The key to update does not exist");
        }

    }

    @Override
    public V remove(K key) {
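        // Same logic as removing an element from a singly linked list:
        // starting at dummyHead, find the node just before the one to delete
        Node prev = dummyHead; // prev ends up as the predecessor of the matching node
        while(prev.next != null) {
            if(prev.next.key.equals(key)) {
                break;
            }
            prev = prev.next;
        }

        // Did the loop stop at the break (found) or run off the end?
        if(prev.next != null) {
            // Found: the loop exited via break
            Node delNode = prev.next;
            prev.next = delNode.next;
            delNode.next = null;
            size--;
            return delNode.value;
        }

        // Reached the end: the key to delete was not found
        return null;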
    }

    @Override
    public String toString() {
        StringBuilder res = new StringBuilder();
        res.append("{");

        for(Node curr = dummyHead.next; curr != null; curr = curr.next) {
            res.append(curr.key + ":\"" + curr.value + "\"");
            if(curr.next != null) {
                res.append(", ");
            }
        }

        res.append("}");
        return res.toString();
    }
}

A simple test:

    public static void main(String[] args) {
        Map<Integer, String> map = new LinkedListMap<>();

        // Insert some elements
        map.add(1, "one");
        map.add(2, "two");
        map.add(3, "three");
        System.out.println(map); //{3:"three", 2:"two", 1:"one"}, the reverse of insertion order, because add inserts at the head

        System.out.println(map.contains(3)); //true
        System.out.println(map.getSize()); //3
        System.out.println(map.get(1)); //one
    }

BST-based implementation

A BST-based map cannot reuse the BST class directly either; the Node structure has to be redefined.

And the key must be Comparable.

A typical implementation follows (note the many internal helper methods):

public class BSTMap<K extends Comparable<K>, V> implements Map<K, V> {
    // Node definition
    private class Node {
        public K key;
        public V value;

        public Node left, right;

        // Constructor
        public Node(K key, V value) {
            this.key = key;
            this.value = value;

            left = right = null;
        }
    }

    // Fields
    private Node root;
    private int size;


    // Constructor
    public BSTMap() {
        root = null;
        size = 0;
    }

    @Override
    public int getSize() {
        return size;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }

    // The remaining methods follow the BST implementation

    @Override
    public void add(K key, V value) {
        root = add(root, key, value);
    }

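    // Returns the (root of the) subtree after the insertion
    private Node add(Node root, K key, V value) {
        if(root == null) {
            // Reached the insertion point: return the new node (the caller attaches it)
            size++;
            return new Node(key, value);
        }

        // Descend toward the insertion point
        if(key.compareTo(root.key) < 0) {
            // Recurse into the left subtree
            root.left = add(root.left, key, value);
        } else if(key.compareTo(root.key) > 0) {
            // Recurse into the right subtree
            root.right = add(root.right, key, value);
        } else {
            // The key already exists: throw here (updating the value is another option)
            throw new IllegalArgumentException("The key to add already exists");
        }
        return root; // hand the updated subtree back to the caller (its left or right now holds the new node)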
    }


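    // Lookups generally rely on a private helper that finds the node.
    // Returns the node holding key, or null if absent
    private Node getNode(Node root, K key) {
        // Search the subtree rooted at root,
        // again written recursively
        if(root == null) {
            // Not found
            return null;
        }
        if(key.compareTo(root.key) == 0) {
            // Found
            return root;
        } else if (key.compareTo(root.key) < 0) {
            // Search the left subtree
            return getNode(root.left, key); // the node, if any, from the subtree rooted at root.left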
        } else {
            return getNode(root.right, key);
        }
    }



    @Override
    public boolean contains(K key) {
        return getNode(root, key) != null;
    }

    @Override
    public V get(K key) {
        Node node = getNode(root, key);
        return node != null ? node.value : null;
    }


    @Override
    public void set(K key, V newValue) {
        Node node = getNode(root, key);
        if(node != null) {
            // The key exists: update its value
            node.value = newValue;
        } else {
            throw new IllegalArgumentException("要更新的 Key 不存在");
        }

    }

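    // Removal is more involved (it needs the merge technique: find the predecessor or the successor)
    // Four helpers first: getMax finds the predecessor, getMin finds the successor,
    // removeMax / removeMin delete the max / min node and return the resulting subtree
    private Node getMin(Node root) {
        if(root.left == null) {
            return root;
        }
        // Otherwise keep searching the left subtree
        return getMin(root.left);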
    }

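    // Removes the minimum node and returns the (root of the) resulting subtree
    private Node removeMin(Node root) {
        // The minimum is always down the left spine; the caller's left link receives the result
        if(root.left == null) {
            // No left child: root itself is the minimum,
            // so graft its right subtree onto the parent (by returning it to the caller)
            Node rightNode = root.right; // may be null
            root.right = null; // detach the removed node
            size--;

            return rightNode;
        }

        // Left subtree is not empty: keep going
        root.left = removeMin(root.left);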
        return root;
    }


    private Node getMax(Node root){
        if(root.right == null) {
            return root;
        }
        // Otherwise keep searching the right subtree
        return getMax(root.right);
    }

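    // Removes the maximum node and returns the (root of the) resulting subtree
    private Node removeMax(Node root) {
        if(root.right == null) {
            // root is the maximum node:
            // graft its left subtree onto the parent (by returning it to the caller)
            Node leftNode = root.left; // may be null; the caller's right link receives it
            root.left = null;
            size--;
            return leftNode;
        }

        // Otherwise keep going down the right subtree and re-attach the result
        root.right = removeMax(root.right);

        return root;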
    }

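    // With the helpers done, handle removal of an arbitrary key
    @Override
    public V remove(K key) {
        Node node = getNode(root, key);
        if(node != null) {
            // Remove only if the key exists
            root = remove(key, root);
            return node.value;
        }

        return null; // key absent, nothing to remove; throwing would be stricter, but null is returned here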
    }

    // Returns the (root of the) subtree after the removal
    private Node remove(K key, Node root) {
        // An empty subtree means we have gone past a leaf
        if(root == null) {
            return null;
        }
        // Otherwise recurse into the left or right subtree and re-attach the result
        if(key.compareTo(root.key) < 0) {
            // Remove from the left subtree and assign the result to root.left
            root.left = remove(key, root.left);
        } else if(key.compareTo(root.key) > 0) {
            // Remove from the right subtree and assign the result to root.right
            root.right = remove(key, root.right);
        } else {
            // Keys compare equal: this is the node to delete.
            // If the left or right subtree is empty, graft the other one;
            // if both are non-empty, the two subtrees have to be merged

            // Simple cases: one side is empty
            if(root.left == null) {
                // Graft the right subtree (return it; the recursion above re-attaches it)
                Node rightNode = root.right;
                root.right = null;
                size--;
                return rightNode;
            }

            if(root.right == null) {
                // Graft the left subtree
                Node leftNode = root.left;
                root.left = null;
                size--;
                return leftNode;
            }

            // Find the successor: the smallest node in the right subtree
            Node successorNode = getMin(root.right); // replaces the current node
            successorNode.right = removeMin(root.right); // right subtree with the successor removed (size-- happens in removeMin)
            successorNode.left = root.left;

            // Detach the deleted node
            root.left = root.right = null;
            return successorNode;

        }
        return root;
    }

    private void inOrder(Node root) {
        // In-order traversal
        if(root == null) {
            // The subtree rooted at root is empty: nothing to print
            return;
        }
        inOrder(root.left);
        System.out.print(root.key + ":" + root.value + " ");
        inOrder(root.right);
    }

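    // Note: this prints the in-order traversal as a side effect
    // and then returns the default Object.toString() value
    @Override
    public String toString() {
        inOrder(root);
        System.out.println();
        return super.toString();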
}

Simple test:


    public static void main(String[] args) {
        BSTMap<Integer, String> map = new BSTMap<>();
        map.add(2, "two");
        map.add(1, "one");
        map.add(3, "three");
        map.add(5, "five");

        System.out.println(map.getSize());
        System.out.println(map.contains(3));
        System.out.println(map);
    }

Printed output (the last line is the default Object.toString() value, whose hash code varies between runs):

4
true
1:one 2:two 3:three 5:five 
map.BSTMap@1a407d53

Complexity Analysis

Whether adding, deleting, updating or looking up, everything comes down to the lookup, for example checking first whether the element exists. Here the linked list is slower: O(tree height) versus O(n), although the tree height may itself degenerate to O(n). (In the average case it is O(log N).)

Likewise, to avoid the worst case, use a balanced tree such as an AVL tree to keep the tree balanced (and its height low).

Ordering

Whether a map is ordered or unordered again depends on its underlying implementation.

If the underlying implementation is a BST, it can keep its entries stored in key order (independent of your insertion order).

Comparison summary

It is generally held that the underlying implementations of Map and Set do not differ much. (Usually a tree is used, specifically a red-black tree.)

In other words, it is easy to implement a Set by wrapping a Map-based implementation. (Set every value to null by default, then drop the get and set methods.)
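
Wrapping a map to obtain a set can be sketched as follows. To stay self-contained, this example wraps the standard library's java.util.TreeMap instead of the BSTMap above, and stores a shared placeholder object as the value rather than null; MapBackedSet and PRESENT are names chosen for the illustration. The method names mirror this article's Set interface.

```java
import java.util.TreeMap;

class MapBackedSet<E extends Comparable<E>> {
    // Placeholder value: only the keys matter for a set.
    private static final Object PRESENT = new Object();

    private final TreeMap<E, Object> map = new TreeMap<>();

    public void add(E e) {
        map.put(e, PRESENT); // putting an existing key again leaves the set unchanged
    }

    public void remove(E e) {
        map.remove(e);
    }

    public boolean contains(E e) {
        return map.containsKey(e);
    }

    public int getSize() {
        return map.size();
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

    @Override
    public String toString() {
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        MapBackedSet<Integer> set = new MapBackedSet<>();
        set.add(2);
        set.add(3);
        set.add(2);
        set.add(5);
        System.out.println(set);           // [2, 3, 5]
        System.out.println(set.getSize()); // 3
    }
}
```

If the wrapped map were this article's BSTMap or LinkedListMap, add would first have to check contains, because their add throws when the key already exists.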

In Java, TreeMap and TreeSet are implemented on a balanced search tree (specifically a red-black tree, not an AVL tree), while HashMap and HashSet are implemented on a hash table. (You do not need to care about this when using them, because the interface on top is the same.)

BTW: many exercises share one trick: once a lookup finds that an element already exists, delete it from the Set / Map. (I won't explain it further.)
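
One possible reading of that trick, sketched on the classic "intersection of two arrays" exercise: put the first array into a set, then scan the second; every hit is recorded and immediately removed from the set so the same value is never reported twice. The example assumes java.util.TreeSet, and the names IntersectionDemo and intersection are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class IntersectionDemo {
    // Returns the distinct values that occur in both arrays,
    // in the order they are first met in b.
    static List<Integer> intersection(int[] a, int[] b) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int x : a) {
            set.add(x);
        }

        List<Integer> result = new ArrayList<>();
        for (int x : b) {
            if (set.contains(x)) {
                result.add(x);
                set.remove(x); // already reported: remove it so it is not added again
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(intersection(new int[]{1, 2, 2, 1}, new int[]{2, 2}));          // [2]
        System.out.println(intersection(new int[]{4, 9, 5}, new int[]{9, 4, 9, 8, 4}));    // [9, 4]
    }
}
```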

Enough talk; the code is in the repository on GitHub.