Algorithm Learning (Discipleship Program) 3-2: Hash Tables, Bloom Filters, and Classic Problems (Study Notes)

Foreword

(7.3: there are 3 lessons left, keep it up!)

This article contains my study notes for the eighth lecture of the Discipleship Program algorithm course.

3-2 Chapter 5 Section 2: Hash Tables and Bloom Filters

(As usual, I am challenging myself to finish in the shortest time, aiming for no more than 3x the video length.)

(The difficulty rises sharply here. To keep the content manageable, hash algorithms themselves are not discussed this time; it is enough to understand that a hash function can map any data structure into the target memory space as evenly as possible.)

Learning Targets

  • Learn the hash idea
  • Focus on the implementation of a special kind of hash table, the Bloom filter
  • Learn application scenarios for hash tables
  • Come away from this lesson with an overall conceptual understanding of hash tables

Hash Tables

  • A way to quickly index stored information
  • Recall the array structure: fetching an element by index has very low time complexity
  • The hash operation maps arbitrary data to an array subscript (dimensionality reduction to an integer), which is the key operation
  • This realizes high-speed access to data
  • It is a structure with a strong sense of design

Hash Operations

  • A hash operation reduces the dimensionality of the data it maps
  • Dimensionality reduction loses information and causes overlap, so two different pieces of data may be mapped to the same coordinate during hashing (a hash collision)
  • Example of a collision caused by a simplified hash rule (index = n % size)
  • A hash collision means multiple elements map to the same coordinate; such collisions are unavoidable, though we try to minimize them
  • Collision-handling schemes are therefore needed

Collision Handling

There are 4 common collision-handling strategies:

  • Open addressing
  • Rehashing
  • A common overflow area
  • Chaining (the zipper method; recommended when designing a hash table yourself)

Open Addressing

When an address is already in use (a collision), the data is remapped to a free address after it. The probing rule is either linear or power-based: linear probing uses equally spaced steps, while the power-based variant is commonly called quadratic probing, since the square of the step count is typically used as the spacing. These probing rules can be applied quite flexibly, and the simple one is usually chosen.

Both share the same idea: try to map the colliding element to a free address further along.
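To make the probing idea concrete, here is a minimal sketch of linear-probing insertion (my own illustration, not from the course; the Integer[] table and the modulo hash are assumptions):

static boolean linearProbeInsert(Integer[] data, int value) {
    int size = data.length;
    int index = Math.floorMod(value, size);      // initial slot from a simple modulo hash
    for (int step = 0; step < size; step++) {
        int probe = (index + step) % size;       // linear probing: equally spaced steps
        if (data[probe] == null) {               // free slot found behind the collision
            data[probe] = value;
            return true;
        }
        if (data[probe] == value) return true;   // already present
    }
    return false;                                // table is full
}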

Rehashing

When a collision occurs, another set of hash rules is tried to map the address; if that collides again, switch to yet another set. This practice of switching to other hash functions is rehashing.
(The mappings this method can provide are limited, because only a limited number of hash rules can be prepared. So although it greatly reduces the collision rate, it must be combined with the remaining 3 strategies.)
(For this reason it is not recommended when designing a hash table yourself.)

Common Overflow Area

When a collision occurs, the new data is stored in a common overflow area. The overflow area can use any suitable data structure, such as another hash table or a red-black tree.
The difference from the other methods is that colliding data is given extra space of its own.

Chaining (Chain Address Method)

With chaining, each slot of the hash table stores the head node of a linked list, and the list is extended when a collision occurs. Because a linked list is used, this effectively raises the dimensionality back up to some extent.

Expanding the Hash Table

When hash collisions occur too frequently, the hash table needs to be expanded (or whenever you judge that expansion is warranted). The capacity is usually doubled.
Hash table operations have amortized O(1) time complexity. The reasoning: if the current size n was reached by repeated doubling, the total work done by all expansions is roughly n + n/2 + n/4 + ... < 2n, so the amortized cost per element is constant.

The concrete operation during expansion is to remap the old data into the new space, then replace the old space with the new one.

Design a Simple Hash Table

(The hash table designed below has one limitation: data can be inserted and looked up, but stored data is never deleted. Deletion is not covered in this lesson, so it is skipped here.)

The core operations of the hash table:

  • Insert: if the slot for the key is empty, insert there; otherwise apply the collision-handling rule.
  • Search: if the slot holds an element equal to the one being searched for, return it; otherwise continue searching according to the collision-handling rule.
  • Expand: store the old table's data in the new space; elements that were placed by the collision rule are likewise reinserted into the new space.

(Take the zipper method as an example)

class MyHashTable {

    private Node[] data;
    private int size;
    private int nowDataLen;

    public MyHashTable() {
        data = new Node[16];
        size = data.length;
        nowDataLen = 0;
    }

    public MyHashTable(int len) {
        data = new Node[len];
        size = data.length;
        nowDataLen = 0;
    }

    // Custom hash rule (a BKDR-style string hash with seed 131)
    private int hash_func(String str) {
        char[] c = str.toCharArray();
        int seed = 131, hash = 0;
        for (int i = 0; i < c.length; i++) {
            hash = hash * seed + c[i];
        }
        return hash & 0x7fffffff;
    }

    public void expand() {
        int n = size * 2;
        MyHashTable ht = new MyHashTable(n);
        for (int i = 0; i < size; i++) {
            Node node = data[i];
            if (node != null) {
                node = node.next;            // skip the dummy head of this bucket
                while (node != null) {
                    ht.insert(node.data);    // remap each element into the new table
                    node = node.next;
                }
            }
        }
        data = ht.data;
        size = data.length;
    }

    public void insert(String str) {
        Node wantFind = find(str);
        if (wantFind != null) return;

        int key = hash_func(str);
        int index = key % size;
        if (data[index] == null)
            data[index] = new Node();        // dummy head for this bucket

        data[index].insertNext(new Node(str, null));
        nowDataLen++;

        if (nowDataLen > size * 3)           // custom rule: expand once the load factor reaches 3
            expand();
    }

    public boolean contains(String str) {
        Node wantFind = find(str);
        if (wantFind == null)
            return false;

        return true;
    }

    private Node find(String str) {
        if (str == null) return null;
        int key = hash_func(str);
        int index = key % size;
        Node wantSet = data[index];
        if (wantSet == null) {
            return null;
        } else {
            while (wantSet != null) {
                if (str.equals(wantSet.data))
                    return wantSet;
                wantSet = wantSet.next;
            }
        }
        return null;
    }

}

class Node {

    public String data;
    public Node next;

    public Node() {
    }

    public Node(String data, Node next) {
        this.data = data;
        this.next = next;
    }

    public void removeNext() {
        if (next == null) return;
        Node p = next;
        next = p.next;
    }

    public void insertNext(Node newNode) {
        newNode.next = next;
        next = newNode;
    }
}

Summary

There are 2 main points in designing a hash table:

  • Design the hash rule (different data and different situations suit different hash functions, but all aim to maximize space utilization)
  • Design the collision handling (likewise, different collision rules suit different situations)

There are 4 collision-handling strategies:

  • Open addressing
  • Rehashing
  • A common overflow area
  • Chaining (the zipper method; recommended when designing a hash table yourself)

Open addressing and rehashing both reduce the collision probability within the original address range; the common overflow area hands colliding content over to other rules; and chaining raises the dimensionality back up, thereby resolving the collisions that dimensionality reduction caused.

In summary:

A hash table is a storage structure that builds on the nature of arrays and uses the hash idea for address mapping, achieving amortized O(1) access, and it has a strong sense of design. In use it must cooperate with collision-handling rules and expansion rules to keep hash collisions under control.

Bloom Filter

Uses hash rules to answer one question: whether a given piece of data has definitely not appeared before (a negative answer is certain, while a positive answer may be a false alarm).

Comparison with the Hash Table

A hash table is used to judge whether a stored value exists: it must say definitively whether the data exists or not, and the stored content can be retrieved. For the address collisions caused by dimensionality reduction, the hash table needs collision-handling rules and, when necessary, expansion. In short:
the storage space of a hash table grows with the number of elements.

A Bloom filter is used to judge whether a piece of data definitely does not exist, so the address collisions caused by hashing are tolerated and the stored value never needs to be retrieved; the implementation is therefore simpler and no expansion is required. It does need to be cleared from time to time (if it is never cleared it would eventually have to grow, otherwise its filtering quality degrades). In short:
the storage space of a Bloom filter is unrelated to the number of elements.

Application Scenario Example

When crawler code fetches information from URLs, it should not fetch the same URL twice, so it needs a rule to skip URLs that have already been visited.
A hash table could store every URL and answer the question, but as the web grows the URLs become more and more numerous, and continuing to use a hash table causes a large space overhead; the requirement is only to avoid duplicates, and the stored content itself is never needed.
Using a Bloom filter guarantees that every URL that passes the check has not been visited before, and the filter does not store the actual URLs, which greatly reduces space usage. Its accuracy is lower than a hash table's: when a collision occurs, the false positive causes some unvisited URLs to be skipped, but this loss is usually acceptable.

Principle

Open up a block of storage that holds only bits (a hash table, by contrast, must store a suitable data structure). When filtering, apply several different hash rules at once (say 3) to map the data down, and set the bit at every mapped position to 1 (a hash table instead stores the data itself at the mapped position). Before setting the bits: as long as any one of the mapped positions does not already hold a 1, the data is considered never to have appeared.

Pros and cons:

  • Pro: low space requirement. Filtering more complex information takes far less space than a hash-table scheme.
  • Con: as the amount of filtered data grows, the probability of false positives grows.

(Looked at from another angle, a Bloom filter could obviously use just a single hash rule; a false positive is then simply a hash collision. The filter only needs to confirm that data is appearing for the first time, so hash collisions can be tolerated, but it is still desirable to reduce their impact; that is why each additional set of hash rules lowers the false-positive probability. The storage space is fixed in size, however, so the higher its utilization, the higher the false-positive rate; if the space were completely full, every query would be a false positive.)

Code implementation (omitted)

(The Bloom filter design discussed here has no expansion logic, so it is a filter of limited use.)
(If an explanation is missing, it is because I forgot to write one.)
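Since the code was omitted above, here is a minimal sketch of the principle (my own illustration, not the course code; the bit-array size of 2^20 and the three seeds are arbitrary assumptions):

import java.util.BitSet;

// Minimal Bloom filter sketch: fixed-size bit array, 3 hash rules, no expansion or clearing.
class SimpleBloomFilter {
    private final BitSet bits = new BitSet(1 << 20);    // fixed storage space of bits
    private final int[] seeds = {31, 131, 1313};        // one seed per hash rule

    private int hash(String s, int seed) {
        int h = 0;
        for (char c : s.toCharArray()) h = h * seed + c;
        return (h & 0x7fffffff) % (1 << 20);
    }

    // Returns true if the data has definitely not appeared before, then records it.
    public boolean addIfAbsent(String s) {
        boolean absent = false;
        for (int seed : seeds) {
            if (!bits.get(hash(s, seed))) absent = true;  // any unset bit proves absence
        }
        for (int seed : seeds) bits.set(hash(s, seed));   // record the data by setting all mapped bits
        return absent;
    }
}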

Summary

Bloom filters are often used in big-data scenarios (such as URL deduplication) and in scenarios with strict information-security requirements (where the original data must not be held).

In short:
a Bloom filter is a data structure based on the hash idea that records only whether data has occurred, and it too has a strong sense of design.

Classic Examples

Basics: Encapsulating a Hash Structure

LeetCode 705. Design HashSet

Source: https://leetcode-cn.com/problems/design-hashset/

Design a hash set (HashSet) without using any built-in hash table library.

Implement the MyHashSet class:

  • void add(key) inserts the value key into the HashSet.
  • bool contains(key) returns whether the value key exists in the HashSet.
  • void remove(key) removes the value key from the HashSet; if the value is not present, do nothing.

Example:

Input:
["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"]
[[], [1], [2], [1], [3], [2], [2], [2], [2]]
Output:
[null, null, null, true, false, null, true, null, false]

Explanation:
MyHashSet myHashSet = new MyHashSet();
myHashSet.add(1);      // set = [1]
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(1); // returns True
myHashSet.contains(3); // returns False (not found)
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(2); // returns True
myHashSet.remove(2);   // set = [1]
myHashSet.contains(2); // returns False (already removed)

Constraints:

  • 0 <= key <= 10^6
  • add, remove, and contains are called at most 10^4 times in total.

Problem-Solving Ideas

This problem is an example for reviewing the hash table. The zipper method is used to handle collisions, and element deletion is handled as well.

(When looking at the published solutions afterwards, I learned three better sets of problem-solving ideas; source: https://leetcode-cn.com/problems/design-hashset/solution/yi-ti-san-jie-jian-dan-shu-zu-lian-biao-nj3dg/)

(Additional solution 1: simple array)

Use an array that covers every possible key, with the key itself as the subscript.
Quite brute-force: declare a boolean array of size 1000009, boolean[] nodes = new boolean[1000009];
The idea is similar to a Bloom filter, except the core is a single "hash" that guarantees no collisions at all (because there is no dimensionality reduction), so it records the occurrence of every number exactly.
(Based on the problem constraint 0 <= key <= 10^6.)
(code omitted)
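The original omits the code; a minimal sketch of this idea (my own illustration, array size taken from the note above) might simply be:

// Simple-array solution: the key itself is the index, so no collisions are possible.
class MyHashSet {
    private final boolean[] nodes = new boolean[1000009];

    public void add(int key) { nodes[key] = true; }

    public void remove(int key) { nodes[key] = false; }

    public boolean contains(int key) { return nodes[key]; }
}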

(Additional solution 2: linked-list hash table)
Use an array of linked lists to store the information, with the bucket index derived from the key. Also fairly brute-force: the array size is set to 10009
(based on the constraint that add, remove, and contains are called at most 10^4 times).

This solution is also a simple hash table, but its craftsmanship is much better than mine. Its key performance optimization lies in the getIndex function and in the choice of initial array size, which together maximize how evenly keys are spread across the space.

(Additional solution 3: bucket array)

Use an int array to cover every possible key. Exploiting the 32-bit width of an int, the keys are packed in groups of 32, one bit per key.
Since the data range is 0 <= key <= 10^6, at most about 10^6 / 32 ≈ 31250 ints are needed, so 40000 is more than enough.

(code omitted)
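Again the code is omitted; a minimal sketch of the bit-packing idea (my own illustration) could be:

// Bucket/bit-array solution: each int stores 32 keys, one bit per key.
class MyHashSet {
    private final int[] buckets = new int[40000];

    public void add(int key) { buckets[key / 32] |= (1 << (key % 32)); }

    public void remove(int key) { buckets[key / 32] &= ~(1 << (key % 32)); }

    public boolean contains(int key) { return (buckets[key / 32] & (1 << (key % 32))) != 0; }
}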

Sample Code

(Basic solution: design a simple hash table and use linked lists to handle collisions.)
(After finishing, I noticed my code differs slightly from the example in the concept section above; it is a bit more complicated, but that does not affect solving the problem.)

class MyHashSet {

    private Node[] data;
    private int size;

    /** Initialize your data structure here. */
    public MyHashSet() {
        data = new Node[100];
        size = data.length;
    }

    public void add(int key) {
        int index = key % size;
        Node wantSet = data[index];
        if (wantSet == null) {
            data[index] = new Node(key, null);
        } else {
            while (wantSet != null) {
                if (wantSet.key == key)
                    return;                               // already present
                if (wantSet.next == null) {
                    wantSet.next = new Node(key, null);   // append at the tail of the chain
                    return;
                }
                wantSet = wantSet.next;
            }
        }
    }

    public void remove(int key) {
        int index = key % size;
        Node wantRemove = data[index];
        if (wantRemove == null) {
            // nothing to remove
        } else {
            Node dummyHead = new Node(0, null);
            dummyHead.next = wantRemove;
            if (wantRemove.key == key) {                  // the chain's head holds the key
                dummyHead.removeNext();
                data[index] = dummyHead.next;
                return;
            }
            while (dummyHead != null && dummyHead.next != null) {
                if (dummyHead.next.key == key) {
                    dummyHead.removeNext();
                }
                dummyHead = dummyHead.next;
            }
        }
    }

    /** Returns true if this set contains the specified element */
    public boolean contains(int key) {
        Node wantFind = findSet(key);
        if (wantFind == null)
            return false;

        return true;
//        int index = key % size;
//        Node wantSet = data.get(index);
//        if (wantSet == null) {
//            return false;
//        } else {
//            while (wantSet != null) {
//                if (wantSet.key == key)
//                    return true;
//                wantSet = wantSet.next;
//            }
//            return false;
//        }
    }

    // The code above can be simplified slightly by using the function below instead (but I only simplified this one method)

    private Node findSet(int key) {
        int index = key % size;
        Node wantSet = data[index];
        if (wantSet == null) {
            return null;
        } else {
            while (wantSet != null) {
                if (wantSet.key == key)
                    return wantSet;
                wantSet = wantSet.next;
            }
        }
        return null;
    }

}

class Node {

    public int key;
    public Node next;

    public Node(int key, Node next) {
        this.key = key;
        this.next = next;
    }

    public void removeNext() {
        if (next == null) return;
        Node p = next;
        next = p.next;
    }
}

(linked list method)

class MyHashSet {

    // Since a linked list is used, this value can be quite small
    Node[] nodes = new Node[10009];

    public void add(int key) {
        // Get the hash bucket position for the key
        int idx = getIndex(key);
        // Check whether the key already exists in the chain
        Node loc = nodes[idx], tmp = loc;
        if (loc != null) {
            Node prev = null;
            while (tmp != null) {
                if (tmp.key == key) {
                    return;
                }
                prev = tmp;
                tmp = tmp.next;
            }
            tmp = prev;
        }
        Node node = new Node(key);

        // Head insertion:
        // node.next = loc;
        // nodes[idx] = node;

        // Tail insertion:
        if (tmp != null) {
            tmp.next = node;
        } else {
            nodes[idx] = node;
        }
    }

    public void remove(int key) {
        int idx = getIndex(key);
        Node loc = nodes[idx];
        if (loc != null) {
            Node prev = null;
            while (loc != null) {
                if (loc.key == key) {
                    if (prev != null) {
                        prev.next = loc.next;
                    } else {
                        nodes[idx] = loc.next;
                    }
                    return;
                }
                prev = loc;
                loc = loc.next;
            }
        }
    }

    public boolean contains(int key) {
        int idx = getIndex(key);
        Node loc = nodes[idx];
        if (loc != null) {
            while (loc != null) {
                if (loc.key == key) {
                    return true;
                }
                loc = loc.next;
            }
        }
        return false;
    }

    static class Node {
        private int key;
        private Node next;
        private Node(int key) {
            this.key = key;
        }
    }

    int getIndex(int key) {
        // nodes has length 10009, which is 10011100011001 in binary (32 bits total, the remaining high bits are all 0)
        // To let the high bits of the key's hash also take part in the computation, XOR hashCode with its right shift
        // so that the randomness of both the high and low bits of hashCode shows up in the low 16 bits
        int hash = Integer.hashCode(key);
        hash ^= (hash >>> 16);
        return hash % nodes.length;
    }
}

Author: AC_OIer
Link: https://leetcode-cn.com/problems/design-hashset/solution/yi-ti-san-jie-jian-dan-shu-zu-lian-biao-nj3dg/
Source: LeetCode (力扣)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.

LeetCode 706. Design HashMap

Source: https://leetcode-cn.com/problems/design-hashmap/

Design a hash map (HashMap) without using any built-in hash table library.

Implement the MyHashMap class:

  • MyHashMap() initializes the object with an empty map.
  • void put(int key, int value) inserts a (key, value) pair into the HashMap; if the key already exists, its value is updated.
  • int get(int key) returns the value mapped to the key, or -1 if the map contains no mapping for it.
  • void remove(key) removes the key and its corresponding value if the key exists in the map.

Example:

Input:
["MyHashMap", "put", "put", "get", "get", "put", "get", "remove", "get"]
[[], [1, 1], [2, 2], [1], [3], [2, 1], [2], [2], [2]]
Output:
[null, null, null, 1, -1, null, 1, null, -1]

Explanation:

MyHashMap myHashMap = new MyHashMap();
myHashMap.put(1, 1); // myHashMap is now [[1,1]]
myHashMap.put(2, 2); // myHashMap is now [[1,1], [2,2]]
myHashMap.get(1);    // returns 1, myHashMap is now [[1,1], [2,2]]
myHashMap.get(3);    // returns -1 (not found), myHashMap is now [[1,1], [2,2]]
myHashMap.put(2, 1); // myHashMap is now [[1,1], [2,1]] (update the existing value)
myHashMap.get(2);    // returns 1, myHashMap is now [[1,1], [2,1]]
myHashMap.remove(2); // remove the mapping for key 2, myHashMap is now [[1,1]]
myHashMap.get(2);    // returns -1 (not found), myHashMap is now [[1,1]]
 

Constraints:

  • 0 <= key, value <= 10^6
  • put, get, and remove are called at most 10^4 times in total.

Problem-Solving Ideas

This problem is similar to the previous one, and the same idea applies. The change needed is to store key-value pairs instead of single values and to use the key as the retrieval condition, as illustrated after the omitted code below.
(A small class is needed to encapsulate the key-value pair.)
(code omitted)
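As a rough illustration of that change (not the original author's code; Entry and getFromBucket are hypothetical names), the stored node and a bucket lookup might look like this, with put/get delegating to the same chaining logic as before:

// Hypothetical key-value node for the chaining buckets of MyHashMap.
class Entry {
    int key;
    int value;
    Entry next;

    Entry(int key, int value, Entry next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }

    // Lookup inside one bucket: the key is the retrieval condition, the value is returned.
    static int getFromBucket(Entry head, int key) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key == key) return e.value;
        }
        return -1;   // the problem requires -1 when the key is absent
    }
}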

Applications: Applying the Hash Idea

LeetCode Interview Question 16.25. LRU Cache

Source: https://leetcode-cn.com/problems/lru-cache-lcci/

Design and build a "least recently used" cache that evicts the least recently used item. The cache should map keys to values (allowing you to insert and retrieve values for specific keys) and has a maximum capacity specified at initialization. When the cache fills up, it should evict the least recently used item.

It should support the following operations: getting data get and writing data put.

  • Get data get(key) - Get the value of the key (always positive) if the key exists in the cache, otherwise return -1.
  • write data put(key, value) - If the key does not exist, write its data value. When the cache capacity reaches its upper limit, it should delete the least recently used data value before writing new data, thus making room for new data values.

Example:

LRUCache cache = new LRUCache( 2 /* cache capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns  1
cache.put(3, 3);    // this operation invalidates key 2
cache.get(2);       // returns -1 (not found)
cache.put(4, 4);    // this operation invalidates key 1
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns  3
cache.get(4);       // returns  4

Problem-Solving Ideas

First, understand LRU. LRU is a cache-replacement strategy: some rule decides what to keep and what to discard, and that rule is the LRU caching policy.
For this problem the rule is:

  • When elements must be evicted, remove one element from the head
  • When an element is accessed, move it to the tail
  • When adding an element, append it to the tail, and evict if the total exceeds the capacity

My first thought was that the stored elements form a queue, and that this queue should be implemented with a linked list so elements can be moved easily.
The second consideration is how to look up elements conveniently and quickly.
Fast lookup suggests array-style indexing, which should bring hashing to mind; combined with the linked list, this problem is solved with a hash linked list.

Hash linked list: the underlying data structure is a linked list, plus a hash table for looking up its nodes.

In addition, to move an arbitrary node in this problem, a doubly linked list is needed so that both neighbours of any node can be reached and re-spliced conveniently.

(This hash linked list has to be built by hand.)
(To simplify things, I use HashMap as the hash table that indexes the linked-list nodes.)

When building the linked list, hold references to both the head and the tail so that removals and additions are easy. Prepare 3 operations for this: delete at the head, remove a specific node, and insert at the tail (a dummy head and a dummy tail make insertion and deletion more convenient).
(Each deletion and addition must also be carried out in the hash table.)

(An additional note: when a key already exists, reassigning its value counts as a use, so the element-access logic must be executed once as well.)

(Overall, this problem mainly tests linked-list knowledge and the application of what was learned in this lesson.)

Sample Code


import java.util.HashMap;

class LRUCache {

    private HashMap<Integer, Node> nodeMap;
    private int size;
    private Node linkNodeHead;
    private Node linkNodeEnd;
    private int linkNodeLen;

    public LRUCache(int capacity) {
        nodeMap = new HashMap<Integer, Node>(capacity);
        size = capacity;
        linkNodeHead = new Node();      // dummy head: most recently used end
        linkNodeHead.key = 100;
        linkNodeEnd = new Node();       // dummy tail: least recently used end
        linkNodeEnd.key = -200;
        linkNodeLen = 0;
        linkNodeHead.next = linkNodeEnd;
        linkNodeEnd.previous = linkNodeHead;
    }

    public int get(int key) {
        Node wantGet = nodeMap.get(key);
        //System.out.println("g"+key);

        if (wantGet == null) {
            return -1;
        } else {
            wantGet.removeNode();
            //System.out.println("g--:"+wantGet.key);

            linkNodeHead.insertNext(wantGet);   // accessing a node moves it to the front

            return wantGet.value;
        }
    }

    public void put(int key, int value) {
        Node wantGet = nodeMap.get(key);
        if (wantGet != null) {
            wantGet.value = value;
            nodeMap.put(key, wantGet);
            get(key);                           // reassigning an existing key counts as a use
            return;
        }

        if (linkNodeLen >= size) {
            int deleteKey = linkNodeEnd.previous.key;   // evict the least recently used node
            nodeMap.remove(deleteKey);
            //System.out.println("d-:"+linkNodeEnd.previous.key);
            linkNodeEnd.previous.removeNode();
            //System.out.println("d---:"+linkNodeEnd.previous.key);
        } else {
            linkNodeLen++;
        }

        Node wantSet = new Node();
        wantSet.key = key;
        wantSet.value = value;

        nodeMap.put(key, wantSet);

        linkNodeHead.insertNext(wantSet);

        // if(linkNodeLen == 2){
        //     System.out.println("n--:"+linkNodeEnd.previous.key);
        // }
    }
}

/**
 * Your LRUCache object will be instantiated and called as such:
 * LRUCache obj = new LRUCache(capacity);
 * int param_1 = obj.get(key);
 * obj.put(key,value);
 */

class Node {

    int key;
    int value;
    Node next;
    Node previous;

    public Node() {
    }

    public void removeNode() {
        if (previous != null) {
            previous.next = next;
        }
        if (next != null) {
            next.previous = previous;
        }
        previous = null;
        next = null;
    }

    public void insertNext(Node set) {
        if (next != null)
            next.previous = set;
        set.next = next;

        set.previous = this;
        next = set;
    }

}


(happiness)

Runtime: 26 ms, beating 100.00% of all Java submissions
Memory: 46.4 MB, beating 69.67% of all Java submissions

LeetCode 535. Encode and Decode TinyURL

Link: https://leetcode-cn.com/problems/encode-and-decode-tinyurl

TinyURL is a URL simplification service, for example: when you enter a URL https://leetcode.com/problems/design-tinyurl, it will return a simplified URL http://tinyurl.com/4e9iAk.

Requirement: design an encode method and a decode method for TinyURL. There is no restriction on how your algorithms work; you only need to guarantee that a URL can be encoded into a TinyURL and that the TinyURL can be decoded back into the original URL.

Problem-Solving Ideas

This problem asks for a mapping from short URLs back to original URLs, with a hash table managing the key-value pairs. One problem remains to be solved: how to generate the short URL.

Converting the original URL into a short URL can therefore be understood as designing a hash operation yourself.

An ordinary hash operation produces an integer, which can be viewed as base 10; the string to be generated here can be viewed as base 62 (each character position has 62 possible values: upper case, lower case, and digits).

But! This problem does not require the data to be managed within a fixed space, so it is really about an encoding scheme that cannot allow any collisions at all, which is different from hash rules, which merely try to avoid collisions. (Although the logic of the encoding is likewise to avoid collisions as far as possible.)

So from the problem's point of view, as long as the limited key space can represent more URLs than the test cases will ever use, it is enough.

Therefore the proposed scheme is to count in base 62, allocating one slot (one short key) per URL.
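The sample code below (from the official solution) uses random keys instead; as a rough sketch of the base-62 counter idea described here (my own illustration, with hypothetical names), encoding could look like this:

import java.util.HashMap;

// Hypothetical base-62 counter encoder: each new URL gets the next counter value,
// written in base 62, so collisions are impossible by construction.
class CounterCodec {
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private final HashMap<String, String> keyToUrl = new HashMap<>();
    private long counter = 1;

    public String encode(String longUrl) {
        long n = counter++;
        StringBuilder sb = new StringBuilder();
        while (n > 0) {                     // convert the counter to base 62
            sb.append(ALPHABET.charAt((int) (n % 62)));
            n /= 62;
        }
        String key = sb.reverse().toString();
        keyToUrl.put(key, longUrl);
        return "http://tinyurl.com/" + key;
    }

    public String decode(String shortUrl) {
        return keyToUrl.get(shortUrl.replace("http://tinyurl.com/", ""));
    }
}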

Sample Code

public class Codec {

    String alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    HashMap<String, String> map = new HashMap<>();
    Random rand = new Random();
    String key = getRand();

    public String getRand() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            sb.append(alphabet.charAt(rand.nextInt(62)));
        }
        return sb.toString();
    }

    public String encode(String longUrl) {
        while (map.containsKey(key)) {
            key = getRand();
        }
        map.put(key, longUrl);
        return "http://tinyurl.com/" + key;
    }

    public String decode(String shortUrl) {
        return map.get(shortUrl.replace("http://tinyurl.com/", ""));
    }
}

Author: LeetCode
Link: https://leetcode-cn.com/problems/encode-and-decode-tinyurl/solution/tinyurlde-jia-mi-yu-jie-mi-by-leetcode/
Source: LeetCode (力扣)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.

LeetCode 187. Repeated DNA Sequences

Link: https://leetcode-cn.com/problems/repeated-dna-sequences

All DNA consists of a series of nucleotides abbreviated 'A', 'C', 'G' and 'T', eg: "ACGAATTCCG". When studying DNA, it can sometimes be very helpful to identify repetitive sequences in DNA.

Write a function to find all target substrings of length 10 that occur more than once in the DNA string s.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Problem-Solving Ideas

This problem asks for the substrings that occur more than once. Use a hash table: for each length-10 substring, check whether it has already been seen; if so, increment its counter, and the first time a counter goes above 1, add the substring to the result set.

(A ready-made hash table, HashMap, is used to solve this one.)

The substrings are generated by sliding over the whole string.

(code omitted)
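The code is omitted in the original; a minimal HashMap-based sketch of the idea described above (my own illustration) might be:

import java.util.*;

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        Map<String, Integer> count = new HashMap<>();
        // Slide a window of length 10 over the string and count each substring.
        for (int i = 0; i + 10 <= s.length(); i++) {
            String sub = s.substring(i, i + 10);
            int c = count.getOrDefault(sub, 0) + 1;
            count.put(sub, c);
            if (c == 2) res.add(sub);   // add exactly once, the first time it repeats
        }
        return res;
    }
}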

LeetCode 318. Maximum Product of Word Lengths

Link: https://leetcode-cn.com/problems/maximum-product-of-word-lengths

Given a string array words, find the maximum value of length(word[i]) * length(word[j]) such that the two words share no common letters. You may assume each word contains only lowercase letters. If no such pair of words exists, return 0.

Example 1:
Input: ["abcw","baz","foo","bar","xtfn","abcdef"]
Output: 16
Explanation: the two words are "abcw" and "xtfn".

Example 2:
Input: ["a","ab","abc","d","cd","bcd","abcd"]
Output: 4
Explanation: the two words are "ab" and "cd".

Problem-Solving Ideas

This problem can only be solved by checking pairs exhaustively, but the pairs considered must share no letters.

The key idea is how to test that a pair of words has no letters in common: map one word's letters into an array of length 26 (setting the slot for each letter to 1), then let the other word try to pass this filter; the rule for passing is that none of its characters may hit a slot set to 1, and if it passes, compute the product of the lengths.

(code omitted)
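The code is omitted in the original; a minimal sketch of the 26-slot filter idea (my own illustration) could be:

class Solution {
    public int maxProduct(String[] words) {
        int n = words.length, best = 0;
        for (int i = 0; i < n; i++) {
            boolean[] used = new boolean[26];                   // letters of words[i]
            for (char c : words[i].toCharArray()) used[c - 'a'] = true;
            for (int j = i + 1; j < n; j++) {
                boolean ok = true;
                for (char c : words[j].toCharArray()) {
                    if (used[c - 'a']) { ok = false; break; }   // shared letter: pair fails
                }
                if (ok) best = Math.max(best, words[i].length() * words[j].length());
            }
        }
        return best;
    }
}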

Thinking Exercises

The following content is unrelated to the course itself; it is for practicing thinking models. Look into it if you are interested.

LeetCode 240. Search a 2D Matrix II

Link: https://leetcode-cn.com/problems/search-a-2d-matrix-ii

Write an efficient algorithm that searches for a target value target in an m x n matrix. This matrix has the following properties:

  • The elements of each row are arranged in ascending order from left to right.
  • The elements of each column are arranged in ascending order from top to bottom.
For example:
1 ,4 ,7 ,11,15
2 ,5 ,8 ,12,19
3 ,6 ,9 ,16,22
10,13,14,17,24
18,21,23,26,30  

Problem-Solving Ideas

First grasp the matrix's property: values increase along every row and every column. So start by taking hold of the top-right corner and the bottom-left corner.

Second, any sub-rectangle cut out of the matrix still satisfies the same property.

Based on these two points, a recursive query rule can be designed with two pointers, one starting from the top-right corner and one from the bottom-left corner (they can run independently or in lockstep; the two do not conflict):

  • For the pointer starting from the top-right corner: if the target is smaller than the pointer's value, move one step left; if it is larger, move one step down.
  • For the pointer starting from the bottom-left corner: if the target is larger than the pointer's value, move one step right;
    if it is smaller, move one step up.

If either pointer reaches a value equal to the target, the recursion ends and the query is complete.
(A single pointer would also work, but two pointers balance the performance more evenly.)

(code omitted)
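The code is omitted in the original; a minimal sketch using just the top-right pointer (my own illustration; as noted above, one pointer already suffices) might be:

class Solution {
    public boolean searchMatrix(int[][] matrix, int target) {
        if (matrix.length == 0 || matrix[0].length == 0) return false;
        int row = 0, col = matrix[0].length - 1;      // start at the top-right corner
        while (row < matrix.length && col >= 0) {
            int v = matrix[row][col];
            if (v == target) return true;
            if (target < v) col--;                    // target is smaller: move one step left
            else row++;                               // target is larger: move one step down
        }
        return false;
    }
}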

LeetCode 979. Distribute Coins in Binary Tree

Link: https://leetcode-cn.com/problems/distribute-coins-in-binary-tree

Given the root node root of a binary tree with N nodes, each node in the tree corresponds to node.val coins, and there are N coins in total.

In one move, we may pick two adjacent nodes and move one coin from one to the other (a move may go from parent to child, or from child to parent).

Return the number of moves required so that every node ends up with exactly one coin.

Example 4:
(figure omitted)

Input: [1,0,0,null,3]
Output: 4

Problem-Solving Ideas

Note one condition of the problem: the total number of nodes and the total number of coins are both N.

The problem asks for the minimum number of coin moves. To achieve the minimum, coin movement across any edge must be one-way (if coins crossed an edge in both directions, moving only the difference would be cheaper).

Now consider a small area: when a subtree has 5 nodes but only 3 coins, at least the difference (2 coins) must be provided from outside, and those coins must pass through the edge to the subtree's parent (conversely, a surplus must be sent out through the same edge).

Finally, by this rule, count for every node how many coins it must provide to or receive from its parent; the sum of these amounts is the answer.

(code omitted)
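The code is omitted in the original; a minimal DFS sketch of that counting rule (my own illustration, assuming the standard LeetCode TreeNode class) might be:

class Solution {
    private int moves = 0;

    public int distributeCoins(TreeNode root) {
        surplus(root);
        return moves;
    }

    // Returns the coin surplus (possibly negative) that this subtree pushes through
    // the edge to its parent; each coin crossing an edge costs one move.
    private int surplus(TreeNode node) {
        if (node == null) return 0;
        int left = surplus(node.left);
        int right = surplus(node.right);
        moves += Math.abs(left) + Math.abs(right);   // coins crossing the two child edges
        return node.val + left + right - 1;          // keep one coin here, pass on the rest
    }
}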

LeetCode 430. Flatten a Multilevel Doubly Linked List

Link: https://leetcode-cn.com/problems/flatten-a-multilevel-doubly-linked-list

In a multilevel doubly linked list, each node has, in addition to the next and previous pointers, a child pointer that may point to a separate doubly linked list. These child lists may in turn have one or more children of their own, and so on, producing the multilevel data structure shown in the example below.

Given the head node at the first level of the list, please flatten the list so that all nodes appear in a single-level double-linked list.

Example of the original linked list:
(figure omitted)

Result linked list:
(figure omitted)

Problem-Solving Ideas

The requirement of this question is to convert the multi-level linked list into a single-level linked list.

So we need to traverse the list; whenever a branch (a child list) occurs, the rest of the current list must be reconnected to the tail of that branch once it has been flattened.

Design a function whose input is the head of a list and which returns the flattened list (here the head is returned and the tail is found by walking): when a branch occurs, pass the branch's head into the function, splice the result in after the current node, walk to its tail, and connect the remaining nodes there.

(A simple recursive function, but handling the pointer fields is fiddly and easy to get wrong.)

Sample Code


/*
// Definition for a Node.
class Node {
    public int val;
    public Node prev;
    public Node next;
    public Node child;
};
*/

class Solution {

    public Node flatten(Node head) {
        Node p = head, q, k;
        while (p != null) {
            if (p.child != null) {
                q = p.next;                  // remember the rest of the current level

                k = flatten(p.child);        // recursively flatten the child list
                p.child = null;
                p.next = k;

                k.prev = p;
                while (k.next != null) k = k.next;   // walk to the tail of the flattened child list
                k.next = q;                  // reconnect the remainder after that tail
                if (q != null) q.prev = k;
            }
            p = p.next;
        }
        return head;
    }
}

LeetCode 863. All Nodes Distance K in Binary Tree

Link: https://leetcode-cn.com/problems/all-nodes-distance-k-in-binary-tree

Given a binary tree (with root node root), a target node target, and an integer value K.

Return a list of the values of all nodes that are at distance K from the target node. The answer can be returned in any order.

Example 1:

Input: root = [3,5,1,6,2,0,8,null,null,7,4], target = 5, K = 2
Output: [7,4,1]
Explanation:
The requested nodes are those at distance 2 from the target node (value 5),
namely the nodes with values 7, 4, and 1.

Note: the "root" and "target" in the input are actually nodes in the tree;
the input above is only a serialized description of these objects.

Constraints:

  • The given tree is non-empty.
  • Each node on the tree has a unique value 0 <= node.val <= 500 .
  • The target node target is a node on the tree.
  • 0 <= K <= 1000.

Problem-Solving Ideas

First locate the target node, then consider several sub-problems.
(Structured thinking is needed to break a complex problem down, and there are two ways to structure this one.)
(The first way is to discuss the downward search first, and then the upward search.)
(The second way is to first assume K is 1 and sketch which nodes lie at that distance, then gradually increase K and look for the pattern.)

Think about this first:

When the distance is 1: search down 1 level and up 1 level;
when the distance is 2: search down 2, up 2, and up 1 then down 1;
......
when the distance is K: search up N levels and then down (K-N), with N ranging from 0 to K.

Therefore two functions are needed: one is only responsible for finding all nodes at a given distance (K-N) downward, and the other is responsible for finding the ancestor N levels upward.

(Note that after going up and then turning down, you must switch to the other subtree, otherwise paths would be counted twice.)

The final steps are:

  • Prepare a function that searches for nodes below a given node; the search recurses downward to a depth equal to the remaining distance.
  • Find the nodes above the target node; a rule is needed for this:
  • First find the ancestor N levels up, switch to that ancestor's other subtree, search downward for nodes at the remaining distance, and store what is found in the result set.
  • Then let N range from 0 to K (0 is included and means only searching downward; when N is 0 there is no subtree to switch).

(My approach when solving was to run through method two first to understand how the pattern expands, and then use method one to design the node search. This problem is not directly related to the course.)

(There are many solutions to this question, and those who are interested can design a different solution)

(The solution I designed is as follows:)

  • Prepare a function that traverses the binary tree, always keeping track of the current node. When the target node is found, backtrack, storing each ancestor level in turn in a "start" array (at most k+1 entries) and recording, for each stored node, whether the path descended into its left or right subtree.
  • Prepare a class that encapsulates what this array needs: the node plus a left/right flag.
  • Prepare a search function whose inputs are a node and a depth; when a node at the target depth is reached, its value is stored in the result set.
  • Traverse the start array, choosing the search depth from the index and choosing the root of the other subtree as input according to the left/right flag.
  • Loop over the array and return the result set when the loop ends.

Sample Code

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
class Solution {

    private int k;
    private int len;

    private Node[] startArr;

    private ArrayList<Integer> res;

    public List<Integer> distanceK(TreeNode root, TreeNode target, int k) {
        //int wantVal = target.val;
        len = 0;
        this.k = k;
        startArr = new Node[k + 1];
        res = new ArrayList<Integer>();

        if (!findTarget(root, target)) {
            //System.out.println("---");
            return res;
        }
        //System.out.println(len);
        for (int i = 0; i < len; i++) {
            TreeNode start = startArr[i].node;
            if (i == 0 || i == k)
                findDeep(start, k - i);
            else {
                if (startArr[i].isLeft)
                    start = start.right;     // switch to the other subtree before going down
                else
                    start = start.left;
                findDeep(start, k - i - 1);
            }

        }

        return res;
    }

    private void findDeep(TreeNode root, int deep) {
        if (root == null) return;
        if (deep <= 0) {
            res.add(root.val);
            return;
        }

        deep--;
        findDeep(root.left, deep);
        findDeep(root.right, deep);
    }

    private Boolean findTarget(TreeNode root, TreeNode target) {
        int wantVal = target.val;
        if (root.val == wantVal) {
            // the target itself; the 'true' flag stored here is meaningless
            startArr[len++] = new Node(root, true);
            return true;
        }

        // the target can only come from one branch
        if (root.left != null) {
            if (findTarget(root.left, target)) {
                if (len < k + 1)
                    startArr[len++] = new Node(root, true);
                return true;
            }
        }
        if (root.right != null) {
            if (findTarget(root.right, target)) {
                if (len < k + 1)
                    startArr[len++] = new Node(root, false);
                return true;
            }
        }
        return false;
    }

}
class Node {

    TreeNode node;
    Boolean isLeft;

    public Node(TreeNode node, Boolean isLeft) {
        this.node = node;
        this.isLeft = isLeft;
    }
}

(nice, very fast)

Runtime: 13 ms, beating 100.00% of all Java submissions
Memory: 38.7 MB, beating 22.00% of all Java submissions

Epilogue

This time, following the video through the basics took about 2 hours and writing up the notes took 0.5 hours; following the video through the exercises took about 4 hours, and working out the problem-solving ideas and code took 9 hours.

The total video length is 5 hours, so the total this time is 15.5 hours, roughly 3x the video length.

Clearly, problem solving takes me a long time, and I skipped writing out the code for half of the problems, so I obviously need to keep practicing.

I have simplified the learning procedure down to 4 steps, and I think I completed this learning task to a good standard.

Along the way I also tried stepping outside my current learning model to test other strategies, and found that learning efficiency is roughly proportional to my own thinking time, with videos helping to make that thinking more efficient. So I will keep following the current 4-step method.

(In addition, solving takes long because my analysis of each problem is not thorough enough and my staged development leans too heavily on the debugger. That may just be my own limitation, and I expect it to improve with training.)

ε≡٩(๑>₃<)۶ dedicated to learning


Origin blog.csdn.net/ex_xyz/article/details/118565411