力扣692. 前K个高频单词

https://leetcode.cn/problems/top-k-frequent-words/description/

方法一：优先队列

思路：对于前k个最大或最小最容易联想到之前学过的topk问题，用优先级队列解决。(升序大根堆，降序小根堆)本题要求按频率从高到低排序，所以采用小根堆。

哈希表中存的键值对映射关系插入到小根堆中，

(这里需要注意：要自己创建一个比较器Comparator，设置比较规则compare，规则是单词频率升序并且字母顺序降序)

如果小根堆的大小超过k个，就弹出堆顶元素(最小元素)，这样小根堆中剩下的就是k个最大的。然后用一个list(String)接收这k个元素直到小根堆弹空为止，这时list中存的单词是按出现频率升序并且字母顺序降序排的，再逆置list即可。

//方法1：小根堆
    class Solution {
    
    
        public List<String> topKFrequent(String[] words, int k) {
    
    
            HashMap<String,Integer> map=new HashMap<>();
            //用一个哈希表记录每一个字符串出现的频率
            for(int i=0;i<words.length;i++){
    
    
                String str=words[i];
                map.put(str,map.getOrDefault(str,0)+1);
            }
            //小堆里面存String降序，Integer升序，最后弹出堆顶就是符合题意的
            PriorityQueue<Map.Entry<String,Integer>> pq=new PriorityQueue<Map.Entry<String,Integer>> (new Comparator<Map.Entry<String,Integer>>(){
    
    
                public int compare(Map.Entry<String,Integer> entry1,Map.Entry<String,Integer> entry2){
    
    
                    return entry1.getValue()==entry2.getValue()?entry2.getKey().compareTo(entry1.getKey()):entry1.getValue()-entry2.getValue();
                }
            });
            List<String> list=new ArrayList<>();
            for(Map.Entry<String,Integer> entry:map.entrySet()){
    
    
                pq.add(entry);
                if(pq.size()>k){
    
    
                    pq.poll();
                }
            }
            while(!pq.isEmpty()){
    
    
                list.add(pq.poll().getKey());
            }
            Collections.reverse(list);
            return list;
        }
    }

复杂度分析

时间复杂度：O(l×n+m×l×logk)，其中 n表示给定字符串序列的长度，m 表示实际字符串种类数，l表示字符串的平均长度。我们需要 l ×n 的时间将字符串插入到哈希表中，以及每次插入元素到优先队列中都需要 l×logk 的时间，共需要插入 m次。

空间复杂度：O(l×(m+k))，其中 l 表示字符串的平均长度，m 表示实际字符串种类数。哈希表空间占用为 O(l×m)，优先队列空间占用为 O(l×k)。

方法二：哈希表+排序

思路：利用哈希表记录每一个字符串出现的频率，然后将哈希表中所有字符串进行排序，排序时，如果单词频率相同，按字母升序排，单词频率不同，按单词频率降序排。

最后返回list中前k个字符串。

//方法2：排序
    public List<String> topKFrequent(String[] words, int k) {
    
    
        Map<String, Integer> cnt = new HashMap<String, Integer>();
        for (String word : words) {
    
    
            cnt.put(word, cnt.getOrDefault(word, 0) + 1);
        }
        List<String> rec = new ArrayList<String>();
        for (Map.Entry<String, Integer> entry : cnt.entrySet()) {
    
    
            rec.add(entry.getKey());
        }
        Collections.sort(rec, new Comparator<String>() {
    
    
            public int compare(String word1, String word2) {
    
    
                return cnt.get(word1) == cnt.get(word2) ? word1.compareTo(word2) : cnt.get(word2) - cnt.get(word1);
            }
        });
        return rec.subList(0, k);
    }

    //匿名内部类，Map.Entry获取键值对
}

复杂度分析

时间复杂度：O(l×n+l×m×logm)，其中 n 表示给定字符串序列的长度，l 表示字符串的平均长度，m 表示实际字符串种类数。我们需要 l×n 的时间将字符串插入到哈希表中，以及l×m×logm 的时间完成字符串比较（最坏情况下所有字符串出现频率都相同，我们需要将它们两两比较）。(mlogm用快排或归并按单词频率排，如果频率相同，内部嵌套循环根据字母顺序排)

空间复杂度：O(l×m)，其中 l 表示字符串的平均长度，m 表示实际字符串种类数。哈希表和生成的排序数组空间占用均为 O(l×m)。

总结：方法一和方法二都是先把单词和它出现的频率保存到HashMap中，然后方法一用堆排序，方法二用Collections.sort()。排序中都需要要自己创建一个比较器Comparator，设置比较规则compare。

哈希表模拟实现

// key-value 模型
public class HashBucket {
    
    
    private static class Node {
    
    
        private int key;
        private int value;
        Node next;
        public Node(int key, int value) {
    
    
            this.key = key;
            this.value = value;
        }
    }
    //usedSize是数组+链表这个结构里面所有数据元素的个数
    private Node[]  array;
    private int size;   // 当前的数据个数
    private static final double LOAD_FACTOR = 0.75;
    private static final int DEFAULT_SIZE = 8;//默认桶的大小
    //放元素
    //从链表第一个结点开始往后找，找到空为止，找有没有key相同的，有就更新value。
    public void put(int key, int value) {
    
    
        int index=key%array.length;
        Node cur=array[index];
        while(cur!=null)
        {
    
    
            if(cur.key==key){
    
    
                cur.value=value;
                return;
            }
            cur=cur.next;
        }
        //JDK1.7是头插法，JDK1.8开始是尾插法
        //这里采用头插
        Node node=new Node(key,value);
        node.next=array[index];
        array[index]=node;
        size++;
        //检查一下有没有超过负载量
        if(loadFactor()>0.75){
    
    
            //扩容
            resize();
        }
    }
    //扩容需要重新计算下标
    private void resize() {
    
    
        Node[] newArray=new Node[array.length*2];
        for(int i=0;i<array.length;i++){
    
    
            Node cur=array[i];
            while(cur!=null){
    
    
                Node curNext=cur.next;
                int index=cur.key% newArray.length;
                cur.next=newArray[index];
                newArray[index]=cur;
                cur=curNext;
            }
        }
        array=newArray;
    }
    private double loadFactor() {
    
    
        return size * 1.0 / array.length;
    }
    public HashBucket() {
    
    
        array=new Node[8];
    }
    //查找
    public int get(int key) {
    
    
        int index=key%array.length;
        Node cur=array[index];
        while(cur!=null){
    
    
            if(cur.key==key){
    
    
                return cur.value;
            }
            cur=cur.next;
        }
        return -1;
    }
}

面试问题1：

HashMap的扩容问题？

注意要重新计算每个元素的下标。

哈希表模拟(泛型)

public class HashBuck2 <K,V>{
    
    
    static class Node<K,V>{
    
    
        public K key;
        public  V val;
        public Node<K,V> next;

        public Node(K key, V val) {
    
    
            this.key = key;
            this.val = val;
        }
    }

    public Node<K,V>[] array=(Node<K,V>[])new Node[10];
    public int usedSize;
    public void put(K key,V val){
    
    
        //泛型类型不能直接%长度，可以计算他的哈希码转化成一个整数，引出哈希值,hashcode在object里
        int hash=key.hashCode();
        int index=hash%array.length;
        Node<K,V> cur=array[index];
        while(cur!=null)
        {
    
    
            if(cur.key.equals(key)){
    
    
                cur.val=val;
                return;
            }
            cur=cur.next;
        }
        Node<K,V> node=new Node<>(key,val);
        node.next=array[index];
        array[index]=node;
        usedSize++;
    }

    public V get(K key) {
    
    
        int hash=key.hashCode();
        int index=hash%array.length;
        Node<K,V> cur=array[index];
        while(cur!=null){
    
    
            if(cur.key.equals(key)){
    
    
                return cur.val;
            }
            cur=cur.next;
        }
        return null;
    }
}