缓存算法（内存回收算法）- LRU、FIFO、LFU

题目链接：https://oj.leetcode.com/problems/lru-cache/

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and set.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
set(key, value) - Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

题目大意：为LRU Cache设计一个数据结构，它支持两个操作：

　　　1）get(key)：如果key在cache中，则返回对应的value值，否则返回-1

　　　2）set(key,value):如果key不在cache中，则将该(key,value)插入cache中（注意，如果cache已满，则必须把最近最久未使用的元素从cache中删除）；如果key在cache中，则重置value的值。

解题思路：题目让设计一个LRU Cache，即根据LRU算法设计一个缓存。在这之前需要弄清楚LRU算法的核心思想，LRU全称是Least

Recently Used，即最近最久未使用的意思。在操作系统的内存管理中，有一类很重要的算法就是内存页面置换算法（包括FIFO，LRU,LFU等几种常见页面置换算法）。事实上，Cache算法和内存页面置换算法的核心思想是一样的：都是在给定一个限定大小的空间的前提下，设计一个原则如何来更新和访问其中的元素。下面说一下LRU算法的核心思想，LRU算法的设计原则是：如果一个数据在最近一段时间没有被访问到，那么在将来它被访问的可能性也很小。也就是说，当限定的空间已存满数据时，应当把最久没有被访问到的数据淘汰。

　　而用什么数据结构来实现LRU算法呢？可能大多数人都会想到：用一个数组来存储数据，给每一个数据项标记一个访问时间戳，每次插入新数据项的时候，先把数组中存在的数据项的时间戳自增，并将新数据项的时间戳置为0并插入到数组中。每次访问数组中的数据项的时候，将被访问的数据项的时间戳置为0。当数组空间已满时，将时间戳最大的数据项淘汰。

　　这种实现思路很简单，但是有什么缺陷呢？需要不停地维护数据项的访问时间戳，另外，在插入数据、删除数据以及访问数据时，时间复杂度都是O(n)。

　　那么有没有更好的实现办法呢？

　　那就是利用链表和hashmap。当需要插入新的数据项的时候，如果新数据项在链表中存在（一般称为命中），则把该节点移到链表头部，如果不存在，则新建一个节点，放到链表头部，若缓存满了，则把链表最后一个节点删除即可。在访问数据的时候，如果数据项在链表中存在，则把该节点移到链表头部，否则返回-1。这样一来在链表尾部的节点就是最近最久未访问的数据项。

　　总结一下：根据题目的要求，LRU Cache具备的操作：

　　1）set(key,value)：如果key在hashmap中存在，则先重置对应的value值，然后获取对应的节点cur，将cur节点从链表删除，并移动到链表的头部；若果key在hashmap不存在，则新建一个节点，并将节点放到链表的头部。当Cache存满的时候，将链表最后一个节点删除即可。

　　2）get(key)：如果key在hashmap中存在，则把对应的节点放到链表头部，并返回对应的value值；如果不存在，则返回-1。

　　仔细分析一下，如果在这地方利用单链表和hashmap，在set和get中，都有一个相同的操作就是将在命中的节点移到链表头部，如果按照传统的遍历办法来删除节点可以达到题目的要求么？第二，在删除链表末尾节点的时候，必须遍历链表，然后将末尾节点删除，这个能达到题目的时间要求么？

　　试一下便知结果：

　　第一个版本实现：

 
          #include <iostream> 
         
          #include <map> 
         
          #include <algorithm> 
         
          using  
          namespace  
          std; 
         
          struct  
          Node 
         
          { 
         
          int  
          key; 
         
          int  
          value; 
         
          Node *next; 
         
          }; 
         
          class  
          LRUCache{ 
         
          private 
          : 
         
          int  
          count; 
         
          int  
          size ; 
         
          map< 
          int 
          ,Node *> mp; 
         
          Node *cacheList; 
         
          public 
          : 
         
          LRUCache( 
          int  
          capacity) { 
         
          size = capacity; 
         
          cacheList = NULL; 
         
          count = 0; 
         
          } 
         
          int  
          get( 
          int  
          key) { 
         
          if 
          (cacheList==NULL) 
         
          return  
          -1; 
         
          map< 
          int 
          ,Node *>::iterator it=mp.find(key); 
         
          if 
          (it==mp.end())   
          //如果在Cache中不存在该key， 则返回-1 
         
          { 
         
          return  
          -1;  
         
          } 
         
          else 
         
          { 
         
          Node *p = it->second;    
         
          pushFront(p);     
          //将节点p置于链表头部 
         
          } 
         
          return  
          cacheList->value;   
         
          } 
         
          void  
          set( 
          int  
          key,  
          int  
          value) { 
         
          if 
          (cacheList==NULL)    
          //如果链表为空，直接放在链表头部 
         
          { 
         
          cacheList = (Node *) 
          malloc 
          ( 
          sizeof 
          (Node)); 
         
          cacheList->key = key; 
         
          cacheList->value = value; 
         
          cacheList->next = NULL; 
         
          mp[key] = cacheList; 
         
          count++; 
         
          } 
         
          else    
          //否则，在map中查找 
         
          { 
         
          map< 
          int 
          ,Node *>::iterator it=mp.find(key); 
         
          if 
          (it==mp.end())    
          //没有命中 
         
          { 
         
          if 
          (count == size)   
          //cache满了 
         
          { 
         
          Node * p = cacheList; 
         
          Node *pre = p; 
         
          while 
          (p->next!=NULL) 
         
          { 
         
          pre = p; 
         
          p= p->next;  
         
          } 
         
          mp.erase(p->key); 
         
          count--; 
         
          if 
          (pre==p)          
          //说明只有一个节点 
         
          p=NULL; 
         
          else 
         
          pre->next = NULL; 
         
          free 
          (p); 
         
          } 
         
          Node * newNode = (Node *) 
          malloc 
          ( 
          sizeof 
          (Node));  
         
          newNode->key = key; 
         
          newNode->value = value; 
         
          newNode->next = cacheList; 
         
          cacheList = newNode; 
         
          mp[key] = cacheList; 
         
          count++;    
         
          } 
         
          else 
         
          { 
         
          Node *p = it->second;    
         
          p->value = value; 
         
          pushFront(p); 
         
          } 
         
          } 
         
          } 
         
          void  
          pushFront(Node *cur)    
          //单链表删除节点,并将节点移动链表头部，O(n) 
         
          { 
         
          if 
          (count==1) 
         
          return 
          ; 
         
          if 
          (cur==cacheList) 
         
          return 
          ; 
         
          Node *p = cacheList; 
         
          while 
          (p->next!=cur) 
         
          { 
         
          p=p->next;       
         
          } 
         
          p->next = cur->next;    
          //删除cur节点 
         
          cur->next = cacheList; 
         
          cacheList = cur; 
         
          } 
         
          void  
          printCache(){ 
         
          Node *p = cacheList; 
         
          while 
          (p!=NULL) 
         
          { 
         
          cout<<p->key<< 
          " " 
          ; 
         
          p=p->next; 
         
          } 
         
          cout<<endl; 
         
          } 
         
          }; 
         
          int  
          main( 
          void 
          ) 
         
          { 
         
          /*LRUCache cache(3); 
         
          cache.set(2,10); 
         
          cache.printCache(); 
         
          cache.set(1,11); 
         
          cache.printCache(); 
         
          cache.set(2,12); 
         
          cache.printCache(); 
         
          cache.set(1,13); 
         
          cache.printCache(); 
         
          cache.set(2,14); 
         
          cache.printCache(); 
         
          cache.set(3,15); 
         
          cache.printCache(); 
         
          cache.set(4,100); 
         
          cache.printCache(); 
         
          cout<<cache.get(2)<<endl; 
         
          cache.printCache();*/ 
         
          LRUCache cache(2); 
         
          cout<<cache.get(2)<<endl; 
         
          cache.set(2,6); 
         
          cache.printCache(); 
         
          cout<<cache.get(1)<<endl; 
         
          cache.set(1,5); 
         
          cache.printCache(); 
         
          cache.set(1,2); 
         
          cache.printCache(); 
         
          cout<<cache.get(1)<<endl; 
         
          cout<<cache.get(2)<<endl; 
         
          return  
          0; 
         
          }

　　提交之后，提示超时：

　　因此要对算法进行改进，如果把pushFront时间复杂度改进为O(1)的话是不是就能达到要求呢？

　　但是 在已知要删除的节点的情况下，如何在O(1)时间复杂度内删除节点？

　　如果知道当前节点的前驱节点的话，则在O(1)时间复杂度内删除节点是很容易的。而在无法获取当前节点的前驱节点的情况下，能够实现么？对，可以实现的。

　　具体的可以参照这几篇博文：

　　http://www.cnblogs.com/xwdreamer/archive/2012/04/26/2472102.html

　　http://www.nowamagic.net/librarys/veda/detail/261

　　原理：假如要删除的节点是cur，通过cur可以知道cur节点的后继节点curNext，如果交换cur节点和curNext节点的数据域，然后删除curNext节点（curNext节点是很好删除地），此时便在O(1)时间复杂度内完成了cur节点的删除。

　　改进版本1：

 
          void  
          pushFront(Node *cur)   
          //单链表删除节点,并将节点移动链表头部，O(1) 
         
          { 
         
          if 
          (count==1) 
         
          return 
          ; 
         
          //先删除cur节点 ，再将cur节点移到链表头部 
         
          Node *curNext = cur->next; 
         
          if 
          (curNext==NULL)   
          //如果是最后一个节点 
         
          { 
         
          Node * p = cacheList; 
         
          while 
          (p->next!=cur) 
         
          { 
         
          p=p->next; 
         
          } 
         
          p->next = NULL;  
         
          cur->next = cacheList; 
         
          cacheList = cur; 
         
          } 
         
          else     
          //如果不是最后一个节点 
         
          { 
         
          cur->next = curNext->next;    
         
          int  
          tempKey = cur->key; 
         
          int  
          tempValue = cur->value; 
         
          cur->key = curNext->key; 
         
          cur->value = curNext->value; 
         
          curNext->key = tempKey; 
         
          curNext->value = tempValue; 
         
          curNext->next = cacheList; 
         
          cacheList  = curNext; 
         
          mp[curNext->key] = curNext; 
         
          mp[cur->key] = cur; 
         
          } 
         
          }

　　提交之后，提示Accepted，耗时492ms，达到要求。

　　有没有更好的实现办法，使得删除末尾节点的复杂度也在O(1)？那就是利用双向链表，并提供head指针和tail指针，这样一来，所有的操作都是O(1)时间复杂度。

　　改进版本2：

 
          #include <iostream> 
         
          #include <map> 
         
          #include <algorithm> 
         
          using  
          namespace  
          std; 
         
          struct  
          Node 
         
          { 
         
          int  
          key; 
         
          int  
          value; 
         
          Node *pre; 
         
          Node *next; 
         
          }; 
         
          class  
          LRUCache{ 
         
          private 
          : 
         
          int  
          count; 
         
          int  
          size ; 
         
          map< 
          int 
          ,Node *> mp; 
         
          Node *cacheHead; 
         
          Node *cacheTail; 
         
          public 
          : 
         
          LRUCache( 
          int  
          capacity) { 
         
          size = capacity; 
         
          cacheHead = NULL; 
         
          cacheTail = NULL; 
         
          count = 0; 
         
          } 
         
          int  
          get( 
          int  
          key) { 
         
          if 
          (cacheHead==NULL) 
         
          return  
          -1; 
         
          map< 
          int 
          ,Node *>::iterator it=mp.find(key); 
         
          if 
          (it==mp.end())   
          //如果在Cache中不存在该key， 则返回-1 
         
          { 
         
          return  
          -1;  
         
          } 
         
          else 
         
          { 
         
          Node *p = it->second;    
         
          pushFront(p);     
          //将节点p置于链表头部 
         
          } 
         
          return  
          cacheHead->value;   
         
          } 
         
          void  
          set( 
          int  
          key,  
          int  
          value) { 
         
          if 
          (cacheHead==NULL)    
          //如果链表为空，直接放在链表头部 
         
          { 
         
          cacheHead = (Node *) 
          malloc 
          ( 
          sizeof 
          (Node)); 
         
          cacheHead->key = key; 
         
          cacheHead->value = value; 
         
          cacheHead->pre = NULL; 
         
          cacheHead->next = NULL; 
         
          mp[key] = cacheHead; 
         
          cacheTail = cacheHead; 
         
          count++; 
         
          } 
         
          else    
          //否则，在map中查找 
         
          { 
         
          map< 
          int 
          ,Node *>::iterator it=mp.find(key); 
         
          if 
          (it==mp.end())    
          //没有命中 
         
          { 
         
          if 
          (count == size)   
          //cache满了 
         
          { 
         
          if 
          (cacheHead==cacheTail&&cacheHead!=NULL)   
          //只有一个节点 
         
          { 
         
          mp.erase(cacheHead->key); 
         
          cacheHead->key = key; 
         
          cacheHead->value = value; 
         
          mp[key] = cacheHead; 
         
          } 
         
          else 
         
          { 
         
          Node * p =cacheTail; 
         
          cacheTail->pre->next = cacheTail->next;   
         
          cacheTail = cacheTail->pre; 
         
          mp.erase(p->key); 
         
          p->key= key; 
         
          p->value = value; 
         
          p->next = cacheHead; 
         
          p->pre = cacheHead->pre; 
         
          cacheHead->pre = p; 
         
          cacheHead = p; 
         
          mp[cacheHead->key] = cacheHead; 
         
          } 
         
          } 
         
          else 
         
          { 
         
          Node * p = (Node *) 
          malloc 
          ( 
          sizeof 
          (Node)); 
         
          p->key = key; 
         
          p->value = value; 
         
          p->next = cacheHead; 
         
          p->pre = NULL; 
         
          cacheHead->pre = p; 
         
          cacheHead = p; 
         
          mp[cacheHead->key] = cacheHead; 
         
          count++; 
         
          } 
         
          } 
         
          else 
         
          { 
         
          Node *p = it->second;    
         
          p->value = value; 
         
          pushFront(p); 
         
          } 
         
          } 
         
          } 
         
          void  
          pushFront(Node *cur)    
          //双向链表删除节点,并将节点移动链表头部，O(1) 
         
          { 
         
          if 
          (count==1) 
         
          return 
          ; 
         
          if 
          (cur==cacheHead) 
         
          return 
          ; 
         
          if 
          (cur==cacheTail) 
         
          { 
         
          cacheTail = cur->pre; 
         
          } 
         
          cur->pre->next = cur->next;    
          //删除节点 
         
          if 
          (cur->next!=NULL) 
         
          cur->next->pre = cur->pre; 
         
          cur->next = cacheHead; 
         
          cur->pre = NULL; 
         
          cacheHead->pre = cur; 
         
          cacheHead = cur;    
         
          } 
         
          void  
          printCache(){ 
         
          Node *p = cacheHead; 
         
          while 
          (p!=NULL) 
         
          { 
         
          cout<<p->key<< 
          " " 
          ; 
         
          p=p->next; 
         
          } 
         
          cout<<endl; 
         
          } 
         
          }; 
         
          int  
          main( 
          void 
          ) 
         
          { 
         
          LRUCache cache(3); 
         
          cache.set(1,1); 
         
          //cache.printCache(); 
         
          cache.set(2,2); 
         
          //cache.printCache(); 
         
          cache.set(3,3); 
         
          cache.printCache(); 
         
          cache.set(4,4); 
         
          cache.printCache(); 
         
          cout<<cache.get(4)<<endl; 
         
          cache.printCache(); 
         
          cout<<cache.get(3)<<endl; 
         
          cache.printCache(); 
         
          cout<<cache.get(2)<<endl; 
         
          cache.printCache(); 
         
          cout<<cache.get(1)<<endl; 
         
          cache.printCache(); 
         
          cache.set(5,5); 
         
          cache.printCache(); 
         
          cout<<cache.get(1)<<endl; 
         
          cout<<cache.get(2)<<endl; 
         
          cout<<cache.get(3)<<endl; 
         
          cout<<cache.get(4)<<endl; 
         
          cout<<cache.get(5)<<endl; 
         
          return  
          0; 
         
          }

　　提交测试结果：

　　可以发现，效率有进一步的提升。

　　其实在STL中的list就是一个双向链表，如果希望代码简短点，可以用list来实现：

　　具体实现：

 
          #include <iostream> 
         
          #include <map> 
         
          #include <algorithm> 
         
          #include <list> 
         
          using  
          namespace  
          std; 
         
          struct  
          Node 
         
          { 
         
          int  
          key; 
         
          int  
          value; 
         
          }; 
         
          class  
          LRUCache{ 
         
          private 
          : 
         
          int  
          maxSize ; 
         
          list<Node> cacheList; 
         
          map< 
          int 
          , list<Node>::iterator > mp; 
         
          public 
          : 
         
          LRUCache( 
          int  
          capacity) { 
         
          maxSize = capacity; 
         
          } 
         
          int  
          get( 
          int  
          key) { 
         
          map< 
          int 
          , list<Node>::iterator >::iterator it = mp.find(key); 
         
          if 
          (it==mp.end())       
          //没有命中 
         
          { 
         
          return  
          -1; 
         
          } 
         
          else   
          //在cache中命中了 
         
          { 
         
          list<Node>::iterator listIt = mp[key]; 
         
          Node newNode; 
         
          newNode.key = key; 
         
          newNode.value = listIt->value; 
         
          cacheList.erase(listIt);                
          //先删除命中的节点       
         
          cacheList.push_front(newNode);    
          //将命中的节点放到链表头部 
         
          mp[key] = cacheList.begin(); 
         
          } 
         
          return  
          cacheList.begin()->value; 
         
          } 
         
          void  
          set( 
          int  
          key,  
          int  
          value) { 
         
          map< 
          int 
          , list<Node>::iterator >::iterator it = mp.find(key); 
         
          if 
          (it==mp.end())    
          //没有命中 
         
          { 
         
          if 
          (cacheList.size()==maxSize)   
          //cache满了 
         
          { 
         
          mp.erase(cacheList.back().key);     
         
          cacheList.pop_back(); 
         
          } 
         
          Node newNode; 
         
          newNode.key = key; 
         
          newNode.value = value; 
         
          cacheList.push_front(newNode); 
         
          mp[key] = cacheList.begin(); 
         
          } 
         
          else   
          //命中 
         
          { 
         
          list<Node>::iterator listIt = mp[key]; 
         
          cacheList.erase(listIt);                
          //先删除命中的节点           
         
          Node newNode; 
         
          newNode.key = key; 
         
          newNode.value = value; 
         
          cacheList.push_front(newNode);    
          //将命中的节点放到链表头部 
         
          mp[key] = cacheList.begin(); 
         
          } 
         
          } 
         
          }; 
         
          int  
          main( 
          void 
          ) 
         
          { 
         
          LRUCache cache(3); 
         
          cache.set(1,1); 
         
          cache.set(2,2); 
         
          cache.set(3,3); 
         
          cache.set(4,4); 
         
          cout<<cache.get(4)<<endl; 
         
          cout<<cache.get(3)<<endl; 
         
          cout<<cache.get(2)<<endl; 
         
          cout<<cache.get(1)<<endl; 
         
          cache.set(5,5); 
         
          cout<<cache.get(1)<<endl; 
         
          cout<<cache.get(2)<<endl; 
         
          cout<<cache.get(3)<<endl; 
         
          cout<<cache.get(4)<<endl; 
         
          cout<<cache.get(5)<<endl; 
         
          return  
          0; 
         
          }

   作者： 
  海子 
 
   出处： 
  http://www.cnblogs.com/dolphin0520/ 
 
   本博客中未标明转载的文章归作者 
  海子和博客园共有，欢迎转载，但未经作者同意必须保留此段声明，且在文章页面明显位置给出原文连接，否则保留追究法律责任的权利。

xxxxxxxxxx

在前一篇文章中通过leetcode的一道题目了解了LRU算法的具体设计思路，下面继续来探讨一下另外两种常见的Cache算法：FIFO、LFU

1.FIFO算法

　　FIFO（First in First out），先进先出。其实在操作系统的设计理念中很多地方都利用到了先进先出的思想，比如作业调度（先来先服务），为什么这个原则在很多地方都会用到呢？因为这个原则简单、且符合人们的惯性思维，具备公平性，并且实现起来简单，直接使用数据结构中的队列即可实现。

　　在FIFO Cache设计中，核心原则就是：如果一个数据最先进入缓存中，则应该最早淘汰掉。也就是说，当缓存满的时候，应当把最先进入缓存的数据给淘汰掉。在FIFO Cache中应该支持以下操作;

　　get(key)：如果Cache中存在该key，则返回对应的value值，否则，返回-1；

　　set(key,value)：如果Cache中存在该key，则重置value值；如果不存在该key，则将该key插入到到Cache中，若Cache已满，则淘汰最早进入Cache的数据。

　　举个例子：假如Cache大小为3，访问数据序列为set(1,1),set(2,2),set(3,3),set(4,4),get(2),set(5,5)

　　则Cache中的数据变化为：

　　(1,1) set(1,1)

　　(1,1) (2,2) set(2,2)

　　(1,1) (2,2) (3,3) set(3,3)

　　(2,2) (3,3) (4,4) set(4,4)

　　(2,2) (3,3) (4,4) get(2)

　　(3,3) (4,4) (5,5) set(5,5)

　　那么利用什么数据结构来实现呢？

　　下面提供一种实现思路：

　　利用一个双向链表保存数据，当来了新的数据之后便添加到链表末尾，如果Cache存满数据，则把链表头部数据删除，然后把新的数据添加到链表末尾。在访问数据的时候，如果在Cache中存在该数据的话，则返回对应的value值；否则返回-1。如果想提高访问效率，可以利用hashmap来保存每个key在链表中对应的位置。

2.LFU算法

　　LFU（Least Frequently Used）最近最少使用算法。它是基于“如果一个数据在最近一段时间内使用次数很少，那么在将来一段时间内被使用的可能性也很小”的思路。

　　注意LFU和LRU算法的不同之处，LRU的淘汰规则是基于访问时间，而LFU是基于访问次数的。举个简单的例子：

　　假设缓存大小为3，数据访问序列为set(2,2),set(1,1),get(2),get(1),get(2),set(3,3),set(4,4)，

　　则在set(4,4)时对于LFU算法应该淘汰(3,3)，而LRU应该淘汰(1,1)。

　　那么LFU Cache应该支持的操作为：

　　get(key)：如果Cache中存在该key，则返回对应的value值，否则，返回-1；

　　set(key,value)：如果Cache中存在该key，则重置value值；如果不存在该key，则将该key插入到到Cache中，若Cache已满，则淘汰最少访问的数据。

　　为了能够淘汰最少使用的数据，因此LFU算法最简单的一种设计思路就是利用一个数组存储数据项，用hashmap存储每个数据项在数组中对应的位置，然后为每个数据项设计一个访问频次，当数据项被命中时，访问频次自增，在淘汰的时候淘汰访问频次最少的数据。这样一来的话，在插入数据和访问数据的时候都能达到O(1)的时间复杂度，在淘汰数据的时候，通过选择算法得到应该淘汰的数据项在数组中的索引，并将该索引位置的内容替换为新来的数据内容即可，这样的话，淘汰数据的操作时间复杂度为O(n)。

　　另外还有一种实现思路就是利用小顶堆+hashmap，小顶堆插入、删除操作都能达到O(logn)时间复杂度，因此效率相比第一种实现方法更加高效。

　　如果哪位朋友有更高效的实现方式（比如O(1)时间复杂度），不妨探讨一下，不胜感激。

3.LRU算法

　　LRU算法的原理以及实现在前一篇博文中已经谈到，在此不进行赘述：

　　http://www.cnblogs.com/dolphin0520/p/3741519.html

　　参考链接：http://blog.csdn.net/hexinuaa/article/details/6630384

　　　　　　 http://blog.csdn.net/beiyetengqing/article/details/7855933

　　　　　　　http://outofmemory.cn/wr/?u=http%3A%2F%2Fblog.csdn.net%2Fyunhua_lee%2Farticle%2Fdetails%2F7648549

　　　　　　　http://blog.csdn.net/alexander_xfl/article/details/12993565

     作者： 
    海子 
   
     出处： 
    http://www.cnblogs.com/dolphin0520/ 
   
     本博客中未标明转载的文章归作者 
    海子和博客园共有，欢迎转载，但未经作者同意必须保留此段声明，且在文章页面明显位置给出原文连接，否则保留追究法律责任的权利。