Understanding the LRU algorithm

LRU is the acronym for Least Recently Used, a page replacement algorithm commonly used in virtual memory management.

To manage main memory more effectively, modern operating systems provide an abstraction of main memory called virtual memory. It treats main memory as a cache for an address space stored on disk, keeps only the active areas in main memory, and transfers data back and forth between main memory and disk as needed. Virtual memory is organized as an array of N contiguous bytes stored on disk; each byte has a unique virtual address, which serves as an index into the array. Virtual memory is divided into fixed-size blocks called virtual pages (Virtual Page, VP), which serve as the unit of transfer between main memory and disk. Similarly, physical memory is divided into physical pages (Physical Page, PP).
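Because pages have a fixed size, a virtual address splits cleanly into a virtual page number and an offset within that page. A minimal sketch, assuming 4 KiB pages (a common but not universal size; the helper name is ours):

```python
PAGE_SIZE = 4096  # assume 4 KiB pages for illustration


def split_virtual_address(addr):
    """Split a virtual address into (virtual page number, offset in page)."""
    return addr // PAGE_SIZE, addr % PAGE_SIZE


# byte 10000 lives in virtual page 2, at offset 1808 within that page
print(split_virtual_address(10000))
```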

Virtual memory uses a page table to record and determine whether a virtual page is cached in physical memory:

As shown above, when the CPU accesses virtual page VP3 and finds that VP3 is not cached in physical memory, a page fault occurs: VP3 must now be copied from disk into physical memory. But before that, to stay within the fixed amount of physical memory, a victim page must be selected and copied back to disk; this is called swapping or paging. In the figure, the victim page is VP4. Which page should be swapped out so that future paging is minimized? Ideally, the page swapped out each time would be the one that will not be used again for the longest time, since that postpones the next swap as long as possible. This is known as the ideal (optimal) page replacement algorithm, but it is difficult to implement in practice, because it requires knowing future accesses in advance.
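The ideal algorithm can be simulated after the fact, once the whole access sequence is known. A minimal sketch (the helper name simulate_opt is ours): on each fault, it evicts the resident page whose next use lies farthest in the future, or that is never used again.

```python
def simulate_opt(pages, frame_count):
    """Ideal (optimal) page replacement: on a fault, evict the resident
    page whose next use is farthest in the future (or never). Returns the
    number of page faults. Requires the full access sequence up front."""
    frames = set()
    faults = 0
    for i, page in enumerate(pages):
        if page in frames:
            continue  # page hit, nothing to do
        faults += 1
        if len(frames) >= frame_count:
            def next_use(p):
                # index of the next access of p, or infinity if never used again
                try:
                    return pages.index(p, i + 1)
                except ValueError:
                    return float('inf')
            frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults


print(simulate_opt([7, 0, 1, 2, 0, 3, 0, 4], 3))
```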

To narrow the gap with the ideal algorithm, a variety of sophisticated algorithms have been devised; the LRU algorithm is one of them.

LRU principle

The design principle of the LRU algorithm is: if a piece of data has not been accessed recently, the chance that it will be accessed in the future is small. In other words, when the available space becomes full, the data that has gone unaccessed the longest should be evicted first.

To see the LRU principle in action (the same example is often used when discussing Redis's LRU implementation): suppose the system allocates 3 physical blocks to a process, and the process accesses pages in the order 7, 0, 1, 2, 0, 3, 0, 4. All three physical blocks are initially empty, and the LRU algorithm operates as follows:
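The walk-through above can be reproduced with a short simulation. This is a sketch using OrderedDict as the frame set, with the oldest page at the front (the helper name simulate_lru is ours):

```python
from collections import OrderedDict


def simulate_lru(pages, frame_count):
    """Simulate LRU page replacement; return (fault_count, final_frames)."""
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in pages:
        if page in frames:
            frames.move_to_end(page)  # page hit: mark as most recently used
        else:
            faults += 1  # page fault
            if len(frames) >= frame_count:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = None
    return faults, list(frames)


faults, resident = simulate_lru([7, 0, 1, 2, 0, 3, 0, 4], 3)
print(faults, resident)
```

Running this on the reference string above produces 6 page faults, with pages 3, 0 and 4 resident at the end.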

Implementing LRU with a doubly linked list and a hash table

To implement the LRU algorithm yourself, you can use a hash table plus a doubly linked list:

The design idea is to use a hash table to map each key to a node in the list, store the value in the node, and let the doubly linked list record the order of the nodes, with the most recently accessed node at the head.

The LRU algorithm has two basic operations:

  • get(key): Look up the node corresponding to the key. If the key exists, move the node to the head of the linked list.
  • set(key, value): Set the value of the node corresponding to the key. If the key does not exist, create a new node and place it at the head of the list; if the list then exceeds its capacity, remove the last node at the tail. If the node exists, update its value and move it to the head of the list.

LRU cache mechanism

LeetCode has a problem on exactly this, LRU Cache:

Design and implement a data structure for an LRU (Least Recently Used) cache. It should support the following operations: get and put.

get(key) - Get the value of the key if the key exists in the cache (the value is always positive); otherwise return -1. put(key, value) - Set or insert the value for the key. When the cache reaches its capacity, it should evict the least recently used item before writing the new data, to make room for the new value.

Follow up:

Could you do both operations in O(1) time complexity?

Example:

LRUCache cache = new LRUCache( 2 /* cache capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // this operation evicts key 2
cache.get(2);       // returns -1 (not found)
cache.put(4, 4);    // this operation evicts key 1
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4

We can write our own doubly linked list, or we can use a ready-made data structure. Python's OrderedDict is an ordered hash table: it remembers the order in which keys were added, which is exactly the combination of a doubly linked list and a hash table. OrderedDict places the newest data at the end:

In [35]: from collections import OrderedDict

In [36]: lru = OrderedDict()

In [37]: lru[1] = 1

In [38]: lru[2] = 2

In [39]: lru
Out[39]: OrderedDict([(1, 1), (2, 2)])

In [40]: lru.popitem()
Out[40]: (2, 2)

OrderedDict has two important methods:

  • popitem(last=True): Removes and returns a key-value pair, in LIFO order if last=True, or in FIFO order if last=False.
  • move_to_end(key, last=True): Moves an existing key to either end of the ordered dictionary: to the end if last is True (the default), or to the beginning if last is False.
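A quick demonstration of both methods:

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
od.move_to_end('a')                # 'a' is now the newest entry
print(list(od))                    # ['b', 'c', 'a']
print(od.popitem(last=False))      # ('b', 2): removed from the front (FIFO)
print(od.popitem(last=True))       # ('a', 1): removed from the back (LIFO)
```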

When deleting data, we can use popitem(last=False) to remove the least recently accessed key-value pair from the front. When getting or setting data, we use move_to_end(key, last=True) to move the key-value pair to the end.

Code:

from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int):
        self.lru = OrderedDict()
        self.capacity = capacity
        
    def get(self, key: int) -> int:
        self._update(key)
        return self.lru.get(key, -1)
        
    def put(self, key: int, value: int) -> None:
        self._update(key)
        self.lru[key] = value
        if len(self.lru) > self.capacity:
            self.lru.popitem(last=False)
         
    def _update(self, key: int):
        if key in self.lru:
            self.lru.move_to_end(key)

OrderedDict source code analysis

OrderedDict is indeed implemented with a hash table plus a doubly linked list:

class OrderedDict(dict):
    'Dictionary that remembers insertion order'
    # An inherited dict maps keys to values.
    # The inherited dict provides __getitem__, __len__, __contains__, and get.
    # The remaining methods are order-aware.
    # Big-O running times for all methods are the same as regular dictionaries.

    # The internal self.__map dict maps keys to links in a doubly linked list.
    # The circular doubly linked list starts and ends with a sentinel element.
    # The sentinel element never gets deleted (this simplifies the algorithm).
    # The sentinel is in self.__hardroot with a weakref proxy in self.__root.
    # The prev links are weakref proxies (to prevent circular references).
    # Individual links are kept alive by the hard reference in self.__map.
    # Those hard references disappear when a key is deleted from an OrderedDict.

    def __init__(*args, **kwds):
        '''Initialize an ordered dictionary.  The signature is the same as
        regular dictionaries.  Keyword argument order is preserved.
        '''
        if not args:
            raise TypeError("descriptor '__init__' of 'OrderedDict' object "
                            "needs an argument")
        self, *args = args
        if len(args) > 1:
            raise TypeError('expected at most 1 arguments, got %d' % len(args))
        try:
            self.__root
        except AttributeError:
            self.__hardroot = _Link()
            self.__root = root = _proxy(self.__hardroot)
            root.prev = root.next = root
            self.__map = {}
        self.__update(*args, **kwds)

    def __setitem__(self, key, value,
                    dict_setitem=dict.__setitem__, proxy=_proxy, Link=_Link):
        'od.__setitem__(i, y) <==> od[i] = y'
        # Setting a new item creates a new link at the end of the linked list,
        # and the inherited dictionary is updated with the new key/value pair.
        if key not in self:
            self.__map[key] = link = Link()
            root = self.__root
            last = root.prev
            link.prev, link.next, link.key = last, root, key
            last.next = link
            root.prev = proxy(link)
        dict_setitem(self, key, value)

As seen from the source, OrderedDict uses self.__map = {} as the hash table, which maps each key to its linked-list node Link() via self.__map[key] = link = Link():

class _Link(object):
    __slots__ = 'prev', 'next', 'key', '__weakref__'

A Link() node stores a pointer to the previous node (prev), a pointer to the next node (next), and the key.

Furthermore, the list here is a circular doubly linked list: OrderedDict uses a sentinel element root as both the head and the tail of the list:

    self.__hardroot = _Link()
    self.__root = root = _proxy(self.__hardroot)
    root.prev = root.next = root

From __setitem__ it is apparent that when OrderedDict adds a new value, the list takes on the following ring structure:

         next             next             next
   root <----> new node1 <----> new node2 <----> root
         prev             prev             prev

root.next is the first node of the list, and root.prev is the last node.

Because OrderedDict inherits from dict, the key-value pairs themselves are stored in the underlying dict; each linked-list node saves only the key, not the value.

If we implement it ourselves, it does not need to be this complex: we can store the value in the node as well, and the list only needs to support inserting a node at the head and removing the node at the tail:

from _weakref import proxy as _proxy


class Node:
    __slots__ = ('prev', 'next', 'key', 'value', '__weakref__')


class LRUCache:

    def __init__(self, capacity: int):
        self.__hardroot = Node()
        self.__root = root = _proxy(self.__hardroot)
        root.prev = root.next = root
        self.__map = {}
        self.capacity = capacity
        
    def get(self, key: int) -> int:
        if key in self.__map:
            self.move_to_head(key)
            return self.__map[key].value
        else:
            return -1
         
    def put(self, key: int, value: int) -> None:
        if key in self.__map:
            node = self.__map[key]
            node.value = value
            self.move_to_head(key)
        else:
            node = Node()
            node.key = key
            node.value = value
            self.__map[key] = node
            self.add_head(node)
            if len(self.__map) > self.capacity:
                self.rm_tail()
        
    def move_to_head(self, key: int) -> None:
        if key in self.__map:
            node = self.__map[key]
            node.prev.next = node.next
            node.next.prev = node.prev
            head = self.__root.next
            self.__root.next = node
            node.prev = self.__root
            node.next = head
            head.prev = node
    
    def add_head(self, node: Node) -> None:
        head = self.__root.next
        self.__root.next = node
        node.prev = self.__root
        node.next = head
        head.prev = node
    
    def rm_tail(self) -> None:
        tail = self.__root.prev
        del self.__map[tail.key]
        tail.prev.next = self.__root
        self.__root.prev = tail.prev

node-lru-cache

In practical applications, an LRU cache usually has to implement many extra features beyond the basic algorithm.

node-lru-cache is a well-written JavaScript implementation, packaged for Node.js:

var LRU = require("lru-cache")
  , options = { max: 500
              , length: function (n, key) { return n * 2 + key.length }
              , dispose: function (key, n) { n.close() }
              , maxAge: 1000 * 60 * 60 }
  , cache = new LRU(options)
  , otherCache = new LRU(50) // sets just the max size

cache.set("key", "value")
cache.get("key") // "value"

This package does not decide when to trigger LRU eviction by the number of cached keys, but by the actual size of the stored key-value pairs. In the options you can set max, an upper bound on the space the cache may occupy; length, a function that computes the space a key-value pair occupies; maxAge, an expiration time for key-value pairs; and more. It is worth a look if you are interested.
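The same size-based idea can be sketched in Python on top of OrderedDict. This is only an illustration of the technique: the class name SizedLRUCache and its max_size/length parameters are our own, not the package's API.

```python
from collections import OrderedDict


class SizedLRUCache:
    """LRU cache bounded by the total size of its values rather than by
    the number of entries, mimicking node-lru-cache's max/length options."""

    def __init__(self, max_size, length=lambda value, key: 1):
        self.lru = OrderedDict()
        self.max_size = max_size
        self.length = length    # computes the "size" of one key-value pair
        self.total = 0          # current total size of all cached entries

    def put(self, key, value):
        if key in self.lru:
            self.total -= self.length(self.lru.pop(key), key)
        self.lru[key] = value
        self.total += self.length(value, key)
        while self.total > self.max_size and self.lru:
            old_key, old_value = self.lru.popitem(last=False)  # evict LRU entry
            self.total -= self.length(old_value, old_key)

    def get(self, key):
        if key not in self.lru:
            return -1
        self.lru.move_to_end(key)  # mark as most recently used
        return self.lru[key]


# evict by total string length, not by entry count
cache = SizedLRUCache(5, length=lambda value, key: len(value))
cache.put('a', 'xx')    # total = 2
cache.put('b', 'xxx')   # total = 5
cache.put('c', 'x')     # total would be 6 > 5, so 'a' is evicted
print(cache.get('a'))   # -1
print(cache.get('c'))   # 'x'
```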



Origin www.cnblogs.com/linxiyue/p/10926944.html