LRU cache replacement strategy and C# implementation

LRU cache replacement strategy

Caching is a very common technique for speeding up data access: frequently used data is kept in a faster storage layer, such as memory, the CPU cache, or a hard disk cache.

However, faster storage is also more expensive, so cache capacity is usually limited. When the cache is full, a strategy is needed to decide which data to remove to make room for new data.

Such a strategy is called a cache replacement policy.

Common cache replacement strategies include: FIFO (First In First Out), LRU (Least Recently Used), LFU (Least Frequently Used), etc.

Today I will introduce the LRU algorithm.

Main idea

The LRU algorithm is based on the assumption that if data has been accessed recently, it has a higher chance of being accessed in the future.

In most cases, this assumption is true, so the LRU algorithm is also a commonly used cache replacement strategy.

Based on this assumption, when implementing, we need to maintain an ordered data structure to record the access history of the data. When the cache is full, we can decide which data to remove from the cache based on this data structure.
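To make the idea of an ordered access history concrete, here is a minimal sketch that only tracks the order of keys, with the least recently used key first. The helper name RecencyOrder is made up for this illustration, and a plain List is used for brevity rather than efficiency.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration: replays a sequence of key accesses
// and returns the keys ordered from least to most recently used.
static List<int> RecencyOrder(IEnumerable<int> accesses)
{
    var recency = new List<int>();
    foreach (var key in accesses)
    {
        recency.Remove(key); // drop the old position, if any (O(n), fine for a sketch)
        recency.Add(key);    // the most recently used key goes to the end
    }
    return recency;
}

var result = RecencyOrder(new[] { 1, 2, 3, 1, 4, 2 });
Console.WriteLine(string.Join(" ", result)); // prints "3 1 4 2"
```

If the cache were full at this point, key 3 at the front would be the one to evict.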

When LRU is not applicable

However, if the data access pattern does not match the assumption behind LRU, the algorithm performs poorly.

For example, if data is accessed periodically, LRU tends to evict each piece of data shortly before its next access, which lowers the cache hit rate.

To put it another way, suppose one batch of data is accessed only during the day and another batch only at night, and the cache cannot hold both. During the night, LRU evicts the data accessed during the day; during the following day, it evicts the data accessed the previous night. Each batch is gone by the time it is needed again, which lowers the cache hit rate.
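A small experiment shows how bad this can get. The sketch below is not from the original implementation: CountHits is a hypothetical helper that replays a sequence of keys against an LRU cache and counts the hits. It compares a cyclic scan over 5 keys with a scan over 4 keys, both against a cache that holds 4 keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: replays `accesses` against an LRU cache holding at most
// `capacity` keys (assumed to be >= 1) and returns the number of cache hits.
static int CountHits(int capacity, IEnumerable<int> accesses)
{
    var list = new LinkedList<int>();                        // head = least recently used
    var nodes = new Dictionary<int, LinkedListNode<int>>();  // key -> its node in the list
    var hits = 0;

    foreach (var key in accesses)
    {
        if (nodes.TryGetValue(key, out var node))
        {
            hits++;
            list.Remove(node);   // move the key to the tail
            list.AddLast(node);
            continue;
        }

        if (nodes.Count == capacity && list.First is LinkedListNode<int> oldest)
        {
            nodes.Remove(oldest.Value); // evict the least recently used key
            list.Remove(oldest);
        }

        nodes[key] = list.AddLast(key);
    }

    return hits;
}

// 500 accesses cycling through 5 keys: every key is evicted just before it is reused.
var cyclicHits = CountHits(4, Enumerable.Range(0, 500).Select(i => i % 5));

// 500 accesses cycling through 4 keys: everything fits, only the first 4 accesses miss.
var localHits = CountHits(4, Enumerable.Range(0, 500).Select(i => i % 4));

Console.WriteLine(cyclicHits); // 0
Console.WriteLine(localHits);  // 496
```

With just one key more than the cache can hold, the cyclic pattern never hits at all.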

Later, I will introduce the LFU (Least Frequently Used) algorithm and the LFRU (Least Frequently and Recently Used) algorithm combining LFU and LRU, which can effectively solve this problem.

Algorithm implementation

As mentioned above, the LRU algorithm needs to maintain an ordered data structure to record the access history of the data. Usually we use a doubly linked list to implement this data structure, because a doubly linked list can insert data to the head or tail of the linked list in O(1) time complexity, and delete data in O(1) time complexity.

We store the data in a doubly linked list and move an item to the tail of the list every time it is accessed. This ensures that the tail of the list holds the most recently accessed data and the head holds the data that has gone unaccessed for the longest time.

When the cache is full, if new data needs to be inserted, because the head of the linked list is the data that has not been accessed for the longest time, we can directly delete the head of the linked list, and then insert new data at the end of the linked list.

If we want to implement a cache of key-value pairs, we can use a hash table to store key-value pairs, so that the search operation can be completed in O(1) time complexity, and we can use Dictionary in .NET.

At the same time, we use LinkedList as the implementation of the doubly linked list, store the key of the cache, and record the access history of the data.

Every time we insert into, delete from, or look up the Dictionary, we also update the linked list accordingly: insert the key, delete the key, or move the key to the tail.

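using System;
using System.Collections;
using System.Collections.Generic;

// Implements IEnumerable so the cache contents can be enumerated easily
public class LRUCache<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>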
{
    private readonly LinkedList<TKey> _list;

    private readonly Dictionary<TKey, TValue> _dictionary;

    private readonly int _capacity;
    
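    public LRUCache(int capacity)
    {
        // A capacity of zero would make Put fail, and a negative one would never evict
        if (capacity <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(capacity));
        }

        _capacity = capacity;
        _list = new LinkedList<TKey>();
        _dictionary = new Dictionary<TKey, TValue>();
    }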

    public TValue Get(TKey key)
    {
        if (_dictionary.TryGetValue(key, out var value))
        {
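            // Remove the key from the list, then append it to the tail.
            // This keeps the most recently accessed data at the tail and the least recently accessed data at the head.
            // However, removing a key from the list takes O(n) time, so this operation is O(n) overall.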
            _list.Remove(key);
            _list.AddLast(key);
            return value;
        }

        return default;
    }

    public void Put(TKey key, TValue value)
    {
        if (_dictionary.TryGetValue(key, out _))
        {
            // The key already exists: update its value, then move the key to the tail of the list
            _dictionary[key] = value;
            _list.Remove(key);
            _list.AddLast(key);
        }
        else
        {          
            if (_list.Count == _capacity)
            {
                // The cache is full: remove the head of the list, i.e. the least recently accessed data
                _dictionary.Remove(_list.First.Value);
                _list.RemoveFirst();
            }

            _list.AddLast(key);
            _dictionary.Add(key, value);
        }
    }

    public void Remove(TKey key)
    {
        if (_dictionary.TryGetValue(key, out _))
        {
            _dictionary.Remove(key);
            _list.Remove(key);
        }
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        foreach (var key in _list)
        {
            yield return new KeyValuePair<TKey, TValue>(key, _dictionary[key]);
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Usage:

var lruCache = new LRUCache<int, int>(4);

lruCache.Put(1, 1);
lruCache.Put(2, 2);
lruCache.Put(3, 3);
lruCache.Put(4, 4);

Console.WriteLine(string.Join(" ", lruCache));
Console.WriteLine(lruCache.Get(2));
Console.WriteLine(string.Join(" ", lruCache));
lruCache.Put(5, 5);
Console.WriteLine(string.Join(" ", lruCache));
lruCache.Remove(3);
Console.WriteLine(string.Join(" ", lruCache));

Output:

[1, 1] [2, 2] [3, 3] [4, 4] // initial state
2                           // access 2
[1, 1] [3, 3] [4, 4] [2, 2] // 2 moves to the tail of the list
[3, 3] [4, 4] [2, 2] [5, 5] // insert 5 (1 is evicted)
[4, 4] [2, 2] [5, 5]        // remove 3

Algorithm optimization

In the above implementation, the query, insertion, and deletion of the cache will all involve the deletion of data in the linked list (moving is also deleting and inserting).

Because we store the key in the LinkedList, we need to find the corresponding node in the linked list through the key first, and then perform the deletion operation, which leads to the time complexity of the deletion operation of the linked list being O(n).
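The difference comes down to which Remove overload of LinkedList is called. The following sketch, using a throwaway list of strings, contrasts the two: Remove(T) has to search for the value, while Remove(LinkedListNode<T>) works directly on a node we already hold.

```csharp
using System;
using System.Collections.Generic;

var list = new LinkedList<string>();
list.AddLast("a");
LinkedListNode<string> b = list.AddLast("b"); // AddLast(T) returns the newly created node
list.AddLast("c");

// Remove(T) walks the list from the head until it finds the value: O(n).
bool removed = list.Remove("a");

// Remove(LinkedListNode<T>) only relinks the neighbouring nodes: O(1).
list.Remove(b);
list.AddLast(b); // the detached node can be reattached at the tail

Console.WriteLine(removed);                // True
Console.WriteLine(string.Join(" ", list)); // c b
```

Holding on to the node is therefore what makes a constant-time move to the tail possible.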

Although the time complexity of the dictionary's lookup, insertion, and deletion operations is O(1), because the time complexity of the linked list operation is O(n), the worst time complexity of the entire algorithm is O(n).

The key to algorithm optimization is how to reduce the time complexity of the delete operation of the linked list.

Optimization idea:

  1. Store the mapping relationship between key and nodes in LinkedList in Dictionary
  2. Store key-value in the node of LinkedList

That is, we make a connection between two otherwise unrelated data structures.

Whether we are inserting into, deleting from, or looking up the cache, this connection lets us reduce the time complexity to O(1):

  1. Look up the node in the Dictionary by key, then read the value from the LinkedList node. The time complexity is O(1).
  2. Before deleting from the LinkedList, look up the node in the Dictionary by key and delete that node directly, which reduces the list deletion to O(1).
  3. When the LinkedList deletes its head node, the key stored in that node lets us delete the corresponding Dictionary entry in O(1).

public class LRUCache_V2<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
{
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _list;
    
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _dictionary;
    
    private readonly int _capacity;
    
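    public LRUCache_V2(int capacity)
    {
        // A capacity of zero would make Put fail, and a negative one would never evict
        if (capacity <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(capacity));
        }

        _capacity = capacity;
        _list = new LinkedList<KeyValuePair<TKey, TValue>>();
        _dictionary = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    }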
    
    public TValue Get(TKey key)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
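            // The dictionary gives us the node itself, so no list search is needed:
            // detaching it and re-adding it at the tail are both O(1)
            _list.Remove(node);
            _list.AddLast(node);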
            return node.Value.Value;
        }
        
        return default;
    }
    
    public void Put(TKey key, TValue value)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
            node.Value = new KeyValuePair<TKey, TValue>(key, value);
            _list.Remove(node);
            _list.AddLast(node);
        }
        else
        {
            if (_list.Count == _capacity)
            {
                _dictionary.Remove(_list.First.Value.Key);
                _list.RemoveFirst();
            }
            
            var newNode = new LinkedListNode<KeyValuePair<TKey, TValue>>(new KeyValuePair<TKey, TValue>(key, value));
            _list.AddLast(newNode);
            _dictionary.Add(key, newNode);
        }
    }
    
    public void Remove(TKey key)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
            _dictionary.Remove(key);
            _list.Remove(node);
        }
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        return _list.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Advanced optimization

Our requirements for the doubly linked list are specific: each node has to store both the key and the value. With the built-in LinkedList in .NET, we have to wrap them in a structure such as KeyValuePair, which introduces some unnecessary memory overhead.

We can implement a doubly linked list by ourselves, so that key-value can be stored directly in the node, thereby reducing memory overhead.

public class LRUCache_V3<TKey, TValue>
{
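    // Sentinel nodes that hold no data: _head.Next is the least recently used entry
    // and _tail.Previous is the most recently used one
    private readonly DoubleLinkedListNode<TKey, TValue> _head;

    private readonly DoubleLinkedListNode<TKey, TValue> _tail;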

    private readonly Dictionary<TKey, DoubleLinkedListNode<TKey, TValue>> _dictionary;

    private readonly int _capacity;

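    public LRUCache_V3(int capacity)
    {
        // A capacity of zero would break eviction in Put, and a negative one would never evict
        if (capacity <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(capacity));
        }

        _capacity = capacity;
        _head = new DoubleLinkedListNode<TKey, TValue>();
        _tail = new DoubleLinkedListNode<TKey, TValue>();
        _head.Next = _tail;
        _tail.Previous = _head;
        _dictionary = new Dictionary<TKey, DoubleLinkedListNode<TKey, TValue>>();
    }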

    public TValue Get(TKey key)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
            RemoveNode(node);
            AddLastNode(node);
            return node.Value;
        }

        return default;
    }

    public void Put(TKey key, TValue value)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
            RemoveNode(node);
            AddLastNode(node);
            node.Value = value;
        }
        else
        {
            if (_dictionary.Count == _capacity)
            {
                var firstNode = RemoveFirstNode();

                _dictionary.Remove(firstNode.Key);
            }

            var newNode = new DoubleLinkedListNode<TKey, TValue>(key, value);
            AddLastNode(newNode);
            _dictionary.Add(key, newNode);
        }
    }

    public void Remove(TKey key)
    {
        if (_dictionary.TryGetValue(key, out var node))
        {
            _dictionary.Remove(key);
            RemoveNode(node);
        }
    }

    private void AddLastNode(DoubleLinkedListNode<TKey, TValue> node)
    {
        node.Previous = _tail.Previous;
        node.Next = _tail;
        _tail.Previous.Next = node;
        _tail.Previous = node;
    }

    private DoubleLinkedListNode<TKey, TValue> RemoveFirstNode()
    {
        var firstNode = _head.Next;
        _head.Next = firstNode.Next;
        firstNode.Next.Previous = _head;
        firstNode.Next = null;
        firstNode.Previous = null;
        return firstNode;
    }

    private void RemoveNode(DoubleLinkedListNode<TKey, TValue> node)
    {
        node.Previous.Next = node.Next;
        node.Next.Previous = node.Previous;
        node.Next = null;
        node.Previous = null;
    }
    
    internal class DoubleLinkedListNode<TKey, TValue>
    {    
        public DoubleLinkedListNode()
        {
        }

        public DoubleLinkedListNode(TKey key, TValue value)
        {
            Key = key;
            Value = value;
        }

        public TKey Key { get; set; }
        
        public TValue Value { get; set; }

        public DoubleLinkedListNode<TKey, TValue> Previous { get; set; }

        public DoubleLinkedListNode<TKey, TValue> Next { get; set; }
    }
}

Benchmark

Use BenchmarkDotNet to compare the performance of the three versions.

using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class WriteBenchmarks
{
    // Make sure the written data contains repeated keys, so that the worst-case time complexity of LRU is exercised
    private const int Capacity = 1000;
    private const int DataSize = 100_000;
    
    private List<int> _data;

    [GlobalSetup]
    public void Setup()
    {
        _data = new List<int>();
        var shared = Random.Shared;
        for (int i = 0; i < DataSize; i++)
        {
            _data.Add(shared.Next(0, DataSize / 10));
        }
    }
    
    [Benchmark]
    public void LRUCache_V1()
    {
        var cache = new LRUCache<int, int>(Capacity);
        foreach (var item in _data)
        {
            cache.Put(item, item);
        }
    }
    
    [Benchmark]
    public void LRUCache_V2()
    {
        var cache = new LRUCache_V2<int, int>(Capacity);
        foreach (var item in _data)
        {
            cache.Put(item, item);
        }
    }
    
    [Benchmark]
    public void LRUCache_V3()
    {
        var cache = new LRUCache_V3<int, int>(Capacity);
        foreach (var item in _data)
        {
            cache.Put(item, item);
        }
    }
}

[MemoryDiagnoser]
public class ReadBenchmarks
{
    // Make sure the written data contains repeated keys, so that the worst-case time complexity of LRU is exercised
    private const int Capacity = 1000;
    private const int DataSize = 100_000;
    
    private List<int> _data;
    private LRUCache<int, int> _cacheV1;
    private LRUCache_V2<int, int> _cacheV2;
    private LRUCache_V3<int, int> _cacheV3;

    [GlobalSetup]
    public void Setup()
    {
        _cacheV1 = new LRUCache<int, int>(Capacity);
        _cacheV2 = new LRUCache_V2<int, int>(Capacity);
        _cacheV3 = new LRUCache_V3<int, int>(Capacity);
        _data = new List<int>();
        var shared = Random.Shared;
        for (int i = 0; i < DataSize; i++)
        {
            int dataToPut  = shared.Next(0, DataSize / 10);
            int dataToGet = shared.Next(0, DataSize / 10);
            _data.Add(dataToGet);
            _cacheV1.Put(dataToPut, dataToPut);
            _cacheV2.Put(dataToPut, dataToPut);
            _cacheV3.Put(dataToPut, dataToPut);
        }
    }
    
    [Benchmark]
    public void LRUCache_V1()
    {
        foreach (var item in _data)
        {
            _cacheV1.Get(item);
        }
    }
    
    [Benchmark]
    public void LRUCache_V2()
    {
        foreach (var item in _data)
        {
            _cacheV2.Get(item);
        }
    }
    
    [Benchmark]
    public void LRUCache_V3()
    {
        foreach (var item in _data)
        {
            _cacheV3.Get(item);
        }
    }
}
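The original post does not show how the benchmarks were launched. A typical entry point, assuming the BenchmarkDotNet NuGet package is referenced and the two benchmark classes above are in the same project, might look like this sketch:

```csharp
using BenchmarkDotNet.Running;

// BenchmarkDotNet expects an optimized build, e.g. `dotnet run -c Release`.
BenchmarkRunner.Run<WriteBenchmarks>();
BenchmarkRunner.Run<ReadBenchmarks>();
```

Each call runs all methods marked with [Benchmark] in the given class and prints a summary table like the ones below.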

Write performance test results:

|      Method |      Mean |     Error |    StdDev |    Median |     Gen0 |     Gen1 | Allocated |
|------------ |----------:|----------:|----------:|----------:|---------:|---------:|----------:|
| LRUCache_V1 | 16.890 ms | 0.3344 ms | 0.8012 ms | 16.751 ms | 750.0000 | 218.7500 |   4.65 MB |
| LRUCache_V2 |  7.193 ms | 0.1395 ms | 0.3958 ms |  7.063 ms | 703.1250 | 226.5625 |   4.22 MB |
| LRUCache_V3 |  5.761 ms | 0.1102 ms | 0.1132 ms |  5.742 ms | 585.9375 | 187.5000 |   3.53 MB |

Query performance test results:

|      Method |      Mean |     Error |    StdDev |    Gen0 | Allocated |
|------------ |----------:|----------:|----------:|--------:|----------:|
| LRUCache_V1 | 19.475 ms | 0.3824 ms | 0.3390 ms | 62.5000 |  474462 B |
| LRUCache_V2 |  1.994 ms | 0.0273 ms | 0.0242 ms |       - |       4 B |
| LRUCache_V3 |  1.595 ms | 0.0187 ms | 0.0175 ms |       - |       3 B |


Origin blog.csdn.net/wangonik_l/article/details/131659091