Open-source tool: fit-cache

I. Introduction

        GitHub address: https://github.com/SongTing0711/fit-cache

        At present, local caches, Redis, ES, CK and other caches are used rather casually across the industry. On the one hand, this wastes resources. On the other hand, with a local cache it may be the last straw that leads to an OOM (out-of-memory) error.

        fit-cache, created by the author, tells you whether a key is suitable for caching and for how long it should be cached, and it addresses problems such as cache abuse and many cache entries expiring at the same time.

II. Architecture

1. Tool positioning

        fit-cache needs to answer two questions: whether a key is suitable for caching and, if it is, how long it should be cached.

        The author drew on MySQL's InnoDB storage engine (InnoDB manages MySQL's buffer pool with an LRU list split by percentage, to avoid frequent disk reads and writes) and JD.com's HotKey (which JD.com uses to cache hot data in its mall promptly, so that Redis is not overwhelmed during big promotions).

        The author believes both answers can be computed with a sliding window plus an LRU list, that is, from access frequency and most recent access time.


2. Implementation

        In the implementation, the author gives frequency more weight than recency. To decide whether a key is suitable for caching, first check whether the key's sliding window satisfies the configured heat rule; if it does not, fall back to the LRU list. The LRU list must be limited in length, because a key that has dropped far down the list is unlikely to be accessed again soon.
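
        The decision order described above (heat rule first, bounded recent-access list second) can be sketched in a few lines. This is a simplified illustration, not the fit-cache implementation: the class name FitDecisionSketch and its methods are hypothetical, a plain hit counter stands in for the sliding window (so counts never expire), and a LinkedHashMap in access order stands in for the LRU list.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: frequency decides first, a bounded recent-access list is the fallback. */
class FitDecisionSketch {
    private final int threshold;                                  // accesses needed to count as hot (assumed rule)
    private final Map<String, Integer> hits = new HashMap<>();    // simplified stand-in for the sliding window
    private final LinkedHashMap<String, Long> recent;             // bounded recent-access list

    FitDecisionSketch(int threshold, int lruCapacity) {
        this.threshold = threshold;
        // accessOrder = true keeps the most recently touched key last;
        // removeEldestEntry drops the least recently used key once the limit is exceeded.
        this.recent = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > lruCapacity;
            }
        };
    }

    /** Records one access to the key. */
    synchronized void record(String key, long nowMillis) {
        hits.merge(key, 1, Integer::sum);
        recent.put(key, nowMillis);
    }

    /** Hot keys are fit; otherwise the key must still be in the bounded recent-access list. */
    synchronized boolean isFit(String key) {
        if (hits.getOrDefault(key, 0) >= threshold) {
            return true;
        }
        return recent.containsKey(key);
    }
}
```

        With a threshold of 2 and a list capacity of 1, a key accessed twice stays fit even after a newer key has pushed it out of the recent-access list, which is the sense in which frequency outweighs recency.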


        The suitable caching time is computed as a weighted combination of frequency and most recent access time. The recency term also decays exponentially, because the longer ago the last access was, the lower the key's priority.
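
        Written as a formula, the weighting might look like the following. The symbols are notation introduced here for illustration: f is the access count in the sliding window, Δt is the number of seconds since the most recent access, and w_f and w_t are the two weights. The rounding up and the values 0.7 and 0.3 for the weights are taken from the FitCacheTime listing later in this article.

```latex
\[
\text{score} = w_f \cdot f + w_t \cdot e^{-\Delta t},
\qquad
\text{cacheTime} = \lceil \text{score} \rceil
\]
```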


3. Packaging 

        There are essentially two ways to package an open-source tool like this. One is to package it as a standalone client and publish it to Maven for services to reference. This works at the level of a single service. The advantage is that it is easy to adopt: unlike many middleware tools, it is not split into a client and a server, a split that brings a large adoption cost.

        The other is to package it in client-server mode, so that statistics are computed at cluster level and both the suitability for caching and the caching time are judged for the cluster as a whole.

        The author has currently packaged the first form; the second is still being packaged.

III. Source code

        Here is the core code of fit-cache.

1. core

        FitCacheStore exposes the core API: whether a key is suitable for caching, and the suitable caching time.

        The value of a key that is suitable for caching can be stored with set; the underlying store is Caffeine.

        DispatcherConfig holds a blocking queue in which key access events are stored; an asynchronous thread then takes them out to compute frequency and recent access.

public class FitCacheStore {

    /**
     * Decide whether the key is suitable for caching
     */
    public static boolean isFitCache(String key) {
        try {
            // Check the sliding-window heat first, then fall back to the LRU list
            boolean fit = CaffeineCacheHolder.getFitCache().getIfPresent(key) != null;
            if (!fit) {
                fit = CaffeineCacheHolder.getLruCache().get(key) != null;
            }
            // Record this access so the consumer thread can update the statistics
            DispatcherConfig.QUEUE.put(key);
            return fit;
        } catch (Exception e) {
            return false;
        }
    }


    /**
     * Compute the suitable caching time for the key; returns 0 if nothing is known about it
     */
    public static int fitCacheTime(String key) {
        try {
            SlidingWindow window = (SlidingWindow) CaffeineCacheHolder.getWindowCache().getIfPresent(key);
            // A key missing from the LRU list yields null; treat that as "no recent access" (0).
            // Unboxing the null directly would throw and make this method always return 0.
            Object last = CaffeineCacheHolder.getLruCache().get(key);
            long lastTime = last == null ? 0L : (Long) last;
            if (window == null && lastTime == 0) {
                return 0;
            }
            if (window == null) {
                return FitCacheTime.calculateStorageTime(0, lastTime);
            }
            if (lastTime == 0) {
                return FitCacheTime.calculateStorageTime(window.getCount(), 0);
            }
            int res = FitCacheTime.calculateStorageTime(window.getCount(), lastTime);
            DispatcherConfig.QUEUE.put(key);
            return res;
        } catch (Exception e) {
            return 0;
        }
    }

    /**
     * Read the value from the local Caffeine cache
     */
    public static Object get(String key) {
        return CaffeineCacheHolder.getFitCache().getIfPresent(key);
    }

    /**
     * Store a value in the cache; refused unless the key is hot or in the LRU list
     */
    public static boolean set(String key, Object value) {
        Object object = CaffeineCacheHolder.getFitCache().getIfPresent(key);
        Object lru = CaffeineCacheHolder.getLruCache().get(key);
        if (object == null && lru == null) {
            return false;
        }
        CaffeineCacheHolder.getFitCache().put(key, value);
        return true;
    }
//
//    private static ExecutorService threadPoolExecutor = new ThreadPoolExecutor(1,
//            2,
//            5,
//            TimeUnit.SECONDS,
//            new ArrayBlockingQueue<>(100),
//            new ThreadPoolExecutor.DiscardOldestPolicy());
//    public static void main (String[] args) throws InterruptedException {
//        KeyRule rule = new KeyRule("test", true, 2,5);
//        KeyRuleHolder.KEY_RULES.add(rule);
//        IKeyListener iKeyListener = new KeyListener();
//        KeyConsumer keyConsumer = new KeyConsumer();
//        keyConsumer.setKeyListener(iKeyListener);
//
//        threadPoolExecutor.submit(keyConsumer::beginConsume);
//        boolean fit = isFitCache("test");
//        System.out.println("1st access to test, fit: " + fit);
//        Thread.sleep(1000);
//        fit = isFitCache("test");
//        System.out.println("2nd access to test, fit: " + fit);
//        Thread.sleep(1000);
//        fit = isFitCache("test666");
//        System.out.println("1st access to test666, fit: " + fit);
//        Thread.sleep(1000);
//        fit = isFitCache("test");
//        System.out.println("3rd access to test, fit: " + fit);
//        Thread.sleep(1000);
//        int time = fitCacheTime("test");
//        System.out.println("1st query for test, suitable cache time: " + time);
//    }

}

2. config

        This part dispatches events: it submits the thread that consumes key events to a thread pool and starts it.

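import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DispatcherConfig {

    private final ExecutorService threadPoolExecutor = new ThreadPoolExecutor(1,
            2,
            5,
            TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.DiscardOldestPolicy());

    /**
     * Queue of key access events
     */
    public static final BlockingQueue<String> QUEUE = new LinkedBlockingQueue<>(200);

    @Bean
    public Consumer consumer(IKeyListener keyListener) {
        List<KeyConsumer> consumerList = new ArrayList<>();
        // KeyConsumer is created with new, so Spring does not inject its @Resource field;
        // pass the listener in explicitly to avoid a NullPointerException in beginConsume()
        KeyConsumer keyConsumer = new KeyConsumer();
        keyConsumer.setKeyListener(keyListener);
        consumerList.add(keyConsumer);

        threadPoolExecutor.submit(keyConsumer::beginConsume);
        return new Consumer(consumerList);
    }
}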

public class KeyConsumer {

    @Resource
    private IKeyListener iKeyListener;

    public void setKeyListener(IKeyListener iKeyListener) {
        this.iKeyListener = iKeyListener;
    }

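    public void beginConsume() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                String key = DispatcherConfig.QUEUE.take();
                iKeyListener.newKey(key);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the loop can exit when the pool shuts down
                Thread.currentThread().interrupt();
            }
        }
    }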
}
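
        The listings above publish events with QUEUE.put, which blocks the calling thread whenever the queue is full. As a hedged alternative, the sketch below uses offer, which drops the event instead of blocking the request path. The class KeyEventQueueSketch and its method names are hypothetical and are not part of fit-cache; only the java.util.concurrent calls are real.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of a bounded key-event queue with a non-blocking producer side. */
class KeyEventQueueSketch {
    private final BlockingQueue<String> queue;

    KeyEventQueueSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Called on the request path: never blocks, returns false if the event was dropped. */
    boolean publish(String key) {
        return queue.offer(key);
    }

    /** Consumer loop for a background thread; returns when the thread is interrupted. */
    void consumeForever(Consumer<String> handler) {
        try {
            while (true) {
                handler.accept(queue.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    /** Handles whatever is queued right now and returns the number of events handled. */
    int drain(Consumer<String> handler) {
        int handled = 0;
        String key;
        while ((key = queue.poll()) != null) {
            handler.accept(key);
            handled++;
        }
        return handled;
    }
}
```

        Dropping an event only makes the statistics slightly less precise, which may be an acceptable trade for never stalling a business thread; whether that trade suits fit-cache is a design decision for the author.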

3. listen

        Consumes the events and updates the sliding window and the LRU list.

import java.util.function.Function;

import org.springframework.stereotype.Component;

@Component
public class KeyListener implements IKeyListener {

    @Override
    public void newKey(String key) {
        SlidingWindow slidingWindow = checkWindow(key);
        // The key was accessed, so record it in the recent-access (LRU) list
        CaffeineCacheHolder.getLruCache().put(key, System.currentTimeMillis());
        // Count this access and check whether the heat rule is now met
        boolean fit = slidingWindow.addCount(1);

        CaffeineCacheHolder.getWindowCache().put(key, slidingWindow);
        if (fit && CaffeineCacheHolder.getFitCache().getIfPresent(key) == null) {
            // The key has become hot and is suitable for caching
            CaffeineCacheHolder.getFitCache().put(key, System.currentTimeMillis());
        }
    }

    /**
     * Return the key's sliding window, creating it if it does not exist yet
     */
    private SlidingWindow checkWindow(String key) {
        // Look up the sliding window for this key
        return (SlidingWindow) CaffeineCacheHolder.getWindowCache().get(key, (Function<String, SlidingWindow>) s -> {
            // New key: look up its rule
            KeyRule keyRule = KeyRuleHolder.findRule(key);
            return new SlidingWindow(keyRule.getInterval(), keyRule.getThreshold());
        });
    }

}

4. Tools

        The first is the LRU list. Here a doubly linked list stores the last access time of each key, and the list capacity is configurable. The LRU list in MySQL's InnoDB is special: it is split by a ratio, and a page accessed for the first time is not placed at the head but at a midpoint partway down the list (roughly a third of the way from the tail; 3/8 by default). InnoDB does this because it handles very large volumes of data access at page level rather than key level, so a full table scan could otherwise easily push all the original contents out of the LRU list.

        fit-cache does not need this, because it works at key level.

import java.util.HashMap;
import java.util.Map;

public class LruCache {
    static class Node {
        String key;
        Object value;
        Node prev;
        Node next;

        public Node(String key, Object value) {
            this.key = key;
            this.value = value;
        }
    }
    private final int capacity;
    private final Map<String, Node> cache;
    // Sentinel nodes: head.next is the most recent entry, tail.prev the least recent
    private Node head;
    private Node tail;

    public LruCache(int capacity) {
        this.capacity = capacity;
        this.cache = new HashMap<>();
        this.head = new Node("head", 0);
        this.tail = new Node("tail", 0);
        head.next = tail;
        tail.prev = head;
    }

    // synchronized like put: get reads the HashMap and relinks nodes
    public synchronized Object get(String key) {
        Node node = cache.get(key);
        if (node != null) {
            // The key is in the list: move it to the head
            moveToHead(node);
            return node.value;
        }
        return null;
    }

    public synchronized void put(String key, Object value) {
        Node node = cache.get(key);
        if (node != null) {
            node.value = value;
            moveToHead(node);
        } else {
            node = new Node(key, value);
            cache.put(key, node);
            addToHead(node);
            if (cache.size() > capacity) {
                // Over capacity: evict the node at the tail
                Node removedNode = removeTail();
                cache.remove(removedNode.key);
            }
        }
    }

    private synchronized void moveToHead(Node node) {
        removeNode(node);
        addToHead(node);
    }

    private synchronized void addToHead(Node node) {
        node.prev = head;
        node.next = head.next;
        head.next.prev = node;
        head.next = node;
    }

    private synchronized void removeNode(Node node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private synchronized Node removeTail() {
        Node removedNode = tail.prev;
        removeNode(removedNode);
        return removedNode;
    }

}

        Next is the sliding window, which tracks access frequency.

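import java.util.concurrent.atomic.AtomicLong;

public class SlidingWindow {
    /**
     * Circular array of time slices; its length is twice windowSize so it can hold more than one window
     */
    private AtomicLong[] timeSlices;
    /**
     * Total length of the circular array
     */
    private int timeSliceSize;
    /**
     * Duration of each time slice, in milliseconds
     */
    private int timeMillisPerSlice;
    /**
     * Number of time slices in one window (the window length)
     */
    private int windowSize;
    /**
     * Access-count threshold within one full window
     */
    private int threshold;
    /**
     * Creation time of this sliding window, i.e. the time of the first data point
     */
    private long beginTimestamp;
    /**
     * Timestamp of the last data point
     */
    private long lastAddTimestamp;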


    public SlidingWindow(int duration, int threshold) {
        // Durations over 10 minutes are capped at 10 minutes
        if (duration > 600) {
            duration = 600;
        }
        // Rules that must detect a hot key within 5 seconds use 5 slices
        if (duration <= 5) {
            this.windowSize = 5;
            this.timeMillisPerSlice = duration * 200;
        } else {
            this.windowSize = 10;
            this.timeMillisPerSlice = duration * 100;
        }
        this.threshold = threshold;
        // Keep room for at least two windows
        this.timeSliceSize = windowSize * 2;

        reset();
    }

    public SlidingWindow(int timeMillisPerSlice, int windowSize, int threshold) {
        this.timeMillisPerSlice = timeMillisPerSlice;
        this.windowSize = windowSize;
        this.threshold = threshold;
        // Keep room for at least two windows
        this.timeSliceSize = windowSize * 2;

        reset();
    }

    /**
     * Initialize the window
     */
    private void reset() {
        beginTimestamp = System.currentTimeMillis();
        // One counter per time slice
        AtomicLong[] localTimeSlices = new AtomicLong[timeSliceSize];
        for (int i = 0; i < timeSliceSize; i++) {
            localTimeSlices[i] = new AtomicLong(0);
        }
        timeSlices = localTimeSlices;
    }

    /**
     * Compute the index of the time slice that the current moment falls into
     */
    private int locationIndex() {
        long now = System.currentTimeMillis();
        // If more than a whole window has passed since the last add, simply reset instead of computing
        if (now - lastAddTimestamp > timeMillisPerSlice * windowSize) {
            reset();
        }

        int index = (int) (((now - beginTimestamp) / timeMillisPerSlice) % timeSliceSize);
        if (index < 0) {
            return 0;
        }
        return index;
    }

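    /**
     * Add count to the current time slice; returns true once the window total reaches the threshold
     */
    public synchronized boolean addCount(int count) {
        // Index of the time slice the current moment falls into
        int index = locationIndex();
        clearFromIndex(index);

        int sum = 0;
        // Add to the current time slice
        sum += timeSlices[index].addAndGet(count);
        // Plus the preceding windowSize - 1 slices
        for (int i = 1; i < windowSize; i++) {
            sum += timeSlices[(index - i + timeSliceSize) % timeSliceSize].get();
        }

        lastAddTimestamp = System.currentTimeMillis();
        return sum >= threshold;
    }

    /**
     * Total count in the current window
     */
    public synchronized int getCount() {
        // Sum the current slice and the preceding windowSize - 1 slices, relative to the
        // current position in the ring rather than the fixed array indexes 1..windowSize-1
        int index = locationIndex();
        int sum = 0;
        for (int i = 0; i < windowSize; i++) {
            sum += timeSlices[(index - i + timeSliceSize) % timeSliceSize].get();
        }
        return sum;
    }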

    private void clearFromIndex(int index) {
        for (int i = 1; i <= windowSize; i++) {
            int j = index + i;
            if (j >= windowSize * 2) {
                j -= windowSize * 2;
            }
            timeSlices[j].set(0);
        }
    }

}
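
        To make the slice arithmetic in locationIndex() concrete, the small sketch below repeats the same computation with the clock passed in as a parameter. The class SliceIndexSketch and its method are hypothetical helpers written for this illustration. The numbers assume a rule with duration = 5, which the first constructor turns into 5 slices of 1000 ms each and a ring of 10 slots.

```java
/** Hypothetical helper that mirrors the index arithmetic of SlidingWindow.locationIndex(). */
class SliceIndexSketch {

    /** Returns the ring slot for nowMillis, given the window start and slice length. */
    static int sliceIndex(long nowMillis, long beginMillis, int sliceMillis, int windowSize) {
        int ringSize = windowSize * 2; // the ring holds two windows
        return (int) (((nowMillis - beginMillis) / sliceMillis) % ringSize);
    }

    public static void main(String[] args) {
        System.out.println(sliceIndex(0, 0, 1000, 5));     // prints 0
        System.out.println(sliceIndex(4999, 0, 1000, 5));  // prints 4: last slice of the first window
        System.out.println(sliceIndex(12000, 0, 1000, 5)); // prints 2: the ring has wrapped around
    }
}
```

        Because the ring is twice as long as the window, the slots outside the current window can be cleared while the slots inside it stay readable.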

        Finally, the suitable caching time is computed from frequency and recent access time.

public class FitCacheTime {

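    /**
     * Weighted sum with a decaying time term; computes the score of the data
     *
     * @param frequency access count in the sliding window
     * @param lastTime  timestamp of the most recent access, in milliseconds
     * @return the score
     */
    private static double calculateScore(double frequency, long lastTime) {
        // Weight access frequency and recency according to business needs and the importance of the data
        // These values could be read from a config center
        double frequencyWeight = 0.7;
        double timeWeight = 0.3;
        // Seconds elapsed since the most recent access
        double time = (System.currentTimeMillis() - lastTime) / 1000.0;
        // Exponential decay: the more recent the access, the larger the time term
        double timeDecay = Math.exp(-time);
        // The weighted sum gives the score
        double score = frequencyWeight * frequency + timeWeight * timeDecay;
        return score;
    }

    /**
     * Compute how long the data should be stored
     *
     * @param frequency access count in the sliding window
     * @param lastTime  timestamp of the most recent access, in milliseconds
     * @return the storage time (the score rounded up)
     */
    public static int calculateStorageTime(double frequency, long lastTime) {
        // Derive the storage time from the score
        double score = calculateScore(frequency, lastTime);
        int storageTime = (int) Math.ceil(score);
        return storageTime;
    }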

}
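
        A few worked numbers show how this scoring behaves. The sketch below repeats the arithmetic of calculateScore with the elapsed time passed in directly, so the results do not depend on the system clock. The class ScoreSketch is a hypothetical helper for this illustration, and the weights 0.7 and 0.3 are copied from the listing above.

```java
/** Hypothetical helper that repeats the FitCacheTime arithmetic with an explicit elapsed time. */
class ScoreSketch {

    static double score(double frequency, double secondsSinceLastAccess) {
        // Same weights as the listing: 0.7 for frequency, 0.3 for the decayed recency term
        return 0.7 * frequency + 0.3 * Math.exp(-secondsSinceLastAccess);
    }

    static int storageTime(double frequency, double secondsSinceLastAccess) {
        return (int) Math.ceil(score(frequency, secondsSinceLastAccess));
    }

    public static void main(String[] args) {
        System.out.println(storageTime(3, 2.0));    // 0.7*3 + 0.3*e^-2 is about 2.14, prints 3
        System.out.println(storageTime(0, 0.0));    // 0.3*e^0 = 0.3, prints 1
        System.out.println(storageTime(5, 1000.0)); // recency term is about 0, 3.5 rounds up, prints 4
    }
}
```

        Since the decayed term can never exceed its weight of 0.3, frequency dominates the result, which matches the author's stated preference for frequency over recency.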

5. Cache

import java.util.concurrent.ConcurrentHashMap;

import com.github.benmanes.caffeine.cache.Cache;

public class CaffeineCacheHolder {
    /**
     * Key is the cache name, value is the cache instance
     */
    private static final ConcurrentHashMap<String, Object> CACHE_MAP = new ConcurrentHashMap<>();

    private static final String FIT = "fit";
    private static final String WINDOW = "window";
    private static final String LRU = "lru";

    // computeIfAbsent creates each cache atomically, so concurrent callers share one instance

    @SuppressWarnings("unchecked")
    public static Cache<String, Object> getFitCache() {
        // todo: read these settings from a config center
        return (Cache<String, Object>) CACHE_MAP.computeIfAbsent(FIT, k -> CaffeineBuilder.cache(60, 100, 60));
    }

    @SuppressWarnings("unchecked")
    public static Cache<String, Object> getWindowCache() {
        // todo: read these settings from a config center
        return (Cache<String, Object>) CACHE_MAP.computeIfAbsent(WINDOW, k -> CaffeineBuilder.cache(60, 100, 60));
    }

    public static LruCache getLruCache() {
        // todo: read the capacity from a config center
        return (LruCache) CACHE_MAP.computeIfAbsent(LRU, k -> new LruCache(1));
    }

}
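
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class CaffeineBuilder {

    /**
     * Build the cache that holds the keys to be cached
     */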
    public static Cache<String, Object> cache(int minSize, int maxSize, int expireSeconds) {
        return Caffeine.newBuilder()
                .initialCapacity(minSize)
                .maximumSize(maxSize)
                .expireAfterWrite(expireSeconds, TimeUnit.SECONDS)
                .build();
    }

}

IV. Summary

        fit-cache currently implements, on the client side, whether a key is suitable for caching and the suitable caching time. There are many ideas for extending it, for example cache invalidation, where old data is shown to users after the underlying data has changed, and cache priority, where a low-priority cache encroaches on the space of a high-priority cache. The author has some plans that still need to be put into practice.

        The author is still building and refining the tool, and anyone interested is welcome to join.

 


Origin blog.csdn.net/m0_69270256/article/details/131844517