How to implement caching technology in Java

1. Why does the cache exist?
2. Where can the cache exist?
3. What are the properties of the cache?
4. What is the cache medium?

Once you understand these four questions, you can judge which cache to use for a given application scenario.


1. Why does the cache exist?
In a typical website or application, the browser sends a request to the application server. The application server does some computation and queries the database, the database does its own computation and returns the data, and the application server processes that data and returns the result to the browser. This is the standard process.

As the Internet has grown, both the number of users and the amount of information online keep increasing, so applications must support more and more concurrent requests. The application server and the database server have to do more and more computation, yet application server resources are limited, and so is the number of requests per second a database can handle (disk speed is finite, after all).

To get the greatest possible throughput out of limited resources, one approach is to reduce the amount of computation and shorten the request path (less network I/O or disk I/O), and this is where a cache makes a big difference. The basic principle of a cache is to break the standard process described above: the process can be cut short at any link, with the request taking its data from a cache and returning directly. This saves time and improves response speed, and it also saves hardware resources, so we can serve more users with the same hardware.
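The idea of cutting the request short can be sketched in a few lines of Java. This is only a minimal illustration, not a production cache: the class name `ReadThroughCache` and the `loader` function (a stand-in for the database query) are assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Read-through sketch: answer from memory when possible, otherwise run the slow path once. */
final class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical stand-in for the database query

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent only calls the loader when the key is not cached yet.
        return store.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        AtomicInteger dbCalls = new AtomicInteger();
        ReadThroughCache<Integer, String> cache = new ReadThroughCache<>(id -> {
            dbCalls.incrementAndGet(); // pretend this is an expensive database round trip
            return "user-" + id;
        });
        cache.get(1);
        cache.get(1); // served from the cache, the loader is not called again
        System.out.println("database calls: " + dbCalls.get());
    }
}
```

The second `get` never reaches the loader, which is exactly the saving in computation and I/O described above.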

2. Where can the cache exist?

Browser --> between browser and app --> layered app --> database

 

In the picture above, we can see the general flow of a request. Let's redraw it with the app split into layers, which makes the structure a little more complex.

(layering the app)

Browser --> between browser and app --> layered app --> database

 

 

In theory, a cache can work at any link in the request. The first link is the browser: if the data is stored in the browser, access is fastest for the user because no network request is needed. The second link lies between the browser and the app: a cache placed here is transparent to the app, and it stores complete pages. The third link is the app itself: the app has several layers, so a cache can be placed at different layers. This is where the scenarios are most complex, so choose a cache here with care. The fourth link is the database, which can have its own cache, such as MySQL's query cache.

 

So we can add a cache at any point in the request process. But can all data be put into a cache? Of course not. Data that belongs in a cache always has certain characteristics. To decide whether data can be cached, and how, we must start from how the data changes.

 

How does data change? The simplest split is into two kinds: data that changes and data that does not. Data that never changes obviously does not need to be recomputed every time. The problem is that, in theory, all data changes; change is the eternal theme of the world. So dividing data into changing and unchanging is wrong as it stands, and we need one more condition: time. We can then describe data as changing or constant over a period of time, and based on this characteristic place the data in a suitable location and a suitable type of cache.
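"Constant over a period of time" maps naturally onto a time-to-live (TTL). The following is a minimal sketch under that assumption; `TtlCache`, `ttlMillis`, and the injected `clock` are names invented for this example, and a real framework would offer more (size limits, background cleanup).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Sketch of a cache whose entries count as unchanged only for a fixed time window. */
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the behaviour can be checked without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or null if it is missing or its window has passed. */
    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) {
            return null;
        }
        if (clock.getAsLong() >= e.expiresAtMillis) {
            store.remove(key, e); // drop the stale entry, but only if nobody replaced it meanwhile
            return null;
        }
        return e.value;
    }
}
```

In an application the clock would typically be `System::currentTimeMillis`, and the TTL would be chosen from how long the data can be treated as constant.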

 

3. What properties does the cache have?

From an object-oriented point of view, a cache is an object, and an object has attributes. So let's discuss which attributes a cache has. The following are three that we commonly use.

(1) Hit rate

The hit rate is the ratio of the number of times the cache returns a correct result to the number of times the cache is requested. The higher the ratio, the more effectively the cache is being used.

 

The hit rate is a very important issue for a cache. We all hope that our cache's hit rate can reach 100%, but reality rarely cooperates. The hit rate is an important indicator of how effective a cache is.
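As a small illustration of how the hit rate is measured, the sketch below counts lookups and hits on a plain map. `CountingCache` is a name made up for this example; real frameworks expose similar statistics through their own APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: a map-backed cache that counts lookups and hits. */
final class CountingCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private long requests;
    private long hits;

    void put(K key, V value) {
        store.put(key, value);
    }

    V get(K key) {
        requests++;
        V value = store.get(key);
        if (value != null) {
            hits++; // the cache could answer this request
        }
        return value;
    }

    /** hits / requests, or 0.0 before the first lookup. */
    double hitRate() {
        return requests == 0 ? 0.0 : (double) hits / requests;
    }
}
```

For example, three hits out of four lookups give a hit rate of 0.75.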

 

(2) Maximum number of elements

The maximum number of elements that can be stored in the cache. Once the number of elements exceeds this value, the cache's clearing strategy is triggered. Setting the maximum sensibly for each scenario can often improve the cache hit rate to some extent, so that the cache is used more effectively.

 

(3) Clearing strategy

 

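1 FIFO (first in, first out): the data that entered the cache first is cleared first when cache space runs out (when the maximum element limit is exceeded).

2 LFU (Least Frequently Used): the elements that have been used least often so far are cleared. This requires each cached element to have a hit attribute; when cache space runs out, the element with the smallest hit value is cleared from the cache.

3 LRU (Least Recently Used): each cached element has a timestamp; when the cache is full and room is needed for a new element, the existing element whose timestamp is furthest from the current time is cleared from the cache.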
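The three strategies can be sketched with standard JDK collections. This is an illustration under simplifying assumptions (single-threaded use, `maxElements` of at least 1), and the names `EvictionSketch` and `LfuCache` are invented here. FIFO and LRU rely on `LinkedHashMap`, whose iteration order stands in for the timestamp; the LFU variant keeps an explicit hit counter per element.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of the three clearing strategies on a cache capped at maxElements. */
final class EvictionSketch {

    /** accessOrder=false evicts in insertion order (FIFO); true evicts the least recently used. */
    static <K, V> Map<K, V> bounded(int maxElements, boolean accessOrder) {
        return new LinkedHashMap<K, V>(16, 0.75f, accessOrder) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxElements;
            }
        };
    }

    static <K, V> Map<K, V> fifo(int maxElements) {
        return bounded(maxElements, false);
    }

    static <K, V> Map<K, V> lru(int maxElements) {
        return bounded(maxElements, true);
    }

    /** LFU: every element carries a hit counter; the smallest counter is evicted first. */
    static final class LfuCache<K, V> {
        private final int maxElements;
        private final Map<K, V> values = new HashMap<>();
        private final Map<K, Long> hits = new HashMap<>();

        LfuCache(int maxElements) {
            this.maxElements = maxElements;
        }

        V get(K key) {
            V value = values.get(key);
            if (value != null) {
                hits.merge(key, 1L, Long::sum);
            }
            return value;
        }

        void put(K key, V value) {
            if (!values.containsKey(key) && values.size() >= maxElements) {
                // Linear scan for the lowest hit count: simple, but O(n) per eviction.
                K victim = null;
                long min = Long.MAX_VALUE;
                for (Map.Entry<K, Long> e : hits.entrySet()) {
                    if (e.getValue() < min) {
                        min = e.getValue();
                        victim = e.getKey();
                    }
                }
                values.remove(victim);
                hits.remove(victim);
            }
            values.put(key, value);
            hits.putIfAbsent(key, 0L);
        }

        boolean contains(K key) {
            return values.containsKey(key);
        }
    }
}
```

With a limit of two elements, inserting a and b, reading a, and then inserting c evicts a under FIFO but b under LRU, because the read refreshed a.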

 

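4. Cache medium

In terms of hardware there are only two media, memory and hard disk (application-level programs need not consider registers and the like). But we usually do not divide by hardware; the usual division is by technology, into memory, disk files, and databases.

(1) Memory. Keeping the cache in memory is the fastest option: any program operates on memory much faster than on disk. But you must consider what happens on a crash, because data held in memory is not persisted; if there is no backup on disk, the data is hard or impossible to recover after the machine goes down.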

 

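(2) Hard disk. Many caching frameworks combine memory and disk. For example, when the space allocated in memory is full, the user can choose to persist the data that must leave memory to disk. You can also choose to write a copy of the data straight to disk (one copy in memory, one on disk, so a crash is no problem). Other caches put data directly on disk.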
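The overflow-to-disk behavior can be sketched as follows. This is an illustrative toy, not how any particular framework does it: `TwoTierCache` is a made-up name, values are plain strings, each key becomes one file, and keys are assumed to be safe file names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: a small in-memory tier whose evicted entries overflow to files on disk. */
final class TwoTierCache {
    private final Path dir;
    private final Map<String, String> memory;

    TwoTierCache(int maxInMemory, Path dir) {
        this.dir = dir;
        // Access-ordered LinkedHashMap: the eldest entry is the least recently used one.
        this.memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= maxInMemory) {
                    return false;
                }
                writeToDisk(eldest.getKey(), eldest.getValue()); // persist before dropping from memory
                return true;
            }
        };
    }

    void put(String key, String value) {
        memory.put(key, value);
    }

    /** Looks in memory first, then on disk; returns null when the key is in neither tier. */
    String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        Path file = fileFor(key);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }

    private void writeToDisk(String key, String value) {
        try {
            Files.write(fileFor(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: keys are safe to use as file names; a real cache would encode or hash them.
    private Path fileFor(String key) {
        return dir.resolve(key + ".cache");
    }
}
```

Writing every entry to disk on `put` instead of only on eviction would give the "one copy in memory, one on disk" variant, at the cost of slower writes.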

 

 

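(3) Database. At this point some may wonder: didn't we say earlier that we want to reduce the number of database queries and the computing load on the database? Why use a database as the cache medium now? The reason is that there are many kinds of database. Berkeley DB, for example, does not support SQL and has no SQL engine; it is just a key-value storage structure, so it is very fast. On a typical modern PC, more than 100,000 queries per second is no problem (this depends on the characteristics of the workload, of course; if the data you access is uniformly distributed, ahuaxuan cannot guarantee this speed).

(Redis is an open-source, log-structured key-value database written in ANSI C. It supports networking, can run in memory or persist to disk, and provides APIs for many languages.)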

 

 

 

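Besides the cache medium, ahuaxuan divides caches into local cache and remote cache according to how tightly the cache is coupled to the application.

A local cache is a cache component contained within the application, while a remote cache is a cache component decoupled from the application and running outside it. Typical local caches are ehcache and oscache; the best-known remote cache is memcached.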

 

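The biggest advantage of a local cache is that the application and the cache live in the same process, so cache requests are very fast and incur no network overhead at all. A local cache therefore suits a single application that needs no cluster, or a cluster in which cache nodes do not need to notify each other. This is also why ehcache and oscache are so popular in Java.

But a local cache has drawbacks. Caching frameworks of this kind (such as ehcache or oscache in Java) are tied to the application, so multiple applications cannot share a cache directly, and the problem is more obvious in an application cluster. Some cache components do let cluster nodes notify each other of cache updates, but because this is done by broadcast or ring update, frequent cache updates cause very high network I/O overhead, which in severe cases affects the normal operation of the application. And if the cache holds a lot of data, using a local cache means every application holds a copy that large, which is a plain waste of memory.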

 

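In that situation we usually choose a remote cache such as memcached. In a clustered or distributed deployment every application can then share the data in memcached. The applications connect directly to memcached through sockets, using the memcached protocol on top of TCP/IP; when one app updates a value in memcached, all applications can read the latest value. This adds a good deal of network overhead, but the approach is far more common than having local caches broadcast or ring-update their nodes, and it performs better too. Because only one copy of the data is stored, memory utilization also improves.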
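To make "the memcached protocol on top of TCP/IP" concrete, the sketch below formats two requests in memcached's text protocol: a `set` (header line `set <key> <flags> <exptime> <bytes>` followed by the data block) and a `get`. The class name `MemcachedText` and its helper methods are invented for this illustration, and the flags field is fixed at 0 for simplicity. In practice an application would use an existing memcached client library rather than building requests by hand.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: formatting requests for memcached's text protocol, which a client writes to a TCP socket. */
final class MemcachedText {

    /** Builds "set <key> 0 <exptime> <bytes>\r\n<data>\r\n" with the flags field fixed at 0. */
    static byte[] set(String key, int exptimeSeconds, String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        // The byte count in the header is the length of the encoded data, not of the Java string.
        String header = "set " + key + " 0 " + exptimeSeconds + " " + data.length + "\r\n";
        byte[] head = header.getBytes(StandardCharsets.UTF_8);

        byte[] out = new byte[head.length + data.length + 2];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(data, 0, out, head.length, data.length);
        out[out.length - 2] = '\r';
        out[out.length - 1] = '\n';
        return out;
    }

    /** Builds "get <key>\r\n". */
    static byte[] get(String key) {
        return ("get " + key + "\r\n").getBytes(StandardCharsets.UTF_8);
    }
}
```

Every application instance that writes these bytes to the same memcached server sees the same single copy of the data, which is the sharing property described above.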

 

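The analysis above shows that both local cache and remote cache have their place in the caching field, so ahuaxuan recommends that when choosing or using a cache you judge which one to use from the characteristics of the cache and your business scenario. Only then can the cache deliver its full value.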

 

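Ahuaxuan believes that using caches is an essential skill for an architect. A good architect can judge from the type of data and the business scenario which type of cache to use and how to use it. There is no silver bullet in the world of caching: no single cache yet solves every business scenario or data type. If such a technology appeared, architects would be worth even less. Heh.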

OSCache
  

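  OSCache is a widely used, high-performance J2EE caching framework, and it can serve as a general caching solution for any Java application.

  OSCache has the following features:

  Cache any object: you can cache parts of JSP pages or HTTP requests without restriction, and any Java object can be cached.

  Comprehensive API: the OSCache API gives you full programmatic control over all OSCache features.

  Persistent cache: the cache can optionally be written to disk, so expensive-to-create data stays cached even across application restarts.

  Cluster support: clustering of cached data can be enabled with a single configuration parameter, with no code changes.

  Expiry of cache entries: you have maximum control over how cached objects expire, including pluggable refresh policies (if the default behavior does not meet your needs).

  Official website http://www.opensymphony.com/oscache/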
  
  Java Caching System
  
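  JCS (Java Caching System) is a distributed caching system for server-side Java applications. It speeds up dynamic web applications by providing a way to manage various kinds of dynamic cached data.

  Like other caching systems, JCS is meant for applications with frequent reads and infrequent writes.

  Dynamic content and reporting systems can gain better performance.

  A website with a repetitive structure, backed by a database that is updated intermittently rather than continuously, and in which the same searches repeatedly return the same results, can improve its performance and scalability through caching.

  Official website http://jakarta.apache.org/turbine/jcs/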
  
  EHCache
  
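  EHCache is a pure Java, in-process cache with the following features: fast, simple, acts as a pluggable cache for Hibernate 2.1, minimal dependencies, and comprehensive documentation and tests.

  Official website http://ehcache.sourceforge.net/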
  
  JCache
  
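  JCache is an open-source project that aims to be an open-source version of JSR-107. The JSR-107 specification has not changed for many years, and this version is still built on the original functional definition.

  Official website http://jcache.sourceforge.net/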
  
  ShiftOne
  
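  ShiftOne Java Object Cache is a Java library that implements several strict object-caching policies, along with a lightweight framework for configuring cache behavior.

  Official website http://jocache.sourceforge.net/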
  
  SwarmCache
  
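  SwarmCache is a simple but effective distributed cache. It uses IP multicast to communicate with other hosts on the same local network, and it is designed specifically for clustered, data-driven web applications. SwarmCache provides better performance for typical applications in which read operations greatly outnumber write operations.

  SwarmCache uses JavaGroups to manage membership and the communication of the distributed cache.

  Official website http://swarmcache.sourceforge.net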
  
  TreeCache / JBossCache
  
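  JBossCache is a replicated, transactional cache that lets you cache enterprise application data to improve performance. Cached data is replicated automatically, making it easy to cluster JBoss servers. JBossCache can run as an MBean service inside the JBoss application server or another J2EE container, and it can also run standalone.

  JBossCache includes two modules: TreeCache and TreeCacheAOP.

  TreeCache is a tree-structured, replicated, transactional cache.

  TreeCacheAOP is an "object-oriented" cache that uses AOP to manage POJOs (Plain Old Java Objects) dynamically.

  Note: AOP, short for Aspect Oriented Programming, is a continuation of OOP.

  Official website http://www.jboss.org/products/jbosscache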
  
  WhirlyCache
  
  Whirlycache is a fast, configurable, in-memory object cache

