JBoss Series, Part 28: Introduction to JBossCache


Abstract

This article introduces JBossCache, mainly from the perspective of Bela Ban, the original creator of the JBossCache project, and Manik Surtani, the current project lead. JBossCache community link: http://www.jboss.org/jbosscache

JBoss Cache Profile

JBoss Cache is a distributed cache for enterprise applications. It targets distributed and clustered enterprise applications that access the same Java objects frequently: caching those objects improves availability and significantly improves overall performance. JBoss Cache is a standalone product; you can use it on its own or deploy it as a registered service on a middleware platform. It is also a Java class library, so you can extend it and embed it in your own application. A typical example is JBoss itself, where EJB clustering, JMS clustering, web application clustering, and JNDI clustering are all built on JBoss Cache.

JBoss Cache is a replicated, transactional cache. It is replicated because multiple JBoss Cache instances can run in a distributed fashion (in the same JVM or in different JVMs, on the same host or on different hosts) and data is replicated across the whole group. It is transactional because users can configure a JTA-compliant transaction manager and make cache operations transactional. Note that the cache can also run without any replication; this is local mode.

JBoss Cache is an open source product with an active community of developers and contributors. It comes in two editions: Core and POJO. The Core library (using the org.jboss.cache.Cache interface) is the base library: it organizes data in a tree structure and handles locking, passivation, eviction, and replication. The POJO library (using the org.jboss.cache.pojo.PojoCache interface) is built on top of the Core library; it introspects objects through JBoss AOP and keeps them consistent transparently. Bela Ban originally created the project, and Manik Surtani is now responsible for it. Below are interviews with Bela Ban and Manik Surtani that show how they position and describe JBoss Cache.
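
To make the tree structure of the Core library more concrete, here is a minimal sketch in plain Java. It deliberately does not use the org.jboss.cache API: the class name TreeCacheSketch, its methods, and the slash-separated path strings are all assumptions made for illustration. It only shows the idea of nodes addressed by a path, each holding its own key/value pairs.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a tiny in-memory tree of nodes, each holding key/value pairs.
// This is NOT the org.jboss.cache API; every name here is hypothetical.
class TreeCacheSketch {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        final Map<String, Object> data = new HashMap<>();
    }

    private final Node root = new Node();

    // Walk a path such as "/users/42", optionally creating missing nodes.
    private Node find(String path, boolean create) {
        Node current = root;
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            Node next = current.children.get(part);
            if (next == null) {
                if (!create) {
                    return null;
                }
                next = new Node();
                current.children.put(part, next);
            }
            current = next;
        }
        return current;
    }

    public synchronized void put(String path, String key, Object value) {
        find(path, true).data.put(key, value);
    }

    public synchronized Object get(String path, String key) {
        Node node = find(path, false);
        return node == null ? null : node.data.get(key);
    }

    // Removing a node drops its whole subtree. Paths must start with "/".
    public synchronized boolean removeNode(String path) {
        int cut = path.lastIndexOf('/');
        if (cut < 0) {
            return false;
        }
        Node parent = find(path.substring(0, cut), false);
        return parent != null && parent.children.remove(path.substring(cut + 1)) != null;
    }

    public static void main(String[] args) {
        TreeCacheSketch cache = new TreeCacheSketch();
        cache.put("/users/42", "name", "Ada");
        System.out.println(cache.get("/users/42", "name"));
        cache.removeNode("/users");
        System.out.println(cache.get("/users/42", "name"));
    }
}
```

A real cache adds locking per node, eviction, passivation, and replication on top of this shape; the sketch uses one coarse lock only to stay short.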

Bela Ban Interview

Q: Can you tell us a little about yourself and your work?

A: I am the lead of the JGroups (www.jgroups.org) and JBossCache (www.jboss.org) projects. JGroups is a reliable multicast communication toolkit written in Java, and it is the foundation of JBoss clustering and JBossCache. I was born in Switzerland and completed my doctorate at the University of Zurich. I also worked for four years at IBM's research lab in Zurich. Then I moved to the United States, did a postdoc at Cornell University, and worked at Fujitsu until 2003. I joined the JBoss community in 2003.

Q: You just released JBossCache. Can you tell us a little about the product?

A: JBossCache is a replicated, transactional tree cache that can be used to replicate data transactionally across processes. If you add an element to one tree, it appears in all the trees in the cluster. The cache can be either local or replicated across a cluster. With cluster replication, we can replicate asynchronously or synchronously: synchronous replication blocks the caller until all trees in the cluster have been updated, while asynchronous replication puts the modification on the channel and returns immediately. JBossCache can also use transactions, which means that all modifications made inside a transaction are collected and replicated only at the end, when the transaction commits. Of course, if you roll back the transaction, nothing is replicated. To summarize, I would say JBossCache has three modes:

As a purely local cache, with no replication.
As a replicated cache with asynchronous replication. You have a master server that updates its tree, and the updates are copied to all backup servers either immediately or periodically, but always in the background (without blocking the caller).
As a replicated cache with synchronous replication. Here we run a two-phase commit protocol across all the trees in the cluster to ensure data consistency; in this case we need to be sure that the changes have been applied to every tree in the cluster. Of course, you pay a price in performance for this reliability.
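
The difference between the three modes can be sketched in plain Java as follows. This is a simplified model, not JBoss Cache code: the "peers" are maps in the same JVM rather than remote nodes, there is no two-phase commit, and the names ReplicationModesSketch, Mode, addPeer, and shutdown are assumptions chosen for this example. It only shows who waits for whom: in synchronous mode the caller returns after every peer has the update, in asynchronous mode the update is handed to a background thread.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: peers are in-process maps, not remote JVMs, and there is
// no two-phase commit. None of these names come from the JBoss Cache API.
class ReplicationModesSketch {
    enum Mode { LOCAL, ASYNC, SYNC }

    private final Mode mode;
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final List<Map<String, String>> peers = new CopyOnWriteArrayList<>();
    private final ExecutorService background = Executors.newSingleThreadExecutor();

    ReplicationModesSketch(Mode mode) {
        this.mode = mode;
    }

    // Register a simulated backup node and return its map for inspection.
    Map<String, String> addPeer() {
        Map<String, String> peer = new ConcurrentHashMap<>();
        peers.add(peer);
        return peer;
    }

    void put(String key, String value) {
        local.put(key, value);
        if (mode == Mode.SYNC) {
            replicate(key, value);                            // caller waits for all peers
        } else if (mode == Mode.ASYNC) {
            background.execute(() -> replicate(key, value));  // caller returns immediately
        }
        // Mode.LOCAL: nothing leaves this instance
    }

    private void replicate(String key, String value) {
        for (Map<String, String> peer : peers) {
            peer.put(key, value);
        }
    }

    String get(String key) {
        return local.get(key);
    }

    // Drain queued asynchronous updates and stop the background thread.
    boolean shutdown() {
        background.shutdown();
        try {
            return background.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReplicationModesSketch cache = new ReplicationModesSketch(Mode.SYNC);
        Map<String, String> backup = cache.addPeer();
        cache.put("session", "state");
        System.out.println(backup.get("session"));
        cache.shutdown();
    }
}
```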

Q: What made you want to develop a cache product?

A: Enterprise applications need caching to speed things up, for example for EJBs and HTTP session replication.

Q: In any project, we are always guessing where the performance bottlenecks are. Where do you think the bottlenecks are in JBossCache?

A: First, two-phase commit is very expensive. Another bottleneck is that if you access the same object from different nodes in the same cluster, you may run into a deadlock. We solve this with lock acquisition timeouts: when two transactions access the same object at the same time, one of them may be rolled back while the other succeeds. In the long run, however, we hope to come up with a distributed deadlock detection mechanism.
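
The lock acquisition timeout idea can be illustrated with the standard java.util.concurrent locks. This is a single-JVM sketch under stated assumptions: the class LockTimeoutSketch and the method updateBoth are hypothetical, and a real cluster would acquire locks on remote nodes rather than two local ReentrantLock objects. The point is only that a caller that cannot get a lock within the timeout gives up and releases what it holds, so two callers cannot wait on each other forever.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: local locks stand in for per-node cache locks.
class LockTimeoutSketch {
    // Try to take both locks; return false (treat as a rollback) if either
    // cannot be acquired within the timeout. Locks already held are released.
    static boolean updateBoth(ReentrantLock first, ReentrantLock second,
                              long timeoutMillis, Runnable work) {
        try {
            if (!first.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            try {
                if (!second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false; // the finally block below releases 'first'
                }
                try {
                    work.run();
                    return true;
                } finally {
                    second.unlock();
                }
            } finally {
                first.unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        System.out.println(updateBoth(a, b, 100, () -> { }));
    }
}
```

A timeout does not detect the deadlock; it only bounds how long a caller can be stuck, which is why the answer above still mentions distributed deadlock detection as a longer-term goal.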

Q: Can you tell us about any bottlenecks you encountered?

A: We did not really find any unexpected bottlenecks. One of our unit tests once failed with a timeout, and we realized that it was performing 500 two-phase commits, which was the cause of the problem. Of course, when we use the tree structure of the cache we do not have this problem.

Q: You are going to HP Labs to do some performance testing. That should be an interesting experience; can you tell us a little about it?

A: HP realized that many of their customers use JBoss, so naturally they want to be able to support JBoss and to resolve performance problems themselves before escalating them to the JBoss community. So they are willing to provide a performance test lab for us. We will test the performance of JGroups and JBossCache, and I am also interested in testing JBoss clustering performance.

Q: If you could change one design decision in Java, what would you choose?

A: I am happy to be working with Java, and I think Sun has done a good job. A few suggestions: NIO is missing some important features, such as support for multicast sockets and SSL sockets, and we need more low-level socket access so that Java applications have more options.

Manik Surtani Interview

Q: Manik, could you first tell us, from what you have seen of your customers, how most people use JBoss Cache? What advantages does caching bring, particularly for high availability?

A: Reading data from persistent storage, especially a database, is expensive. Databases are also notoriously hard (or at least not cheap) to scale, and this becomes an obvious obstacle when you want to scale out or add more front-end clients. On the other hand, CPUs and memory keep getting cheaper, which means more people can afford to build highly available systems. The "site is down for maintenance" page should be a thing of the past. A distributed cache such as JBoss Cache sits in the middle tier between the front-end application and the database, providing fast in-memory access to persistent state. JBoss Cache keeps the cached state consistent with the state in the database, keeps the data up to date, and ensures that the JVM heap does not overflow.

Q: How does JBoss Cache integrate with other open source projects, such as Hibernate and JBoss Seam?

A: Quite a few open source projects use JBoss Cache. Hibernate (and the EJB3 implementation in JBoss Application Server) uses JBoss Cache to store entity data read from the back-end database, so that you do not need to go to the database every time an entity is looked up. That is a deliberately simple summary; Hibernate's actual use of a distributed cache is more sophisticated. Seam also uses a distributed cache to store generated JSF page fragments, which improves the scalability of sites whose pages or page fragments are slow to generate. A number of other open source projects and features use JBoss Cache as well, such as Lucene, Hibernate Search, GridGain, and the HTTP session clustering and clustered single sign-on code in JBoss Application Server.

Q: JBoss Cache provides two kinds of cache: the Core cache and the POJO Cache. Could you summarize the main differences between them?

A: The core cache takes the data you pass to it and stores it directly in a tree structure. Key/value pairs are stored in the nodes of the tree and are serialized when they need to be replicated or persisted. The POJO cache uses a more sophisticated mechanism: it uses bytecode weaving to introspect user classes and attach listeners to their fields, so that the cache is notified as soon as any field changes. For example, if you store a large, complex object in the POJO cache, the cache introspects the object's bytecode and ultimately stores only the object's primitive fields in the tree structure. When a field changes, the cache replicates only the changed field value rather than the whole user object, which gives efficient, fine-grained replication. There are other differences too, but those are the main ones.
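
Fine-grained replication can be sketched without any bytecode weaving by letting the setters record what changed. This is an assumption-laden stand-in, not how PojoCache is implemented: the classes FieldLevelReplicationSketch and Person, and the methods drainChanges and apply, are hypothetical, and the "replica" is a plain map of field names to values. It shows only that after one field changes, the delta sent to the replica contains that one field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: explicit dirty tracking stands in for woven field listeners.
class FieldLevelReplicationSketch {
    static final class Person {
        private String name;
        private String city;
        // Fields changed since the last replication, in the order they changed.
        private final Map<String, Object> dirty = new LinkedHashMap<>();

        void setName(String name) {
            this.name = name;
            dirty.put("name", name);
        }

        void setCity(String city) {
            this.city = city;
            dirty.put("city", city);
        }

        String getName() {
            return name;
        }

        String getCity() {
            return city;
        }

        // Hand over the pending changes and reset the tracker.
        Map<String, Object> drainChanges() {
            Map<String, Object> changes = new LinkedHashMap<>(dirty);
            dirty.clear();
            return changes;
        }
    }

    // A replica stores the object as flat field/value pairs and applies deltas.
    static void apply(Map<String, Object> replica, Map<String, Object> changes) {
        replica.putAll(changes);
    }

    public static void main(String[] args) {
        Person person = new Person();
        Map<String, Object> replica = new LinkedHashMap<>();
        person.setName("Ada");
        person.setCity("London");
        apply(replica, person.drainChanges());
        person.setCity("Paris");
        System.out.println(person.drainChanges()); // only the changed field is pending
    }
}
```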

Q: Fine-grained replication must lead to big performance differences between the POJO cache and the core cache. Have you benchmarked the difference between the two?

A: Such a comparison depends heavily on the system configuration, so a general benchmark would not mean much. With large, complex objects in the cache, fine-grained replication really does help performance. But if you only use the cache to store a few Strings, fine-grained replication adds no particular value. Similarly, using the POJO Cache for simple user objects, say a Person class with only two String fields, will not help performance much; the extra overhead is wasted.

Q: How do you manage referential integrity, especially in the POJO Cache?

A: If you mean references to objects, you have just pointed at the reason we introduced bytecode weaving. We add interceptors to the POJO and insert references to the cached content of its fields.

Q: Why should users pick a local cache over a HashMap?

A: Many people regard a Map as the starting point for a cache (in fact, the JSR-107 JCACHE expert group based javax.cache.Cache on an extension of Map). A Map is well suited to storing simple key/value pairs, but it falls short on the other properties a cache needs, such as memory management (eviction), passivation and persistence, and a fine-grained locking model (for a start, HashMap is not thread-safe, and ConcurrentHashMap locks at a coarse-grained level and does not even allow multiple readers to read from the map without blocking). A cache worth the name also needs some "enterprise" features, including JTA compatibility and extras such as listeners. So although a Map is a good starting point, if you need to implement or manage the features I just mentioned, a real cache is a more appropriate choice than a Map.
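
To see what "a Map plus eviction" looks like in practice, here is a small sketch that bolts least-recently-used eviction onto java.util.LinkedHashMap. The class name LruEvictionSketch and the maxEntries parameter are assumptions for this example; the LinkedHashMap access-order constructor and the removeEldestEntry hook are standard JDK features. Note how much is still missing compared with the list above: the class is not thread-safe, and it has no passivation, persistence, transactions, or listeners.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a bounded map that evicts the least recently used entry.
class LruEvictionSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruEvictionSketch(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts 'eldest'.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruEvictionSketch<String, Integer> cache = new LruEvictionSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3);   // exceeds the limit and evicts "b"
        System.out.println(cache.keySet());
    }
}
```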

Q: What kind of locking does the distributed cache use? Is it the same mechanism a traditional database uses?

A: JBoss Cache traditionally uses pessimistic locking, with one lock per node in the tree. The isolation levels correspond to database isolation levels, and the locks allow multiple concurrent readers. We also offer optimistic locking, which involves versioning the data, maintaining a copy for each transaction, and validating that copy against the main tree structure when the transaction commits. Optimistic locking gives a high degree of concurrency in systems that serve a large number of read requests, because readers are not blocked by concurrent writers. Optimistic locking also avoids the deadlocks that can occur with pessimistic locking.
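
The version-and-validate idea behind optimistic locking can be sketched with a single versioned value and a compare-and-set at commit time. This is a simplified, single-value model and not the JBoss Cache implementation: the names OptimisticVersionSketch, Versioned, read, and commit are assumptions. Each "transaction" keeps the snapshot it read; the commit succeeds only if nobody else has committed since, and readers never take a lock.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: one versioned value instead of a whole tree of nodes.
class OptimisticVersionSketch {
    static final class Versioned {
        final long version;
        final String value;

        Versioned(long version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    private final AtomicReference<Versioned> current =
            new AtomicReference<>(new Versioned(0, null));

    // Readers take a snapshot and never block.
    Versioned read() {
        return current.get();
    }

    // Commit succeeds only if 'seen' is still the current snapshot,
    // i.e. no other transaction committed after it was read.
    boolean commit(Versioned seen, String newValue) {
        return current.compareAndSet(seen, new Versioned(seen.version + 1, newValue));
    }

    public static void main(String[] args) {
        OptimisticVersionSketch node = new OptimisticVersionSketch();
        Versioned tx1 = node.read();
        Versioned tx2 = node.read();
        System.out.println(node.commit(tx1, "first"));   // commits
        System.out.println(node.commit(tx2, "second"));  // fails validation
    }
}
```

The losing transaction has to re-read and retry (or roll back), which is the price paid for never blocking readers.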

We are working hard on multi-version concurrency control (MVCC), a feature of JBoss Cache 3.0.0, which is currently at the release stage. Most database systems use MVCC for locking, and it gives us the best of both optimistic and pessimistic locking. Because our implementation never blocks readers, data access is also much faster than before. Once MVCC is reasonably stable, we hope to make it the default locking scheme in JBoss Cache.

Q: Can you talk about the JGroups integration?

A: JBoss Cache uses JGroups as its group communication library, to discover group members and form clusters. We also use JGroups as a channel, on top of which we have implemented an RPC mechanism for talking to the other cache instances. Using JGroups gives JBoss Cache a great deal of flexibility, because the network protocol stack is highly extensible and tunable. This also lets JBoss Cache break out of the confines of LAN clusters, traverse firewalls, set up WAN clusters, and so on.

Q: Can the cache be used separately from JBoss AS?

A: Of course! Many people mistakenly believe that JBoss Cache has to be used inside JBoss Application Server, but that is not the case. JBoss Cache can be used in standalone Java programs, in GUI front ends, and in other application servers. We simply bundle it with JBoss Application Server when we ship it.

Q: The key to failover is replicating data to multiple nodes, and in practice there are many strategies for doing so. Which replication modes does JBoss Cache support?

A: We currently support two modes: total replication (TR) and buddy replication (BR). Total replication copies state to every member of the group. This shares state among all members and guarantees that you can fail over to any member of the group, but it limits the scalability of the system. Buddy replication selects specific members to act as backups and copies state only to those nodes. Failing over to a backup node is very efficient, but failing over to a node that is not a backup also works smoothly, because the state is migrated to that node on demand. BR works best with session affinity: migrating state can be expensive, so you should try to trigger it only when a failover actually occurs.
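
The difference in fan-out between the two modes can be shown with a few lines of plain Java. This sketch assumes a simple "next members in the ring" policy for choosing buddies; that policy, the class BuddySelectionSketch, and the method names are assumptions for illustration and say nothing about how JBoss Cache actually selects or configures buddies. It only computes which members would receive a copy of one owner's state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: computes replication targets, performs no replication.
class BuddySelectionSketch {
    // Total replication: every other member receives a copy.
    static List<String> totalReplicationTargets(List<String> members, String owner) {
        List<String> targets = new ArrayList<>(members);
        targets.remove(owner);
        return targets;
    }

    // Buddy replication (assumed policy): the next 'numBuddies' members
    // after the owner, wrapping around the end of the membership list.
    static List<String> buddyTargets(List<String> members, String owner, int numBuddies) {
        int start = members.indexOf(owner);
        if (start < 0) {
            throw new IllegalArgumentException("unknown member: " + owner);
        }
        List<String> targets = new ArrayList<>();
        for (int i = 1; i < members.size() && targets.size() < numBuddies; i++) {
            targets.add(members.get((start + i) % members.size()));
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("A", "B", "C", "D");
        System.out.println(totalReplicationTargets(members, "B"));
        System.out.println(buddyTargets(members, "D", 1));
    }
}
```

With total replication the number of copies grows with the cluster size, while with buddy replication it stays at the configured number of buddies, which is the scalability trade-off described above.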

Q: In some frameworks, peer-to-peer replication hurts the scalability of the system. Does JBoss Cache have a similar problem?

A: No. Peer-to-peer networking and group communication are highly efficient and scalable on a LAN with IP multicast, and most modern network infrastructure supports IP multicast. Replicating data peer to peer, however, does affect scalability, because every node holds the state of the whole system; that is the caveat I mentioned earlier about total replication. For that reason, we recommend buddy replication together with session affinity.

Q: What do you expect from caching and clustering in the near future, and how will JBoss Cache meet users' new needs?

A: As hardware gets cheaper and CPU manufacturers put more and more cores on each chip, distributed caching will become increasingly important. More cores mean more "virtual" machines, which means databases will have to work ever harder to cope with the high degree of concurrency, and it means distributed caches will become one of the most important solutions to data bottlenecks. The growing popularity of data grids and cloud computing will also drive distributed caching forward, since the nodes of a cloud or a data grid all need to access and share data.

Reproduced from: https://my.oschina.net/iwuyang/blog/197230