Object Caching in a Web Portal Application Using JCS
by Srini Penchikala
Recently, in the web portal application I have been working on, we had a requirement to store some lookup data (such as rate revision data, state and product lists) in the memory of a servlet container (Tomcat) so we wouldn't hit the back-end database every time we access the data. At the same time, we needed to refresh the data stored in the memory at predefined time intervals so it didn't become stale and inaccurate. We also needed a mechanism to refresh different types of data stored in the memory at different time intervals. For example, rate revision data had to be refreshed once every day, whereas the lookup data could stay in the memory for longer periods of time. Object caching was the perfect solution to accomplish all of these tasks in one shot.
The best definition of object caching can be found in the functional specification document for Object Caching Service for Java (OCS4J), which says:
A server must manage information and executable objects that fall into three basic categories: objects that never change, objects that are different with every request, and everything in between. Java is well equipped to handle the first two cases but offers little help for the third. If the object never changes, we create a static object when the server is initialized. If the object is unique to every request, we create a new object each time. For everything in between, objects or information that can change and are shared across requests, between users or between processes, there is the "Object Caching Service."
Basically, object caching is all about avoiding the expensive re-acquisition of objects by not releasing the objects immediately after their use. Instead, the objects are stored in memory and are reused for any subsequent client requests. When the data is retrieved from the data source for the first time, it is temporarily stored in a memory buffer called a cache. When the same data needs to be accessed again, the object is fetched from the cache instead of being acquired again from the data source. Finally, the cached data is released from memory when it's no longer needed. The point at which a specific object can be released is controlled by defining a reasonable expiration time, after which the data stored in the object is considered invalid from the web application's standpoint.
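The fetch, reuse, and expire cycle described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the JCS API: the class name ExpiringCache, the loader callback, and the injectable clock are all hypothetical, and the seven-day lifetime for lookup data is an assumed value (the requirement only says it can live longer than the daily rate revision data).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical expiring cache for illustration; this is not the JCS API. */
class ExpiringCache<K, V> {

    /** A cached value plus the instant (in milliseconds) at which it becomes stale. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxLifeMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    ExpiringCache(long maxLifeMillis, LongSupplier clock) {
        this.maxLifeMillis = maxLifeMillis;
        this.clock = clock;
    }

    /**
     * Returns the cached value for the key. The loader (standing in for the
     * data source) is called only when the entry is missing or has expired.
     */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        return entries.compute(key, (k, old) ->
                (old == null || now >= old.expiresAtMillis)
                        ? new Entry<V>(loader.apply(k), now + maxLifeMillis)
                        : old).value;
    }

    public static void main(String[] args) {
        LongSupplier wallClock = System::currentTimeMillis;
        // Different kinds of data get different lifetimes (7 days is an assumed value).
        ExpiringCache<String, String> rates =
                new ExpiringCache<>(TimeUnit.DAYS.toMillis(1), wallClock);
        ExpiringCache<String, String> lookups =
                new ExpiringCache<>(TimeUnit.DAYS.toMillis(7), wallClock);

        Function<String, String> fakeDatabase = key -> "value-for-" + key;
        System.out.println(rates.get("rate-revision", fakeDatabase));
        System.out.println(lookups.get("state-list", fakeDatabase));
    }
}
```

Using one cache instance per kind of data is one simple way to give each kind its own refresh interval. Note that this sketch only replaces stale entries when they are requested again; it never evicts untouched entries, so a real caching framework still has to manage memory separately.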
Benefits of Caching
One of the main benefits of object caching is the significant improvement in application performance. In a multi-tiered application, data access is an expensive operation, compared to any other task. By keeping the frequently accessed data around and not releasing it after the first use, we can avoid the cost and time required for the reacquisition and release of the data. This results in greatly improved web application performance, since we won't be hitting the database every time we need the data.
Scalability is another benefit of using object caching. Since cached data is accessed across multiple sessions and web applications, object caching can become a big part of a scalable web application design. Object caching, just like object pooling, can help avoid the cost of acquiring and releasing the objects.
Liabilities of Caching
Object caching has many advantages in a web application, but it also has a few disadvantages:
- Synchronization complexity: Depending on the kind of data, complexity increases, because consistency between the state of the cached data and the original data in the data source needs to be ensured. Otherwise, the cached data can get out of sync with the actual data, which will lead to data inaccuracies.
- Durability: Changes to the cached data can be lost when the server crashes. This problem can be avoided if a synchronized cache is used.
- Memory size: JVM memory size can get unacceptably huge if there is a lot of unused data in the cache and the data is not released from memory at regular intervals.
Any data that does not change frequently and takes a long time to retrieve from the data source is a good candidate for caching. This includes pretty much all types of lookup data, code and description lists, and common search results with paging functionality (search results can be extracted from the data source once and stored in the cache to be used when the user clicks on a paging link on the results screen).
Middleware technologies such as EJB and CORBA allow whole objects to be transferred between the client and the server. This type of access, also known as coarse-grained data access, is done to minimize the number of expensive remote method invocations. These data transfer objects (also known as value objects) can be stored in the cache if the objects don't change very frequently. This reduces the number of times the servlet container needs to access the application server when the client needs the value objects.
How about the types of data that are not a good fit for caching? Here's a list of the data that's not recommended to be cached:
- Secure information that could be exposed to other users of the web site.
- User profile information.
- Personal information, such as Social Security Number or credit card details.
- Business information that changes frequently and causes problems if not up to date and accurate.
- Session-specific data that may not be intended to be accessed by other users.
There are several object-caching frameworks (in both open source and commercial implementations) that provide distributed caching in servlet containers and application servers. Following is a list of some of the currently available caching frameworks:
- Java Caching System (JCS) from Jakarta (part of the Turbine project)
- Commons Collections (another Jakarta project)
- JCache API (SourceForge.net)
- SpiritCache (from SpiritSoft)
- Coherence (Tangosol)
- Javlin (eXcelon)
- Object Caching Service for Java (Oracle)
If you are interested in reading more about any of these caching implementations, the resources section at the end of this article has URL links to all of these frameworks.
I looked at Commons Collections and Java Caching System API when I first started my research for a caching framework that would address all of our caching requirements. My main criteria in the initial evaluation were ease of use, future extensibility, and software cost. Both of these frameworks are open source and available from the Apache Jakarta project.
The Commons Collections API provides the object caching mechanism in the form of a Java class called LRUMap. If you have a basic requirement for caching and don't really need any advanced, configurable caching features, then Commons Collections may not be a bad choice, but it is not the answer if you are looking for an enterprise-level caching system. If you choose Commons Collections to implement object caching, you will have to manually define and configure all of the caching parameters, such as the maximum number of objects that can be cached, the maximum life of a cached object, the idle time to refresh the cache or check for "stale" objects, etc.
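To make the "basic caching" case concrete, here is a minimal least-recently-used map. It is a stand-in built on the JDK's java.util.LinkedHashMap rather than on Commons Collections' LRUMap, so that it runs without extra .jar files; the class name LruCacheSketch and the maxObjects parameter are hypothetical names chosen for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache built on the JDK; a stand-in for LRUMap, not LRUMap itself. */
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxObjects;

    LruCacheSketch(int maxObjects) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxObjects = maxObjects;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxObjects;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> states = new LruCacheSketch<>(2);
        states.put("MI", "Michigan");
        states.put("OH", "Ohio");
        states.get("MI");               // touch MI so that OH becomes the eldest entry
        states.put("IN", "Indiana");    // exceeds maxObjects, so OH is evicted
        System.out.println(states.keySet());
    }
}
```

A map like this bounds only the number of cached objects. Maximum life, idle time, and stale-object checks would all have to be written by hand on top of it, and it is not thread-safe unless wrapped (for example with Collections.synchronizedMap), which is the manual work the paragraph above refers to.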
Java Caching System
Java Caching System (JCS), part of the Jakarta Turbine project, is more sophisticated and extensive than Commons Collections. It's a highly flexible and configurable solution to increase overall system performance by maintaining dynamic pools of frequently used objects. As mentioned on its web site, JCS goes beyond simply caching objects in memory. It provides several important features necessary for an enterprise-level caching system:
- Memory management
- Disk overflow (and defragmentation)
- Element grouping
- Quick nested categorical removal
- Data expiration
- Extensible framework
- Fully configurable runtime parameters
- Remote synchronization
- Remote store recovery
- Optional lateral distribution of elements via HTTP, TCP, or UDP protocols
JCS provides a framework with no point of failure, allowing for full session failover (in clustered environments), including session data across up to 256 servers. It also provides the flexibility to configure one or more data storage options, such as memory cache, disk cache, or caching the data on a remote machine.
All of these useful features in JCS made it a perfect choice for the object-caching requirement in my web portal project.
Web Portal Caching Framework using JCS
Objectives of the Caching Framework
Before I started designing my object-caching framework, I made a list of objectives that the new framework needed to accomplish:
- Faster access to frequently used data in the web portal application.
- Grouping of similar types of objects in the cache.
- Configurable cache management so I can modify cache parameters declaratively rather than programmatically.
- Flexible and extensible framework so I can switch to any third-party caching API in the future.
- Ability to generate statistics to monitor the effectiveness of caching and application-performance improvement as a result of data caching.
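Three of these objectives (grouping, a swappable caching API, and statistics) can be illustrated with a small provider-neutral facade. The sketch below is an assumption for illustration only: PortalCache, InMemoryPortalCache, and hitRatio are hypothetical names, and the in-memory implementation is a placeholder for whichever caching library (JCS or another) sits behind the interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical facade: application code depends on this, not on a specific caching library. */
interface PortalCache {
    Object get(String region, String key);
    void put(String region, String key, Object value);
    void clearRegion(String region);
}

/** Placeholder implementation backed by plain maps; null values are not supported. */
class InMemoryPortalCache implements PortalCache {
    // One inner map per region, so similar objects are grouped and can be cleared together.
    private final Map<String, Map<String, Object>> regions = new ConcurrentHashMap<>();
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    @Override
    public Object get(String region, String key) {
        Map<String, Object> entries = regions.get(region);
        Object value = (entries == null) ? null : entries.get(key);
        if (value == null) {
            misses.incrementAndGet();
        } else {
            hits.incrementAndGet();
        }
        return value;
    }

    @Override
    public void put(String region, String key, Object value) {
        regions.computeIfAbsent(region, r -> new ConcurrentHashMap<>()).put(key, value);
    }

    @Override
    public void clearRegion(String region) {
        regions.remove(region);
    }

    /** Fraction of lookups served from the cache; 0.0 before any lookup has happened. */
    double hitRatio() {
        long h = hits.get();
        long total = h + misses.get();
        return total == 0 ? 0.0 : (double) h / total;
    }

    public static void main(String[] args) {
        InMemoryPortalCache cache = new InMemoryPortalCache();
        cache.put("lookupData", "stateList", "AL, AK, AZ");
        cache.get("lookupData", "stateList");    // hit
        cache.get("lookupData", "productList");  // miss
        System.out.println("hit ratio: " + cache.hitRatio());
    }
}
```

Because callers see only the PortalCache interface, replacing the placeholder with an adapter around a third-party cache would not touch application code. The declarative-configuration objective is not covered here; it is left to the underlying library's configuration.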
Installation and Configuration
It is fairly simple to install and configure JCS in a web application. JCS can be installed by downloading the .zip file from the Jakarta Turbine web site, extracting the .zip file contents to a temp directory, and copying the JCS .jar file (jcs-1.0-dev.jar) into the servlet container's common directory (I used Tomcat as the servlet container in my web application, where the common directory is %TOMCAT_HOME%\common\lib on Windows or $TOMCAT_HOME/common/lib on Unix-like systems). You will also need the commons-collections.jar, commons-lang.jar, and commons-logging.jar files in the web application's classpath to use JCS.
The main elements of the object caching framework are illustrated with the help of UML diagrams in Figures 1 and 2.