EJB 2 Clustering with Application Servers
by Tyler Jewell
The Enterprise JavaBeans (EJB) component model is becoming the component framework of choice for enterprise development among Java developers. The EJB specification goes to great lengths to define characteristics that give containers the ability to manage transactions, persistence, environment variables, resource connections, and other infrastructure services.
Missing from the specification, however, is any reference to techniques or standards for instrumenting EJBs as clustered objects. Clustering, which is required for any large-scale system, has been left entirely to the application server vendors to provide as a value-added feature of their systems.
This article will provide a systematic breakdown of the different possibilities that application server vendors may incorporate into their systems to instrument clustered EJBs.
Some Clustering Terminology
Before delving into the analysis, let's define some terms:
Cluster - A loosely coupled group of application servers that collaborate to provide shared access to the services that each server hosts. The cluster aims to provide load balancing of resource requests, high availability of resources, and failover of application logic, which together give the system an overall sense of scalability.
Load Balancing - The ability to alter the location where similar requests are handled. For example, a request for a database connection might be handled on any one of four different servers in a cluster. Deciding which server handles the request is up to a load-balancing algorithm. Load-balancing algorithms can be round-robin, random, weight-based, load-based, or user-definable.
High Availability - A high availability algorithm is a load-balancing algorithm that keeps a service available as long as the system is operational. Typically, a high availability algorithm forwards all requests to a preferred server until that server becomes unavailable, at which point it locates another server that is still operational and forwards requests there.
Failover - The ability for a request that is being serviced to switch over to another server without disruption of the service. The goal of a successful failover is to have the service complete without client interaction. Failover is the most desired clustering feature and also the most difficult for middleware vendors to implement, since reconstructing the current state of the failed system on the failover server is complicated.
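To make the load-balancing definition concrete, here is a minimal Java sketch of two of the algorithms named above, round-robin and weight-based. The LoadBalancer interface, both class names, and the server names are hypothetical and chosen only for illustration; they do not come from any application server's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface: picks the server that should handle the next request.
interface LoadBalancer {
    String choose();
}

// Round-robin: hand requests to the servers in a fixed rotation.
class RoundRobinBalancer implements LoadBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("cluster must contain at least one server");
        }
        this.servers = new ArrayList<>(servers);
    }

    @Override
    public String choose() {
        // floorMod keeps the index valid even after the counter overflows.
        return servers.get(Math.floorMod(counter.getAndIncrement(), servers.size()));
    }
}

// Weight-based: a server with weight 3 is picked about three times as often
// as a server with weight 1.
class WeightedBalancer implements LoadBalancer {
    private final List<String> servers;
    private final int[] weights;
    private final int totalWeight;
    private final Random random;

    WeightedBalancer(List<String> servers, int[] weights, Random random) {
        if (servers.isEmpty() || servers.size() != weights.length) {
            throw new IllegalArgumentException("need exactly one weight per server");
        }
        int total = 0;
        for (int weight : weights) {
            if (weight <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            total += weight;
        }
        this.servers = new ArrayList<>(servers);
        this.weights = weights.clone();
        this.totalWeight = total;
        this.random = random;
    }

    @Override
    public String choose() {
        // Draw a ticket in [0, totalWeight) and walk the weights until it is used up.
        int ticket = random.nextInt(totalWeight);
        for (int i = 0; i < weights.length; i++) {
            ticket -= weights[i];
            if (ticket < 0) {
                return servers.get(i);
            }
        }
        throw new IllegalStateException("unreachable: ticket always falls inside a weight range");
    }
}
```

A purely random balancer is the weight-based one with every weight set to 1. A load-based algorithm would need live measurements from each server, which this sketch does not attempt to model.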
How Application Servers Can Instrument EJBs
Vendors can place clustering logic for an EJB in four locations:
Java Naming and Directory Interface (JNDI) Naming Server
Container
Home Stub
Remote Stub
The JNDI naming server holds a bound home stub. Vendors can insert logic at all of these locations. The home stub and remote stub are the most interesting since both of these objects are downloaded by Java clients and run locally on a remote machine. Vendors can conceivably develop load balancing and failover algorithms for EJBs that aren't even resident in their own server. Given all of the locations at which vendors can instrument some form of load balancing or failover logic, the number of possible combinations is enormous.
Figure 1 provides a graphical representation of the following bullets. Theoretically, here is what a vendor could instrument:
JNDI Naming Server - The JNDI naming server is the initial access point for all clients to retrieve a home stub of a session or entity bean. This doesn't apply to EJB 2.0 message-driven beans, which take a message-oriented middleware approach. To make EJBs highly available, vendors can easily replicate their naming servers across nodes in the cluster. Since the home stubs placed into the naming tree are lightweight and serializable, it is easy for vendors to synchronize the trees in a cluster so that all of the servers provide a single, joint naming tree to their clients. Application servers may choose to make their EJB container available on one server in the cluster or on all servers in the cluster. In the former scenario, the home stub replicated throughout the naming servers has a hard-coded reference to the home skeleton on the host server. In the latter scenario, the replicated home stub can reference the home skeleton on a different node. In Figure 1 (source: BEA Systems, Inc.), this is represented by Node A, replicated naming.
Note that Node A does not encapsulate initial access logic. Replicated naming servers make the home stubs of EJBs highly available to all clients, but distributed systems will still require some form of initial access logic that load balances JNDI InitialContext requests across the different servers in the cluster. Application server vendors typically recommend using DNS round-robin, a proxy server, or a hardware local director. All of these technologies let you represent an entire cluster through a single IP address or DNS name. Each technology internally maintains a table of the IP addresses and servers that are participating in the cluster and routes "new" requests using a load-balancing algorithm. Subsequent requests may be re-routed to the server that handled the initial request, depending upon the type of request and the technology used.
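The replicated naming tree can be pictured with a small in-memory model. The sketch below is not JNDI and not any vendor's implementation: HomeStubRef, ReplicatedNaming, and the node names are hypothetical, and the "replication" is just a loop that copies a binding into every node's map. It shows only the observable behavior described above, namely that a stub bound once can be looked up from any node and still points at the node that hosts the home skeleton.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a home stub: lightweight, serializable, and
// carrying the name of the node that hosts the home skeleton.
final class HomeStubRef implements Serializable {
    private static final long serialVersionUID = 1L;
    final String beanName;
    final String hostNode;

    HomeStubRef(String beanName, String hostNode) {
        this.beanName = beanName;
        this.hostNode = hostNode;
    }
}

// Models a cluster in which every node keeps its own copy of the naming tree.
class ReplicatedNaming {
    private final Map<String, Map<String, HomeStubRef>> treesByNode = new LinkedHashMap<>();

    // A node that joins late starts with a copy of the bindings already in the cluster.
    void addNode(String node) {
        Map<String, HomeStubRef> tree = new HashMap<>();
        if (!treesByNode.isEmpty()) {
            tree.putAll(treesByNode.values().iterator().next());
        }
        treesByNode.put(node, tree);
    }

    // Binding a stub replicates it into the tree of every node.
    void bind(String jndiName, HomeStubRef stub) {
        for (Map<String, HomeStubRef> tree : treesByNode.values()) {
            tree.put(jndiName, stub);
        }
    }

    // A client may ask any node; returns null if the name is not bound.
    HomeStubRef lookup(String node, String jndiName) {
        Map<String, HomeStubRef> tree = treesByNode.get(node);
        if (tree == null) {
            throw new IllegalArgumentException("unknown node: " + node);
        }
        return tree.get(jndiName);
    }
}
```

Which node a client asks first is still decided outside this model, by the DNS round-robin, proxy, or local director mentioned above.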
Container - An application server vendor can provide some load balancing and failover logic directly within the container. For instance, if a container on one server is burdened with requests for a particular type of stateful session bean, the container may be able to forward the request to its counterpart located on another server. For example, if a ShoppingCart stateful session bean container has filled up its cache and is constantly activating and passivating EJBs to and from secondary storage, it might be advantageous for the container to send all create(...) invocations to another container on a different server that hasn't reached its cache limit. When the container's burden has been reduced, it can continue servicing new requests. In Figure 1, Node B, replicated beans, represents this.
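As a rough illustration of that forwarding decision, the Java sketch below models a container with a fixed-size cache that hands create requests to a peer once its own cache is full. BeanContainer, its methods, and the cache limits are hypothetical names invented for this example; a real container would also passivate and activate instances, which is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container with a fixed-size cache of active bean instances.
class BeanContainer {
    private final String serverName;
    private final int cacheLimit;
    private final List<String> activeBeans = new ArrayList<>();
    private BeanContainer peer; // counterpart on another server, may be null

    BeanContainer(String serverName, int cacheLimit) {
        this.serverName = serverName;
        this.cacheLimit = cacheLimit;
    }

    void setPeer(BeanContainer peer) {
        this.peer = peer;
    }

    boolean hasRoom() {
        return activeBeans.size() < cacheLimit;
    }

    // Creates the bean locally unless the local cache is full and the peer
    // still has room. Returns the name of the server that hosts the new bean.
    String create(String beanId) {
        if (!hasRoom() && peer != null && peer.hasRoom()) {
            // The peer has room, so its own create() will not forward again.
            return peer.create(beanId);
        }
        // Either there is room, or every container is full and this one
        // must cope (a real container would passivate an instance here).
        activeBeans.add(beanId);
        return serverName;
    }

    // Removing a bean frees cache space, so later creates stay local again.
    void remove(String beanId) {
        activeBeans.remove(beanId);
    }
}
```

With two containers of limit 2 that name each other as peers, the third create on the first server lands on the second, and once a bean is removed the first server takes new requests again.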
Given this, some containers can provide minimal failover capability. When a stateful session bean is created, a backup copy can be placed on another server in the same cluster. The backup copy isn't used unless the primary fails, at which point the backup becomes the primary and nominates another backup. Every time a transaction on a stateful session bean commits, the bean is synchronized with its backup to ensure that both locations are current. If the container ever has a system failure and loses the primary, the remote stub of the bean fails over to the secondary server and uses the instance that is located there. Stateful session bean replication is a highly desirable feature since it provides non-interruptible access to some business logic. It is not designed, however, to provide persistent access to data, since a simultaneous failure of the primary and backup servers cannot be recovered from. Persistent data should always be managed by entity beans. Node C, request forwarding, in Figure 1 represents this.
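The primary and backup scheme can be sketched in a few lines of Java. The class below is a toy model under stated assumptions: ReplicatedSessionBean and its method names are hypothetical, the bean state is just a list of cart items, and "synchronizing" is a copy made at commit time. The consequence that uncommitted changes are lost on failover is an inference from the commit-time synchronization described above, not a statement about any particular product.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of one stateful session bean with a primary and a backup copy.
class ReplicatedSessionBean {
    private String primaryServer;
    private String backupServer;
    private List<String> primaryState = new ArrayList<>();
    private List<String> backupState = new ArrayList<>();

    ReplicatedSessionBean(String primaryServer, String backupServer) {
        this.primaryServer = primaryServer;
        this.backupServer = backupServer;
    }

    // Business method: changes only the primary copy.
    void addItem(String item) {
        primaryState.add(item);
    }

    // On commit, the backup is brought up to date with the primary.
    void commit() {
        backupState = new ArrayList<>(primaryState);
    }

    // The primary is lost: the backup becomes the primary and a new backup
    // is nominated and seeded with the surviving state.
    void failPrimary(String newBackupServer) {
        primaryServer = backupServer;
        primaryState = backupState;
        backupServer = newBackupServer;
        backupState = new ArrayList<>(primaryState);
    }

    String primaryServer() {
        return primaryServer;
    }

    String backupServer() {
        return backupServer;
    }

    // State as seen by a client through the current primary.
    List<String> items() {
        return Collections.unmodifiableList(new ArrayList<>(primaryState));
    }
}
```

In this model a client that adds an item, commits, adds a second item, and then loses the primary sees only the first item after failover, because the backup was last synchronized at the commit.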
Home Stub - This object is the first object accessed by remote clients and runs locally on a client's virtual machine. Since stubs and skeletons are typically generated by EJB compilers at deployment time, the underlying logic in a stub can be vendor-specific. Vendors can instrument some method-level load balancing and failover schemes directly in the stub. Since the primary action of the home stub is to create or load beans on a remote server, the server in which a bean is ultimately created is not important. Effectively, every findXXX(...) method could load balance requests to a different home skeleton in the cluster. Node D in Figure 1 represents this.
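A home stub with this kind of logic might look roughly like the following Java sketch. Everything here is hypothetical: HomeSkeleton stands in for the server-side object, ClusteredHomeStub for the generated stub, and an IllegalStateException stands in for whatever remote failure a real stub would see. The stub rotates through the skeletons and, if one is unreachable, tries the next. The sketch assumes a failed call did not partially complete on the server, which a real stub could not take for granted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical server-side home skeleton. Returns a description of where the
// bean was created, or throws IllegalStateException if the server is down.
interface HomeSkeleton {
    String create(String beanId);
}

// Hypothetical client-side home stub that load balances and fails over.
class ClusteredHomeStub {
    private final List<HomeSkeleton> skeletons;
    private int next = 0;

    ClusteredHomeStub(List<HomeSkeleton> skeletons) {
        if (skeletons.isEmpty()) {
            throw new IllegalArgumentException("need at least one home skeleton");
        }
        this.skeletons = new ArrayList<>(skeletons);
    }

    String create(String beanId) {
        IllegalStateException last = null;
        // Try each skeleton at most once, starting where the rotation left off.
        for (int attempt = 0; attempt < skeletons.size(); attempt++) {
            HomeSkeleton target = skeletons.get(next);
            next = (next + 1) % skeletons.size();
            try {
                return target.create(beanId);
            } catch (IllegalStateException e) {
                last = e; // remember the failure and fall through to the next server
            }
        }
        throw new IllegalStateException("no home skeleton in the cluster is reachable", last);
    }
}
```

Because the rotation index advances on every attempt, a server that is down is skipped for the current call but tried again on a later one, so it rejoins the rotation once it recovers.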
Remote Stub - This object is instantiated by the home skeleton and returned to the client. This object can perform the same types of load balancing and failover that a home stub can, but vendors have to be careful about when they choose to do so. For instance, if a client has created an entity bean, it is likely that the entity bean will only exist on one server in the cluster. It is too expensive to have the contents of the same primary key active on multiple servers in the cluster if there aren't multiple clients all requesting the same entity bean. A remote stub that accesses an entity bean cannot freely load balance its requests to other servers in the cluster since the entity bean will only be active on a single server. Essentially, the remote stub is "pinned" to the server that it came from and is not free to load balance at will. Node E in Figure 1 represents this.
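The difference between a pinned stub and one that is free to load balance can be shown with a small routing sketch. RemoteStubRouter and its factory methods are hypothetical names. The pinned case follows the entity bean example above; the load-balanced case is an assumption about a bean that any server could serve, included only for contrast.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical routing logic inside a remote stub.
class RemoteStubRouter {
    private final List<String> servers;
    private int next = 0;

    private RemoteStubRouter(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Entity bean case: the instance is active on one server only,
    // so every call must go back to that server.
    static RemoteStubRouter pinnedTo(String server) {
        return new RemoteStubRouter(Collections.singletonList(server));
    }

    // Assumed case of a bean that any server can serve: rotate freely.
    static RemoteStubRouter loadBalancedAcross(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        return new RemoteStubRouter(servers);
    }

    // Returns the server that should receive the next method invocation.
    String route() {
        String target = servers.get(next);
        next = (next + 1) % servers.size();
        return target;
    }
}
```

A pinned stub is simply the degenerate rotation over a single server, which is why it cannot spread load no matter how many nodes the cluster has.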
Making Heads or Tails Out of All This
There are many ways to instrument and implement clustered EJBs. By now, you must be thinking, "Holy clustering, Batman! How do I know what to use and where?" The answer to this question lies within the capabilities of any single application server. No one application server vendor supports all of these configurations, but this article should give you a starting point to use when assessing the capabilities of different systems. Also, make sure you consult with your vendor to receive a list of their best practices for EJB clustering. Most vendors provide them as value adds to their clients.
This discussion is only a start on the many facets of EJB clustering. A more detailed and interesting discussion would analyze the scenarios where it is permissible and (more importantly) impermissible for remote stubs to perform load balancing and failover freely. The type of EJB and its behavior directly determine how much clustering capability a remote stub can have.
In my next article, I will investigate the nuances of session bean clustering for stateful and stateless session beans. The article will analyze remote stub load balancing, discuss the differences between idempotent and non-idempotent methods, and introduce the concept of user-definable, programmatic stub load balancing.
Tyler Jewell, Director, Technical Evangelism, BEA Systems. Tyler oversees BEA's technology evangelism efforts that are focused on driving early adoption of strategic BEA technologies into the ISV and developer community.