Using Jini to Build a Catastrophe-Resistant System
by Dave Sag
The events of September 11 were a wake-up call to software developers the world over. Of the many large firms that held critical data or even entire data centers in the World Trade Center towers, few had timely offsite backups in place. One firm even did its offsite backup from Tower 1 into Tower 2, reasoning -- like the people who decided to only insure one tower -- that the chance of both towers vanishing was slim in the extreme. Who would have guessed such a thing could ever happen?
Once the shock of the event had passed, the implications for the work we at Pronoic Ltd. had been doing became evident. As I will go on to discuss, we had been very focused on using Jini to build ultra-high availability applications for the insurance industry. Well, Sept. 11 killed the insurance sector, but brought the advantages of self-healing, distributed applications into clear focus. (For more on self-healing, distributed systems see the O'Reilly Network's interview with IBM's Robert Morris.)
Early in 2001, I wrote an article, "Jini as an Enterprise Solution," that outlined some of Pronoic's experiences in developing a simple XML application server that used Apache Tomcat, Apache Cocoon, a JavaSpace, and consequently, Jini. We called this app server Crudlet, standing for "create, retrieve, update, delete, lifecycle, exist, and template." Crudlet grew out of our need for a simple Web interface to objects in a JavaSpace. Of course, that is not enough, and the Crudlet app server grew to include a set of agents that ran independently and used Jini's lookup and discovery protocol to discover the space and perform simple functions.
Eventually, the Crudlet work turned into something we call the Corporate Operating System (COS), which I'll describe later in this article. But first, let's look at the original Crudlet. Logically, the flow went something like this. An underwriter (in our case) would enter the details of a risk they wished to shop out. We would capture the form elements, which, using our Crudlet tag library, would be parsed by Cocoon, and values then inserted into JavaBeans. These beans would fire property change events that would be picked up by a virtual Bean box that populated a primitive JavaSpace entry or entries, which would then be written to the space. Agents would race to the new entry looking for their signature flags. Herein lies the key design flaw in Crudlet 1.0: Each primitive was coupled to expected services by a series of internal flags.
See MailEntry for an example. It has an mMailableFlag boolean that is used by the mailer service ("pat the postman") to match against. JavaSpaces only work by associative mapping; one side effect of this is that you can't just match against an interface -- you need a publicly accessible field to search on, since you can't instantiate an interface to make a template.
It's not such a good design pattern, since adding a new service that should act on an entry means you need to update the entry, which in most cases means flushing the system of all the old entries. During development this can happen a lot.
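To make the coupling concrete, here is a minimal sketch of template matching on public fields. It is plain Java that only imitates the matching rule (a null field in the template is a wildcard, any other field must be equal); it does not use the JavaSpaces API. The MailEntry shown here is a hypothetical stand-in: only the mMailableFlag name comes from Crudlet, and the recipient field and the matches helper are invented for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Plain-Java imitation of space-style template matching; no Jini classes are used.
class TemplateMatchSketch {

    // Hypothetical stand-in for Crudlet's MailEntry.
    static class MailEntry {
        public String recipient;       // invented field, for illustration only
        public Boolean mMailableFlag;  // service-specific flag the mailer matches on

        MailEntry(String recipient, Boolean mMailableFlag) {
            this.recipient = recipient;
            this.mMailableFlag = mMailableFlag;
        }
    }

    // A candidate matches when it is of the template's type and every
    // non-null public field of the template equals the candidate's field.
    static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) {
            return false;
        }
        try {
            for (Field field : template.getClass().getFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue;
                }
                Object wanted = field.get(template);
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The mailer's template: any recipient, but the flag must be set.
        MailEntry template = new MailEntry(null, Boolean.TRUE);
        System.out.println(matches(template, new MailEntry("underwriter@example.com", Boolean.TRUE)));
        System.out.println(matches(template, new MailEntry("underwriter@example.com", Boolean.FALSE)));
    }
}
```

The weakness is visible in the entry class itself: every service that wants to react to a MailEntry needs its own public field there, so adding a service changes the entry.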
We were at the Jini Community Conference in Amsterdam in December 2000, arguing this very point over a few beers, when suddenly the solution became obvious. We devised a scheme whereby we use ultra-lightweight ServiceFlag entries to notify services of the entry they need to act on. This idea quickly led us to a napkin sketch design of a system where, in the case of a create (entry) being called, we'd write a flag into the space. This flag would be collected by a distribution service whose job it is to know what services need to scoop up this newly created entry. This distribution service would simply keep an internal hashtable of known ensemble configurations obtained on startup from an XML file. This lets whole ensembles run asynchronously and yet remain coordinated and very structured. Persistence is maintained via transactions and the JavaSpace acts as the focus of coordination.
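The napkin design can be sketched roughly as follows. This is an illustration under assumptions, not the code we wrote: the class and method names (DistributionSketch, register, distribute) and the fields of ServiceFlag are invented, the ensemble table is filled in by hand rather than read from an XML file, and the space itself is left out, so the flags are simply returned to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of the flag-based distribution idea; no Jini classes are used.
class DistributionSketch {

    // Ultra-lightweight flag: which service should act on which entry.
    // Field names are hypothetical.
    static class ServiceFlag {
        public final String serviceName;
        public final String entryId;

        ServiceFlag(String serviceName, String entryId) {
            this.serviceName = serviceName;
            this.entryId = entryId;
        }
    }

    // Entry type -> services that must act on a newly created entry of that type.
    private final Map<String, List<String>> ensembles = new HashMap<>();

    // In the design described above this table would be loaded from XML at startup.
    void register(String entryType, String... services) {
        ensembles.put(entryType, new ArrayList<>(Arrays.asList(services)));
    }

    // Called when a "created" flag is collected: fan out one flag per interested service.
    List<ServiceFlag> distribute(String entryType, String entryId) {
        List<ServiceFlag> flags = new ArrayList<>();
        for (String service : ensembles.getOrDefault(entryType, Collections.emptyList())) {
            flags.add(new ServiceFlag(service, entryId));
        }
        return flags;
    }

    public static void main(String[] args) {
        DistributionSketch distributor = new DistributionSketch();
        distributor.register("MailEntry", "mailer", "auditor");
        for (ServiceFlag flag : distributor.distribute("MailEntry", "entry-1")) {
            System.out.println(flag.serviceName + " should act on " + flag.entryId);
        }
    }
}
```

The point of the indirection is that the entry no longer carries per-service fields: adding a service means adding a row to the table, not changing MailEntry.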
So Many Issues
This leads to the second issue: the Crudlet doesn't have a well-defined set of actual interfaces. This was addressed to some extent in the last build of the Crudlet and Tennis source code before the effects of Sept. 11 finally closed down the Risk2Risk project development team. However, it rapidly became evident that we needed a concise interface to the entire ensemble.
Thirdly, the XML tag library itself required urgent review, in light of the significant update to the Cocoon XML processing system (from version 1 to version 2), and the emergence of new utilities, such as JAXB and Castor, for converting Java objects to XML and back again. By embracing the new, retrofitting the old, and following good eXtreme Programming techniques, we were bound to scrap a majority of our code in favor of a new architecture.
Lastly, although Crudlet and Tennis had been open-sourced, we as developers had signed away the rights to the use of the name and as such, in combination with the realization of the many faults in the architecture, we decided to scrap it all and start again, building on what we had learned.
A Night with Rio
At the Jini nerd-off Discovery 01, held in a small hotel somewhere in Princeton, N.J., we met Dennis Reedy and Jim Clarke from Sun's professional services team. We were impressed with Dennis' presentation on their Rio project. At Dennis' invitation, Phil Blythe from Pronoic gave a presentation about Crudlet at JavaOne 2001. They huddled for a few days in a hotel room, where Dennis filled Phil's head with radical ideas.
Rio's clever idea is that you specify the services you need in a structure called an operational string. The operational string is an object graph that contains knowledge of what it takes to provision and instantiate services. You can chain operational strings together, creating graphs of inter-related services that are then delivered through the network, offering a capability that an enterprise or organization provides. Operational strings can be created as a bunch of nested XML documents and fed into a service called a provisioning monitor.
The provisioning monitor provides a ServiceUI -- a UI that lets you load the operational string. It's neat and impresses suits and techies alike because of the funky graphics. Rio provides a structure called a CyberNode, which is like a VM with quality-of-service tags and Jini's ability to automatically link up with other services. The provisioning monitor loads the operational string (an XML file), which tells it which JiniServiceBeans (JSBs) to instantiate and allocate to the various CyberNodes. The provisioning monitor is also charged with the responsibility of watching the JSBs it has provisioned, and reprovisioning any that fall over for whatever reason. Via ServiceUI, the provisioning monitor, and in fact, any of Rio's ensemble visualization tools, you can call up a UI for any of the services running anywhere. Because CyberNodes have associated QoS information, you can intelligently route services to machines very efficiently and get all the automagic load-balancing and failover you need. JSBs take most of the work out of writing Jini services, leaving you with only the core service code. They can be persistent because the CyberNode itself is an activatable, persistent service.
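The allocate-and-watch behavior described above can be pictured with a small sketch. This is not Rio's API: ProvisionSketch, Node, provision, nodeFailed, and locate are invented names, a single capacity number stands in for real QoS information, and node failure is reported by hand instead of being detected over the network.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a monitor that places services on nodes and re-places them on failure.
class ProvisionSketch {

    // Hypothetical stand-in for a CyberNode; capacity is a crude QoS substitute.
    static class Node {
        final String name;
        final int capacity;
        final List<String> services = new ArrayList<>();
        boolean alive = true;

        Node(String name, int capacity) {
            this.name = name;
            this.capacity = capacity;
        }

        int free() {
            return capacity - services.size();
        }
    }

    private final List<Node> nodes = new ArrayList<>();

    void addNode(String name, int capacity) {
        nodes.add(new Node(name, capacity));
    }

    // Place a service on the live node with the most free slots; returns the node's name.
    String provision(String service) {
        Node best = null;
        for (Node node : nodes) {
            if (node.alive && node.free() > 0 && (best == null || node.free() > best.free())) {
                best = node;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no capacity for " + service);
        }
        best.services.add(service);
        return best.name;
    }

    // Mark a node as lost and reprovision everything it was running.
    void nodeFailed(String name) {
        for (Node node : nodes) {
            if (node.name.equals(name) && node.alive) {
                node.alive = false;
                List<String> orphans = new ArrayList<>(node.services);
                node.services.clear();
                for (String service : orphans) {
                    provision(service);
                }
            }
        }
    }

    // Name of the node currently running the service, or null if it is not running.
    String locate(String service) {
        for (Node node : nodes) {
            if (node.alive && node.services.contains(service)) {
                return node.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ProvisionSketch monitor = new ProvisionSketch();
        monitor.addNode("node-a", 2);
        monitor.addNode("node-b", 2);
        monitor.provision("mailer");
        monitor.provision("auditor");
        monitor.nodeFailed("node-a");
        System.out.println("mailer now runs on " + monitor.locate("mailer"));
    }
}
```

In this toy version the services are just names; in Rio the unit being moved is a JSB, and the description of what to run comes from the operational string.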
Rio offers additional benefits to the ensemble application developer. It provides a peer-to-peer event model, watchable objects, and resource pools such as thread pools or connection pools. It talks to JavaSpaces nicely with its own space implementation and also provides the capability for a JSB to implement support for both RMI/JRMP and RMI/IIOP objects. JSBs contain a Servant class providing support for CORBA, using the Portable Object Adapter (POA).
(Now, since I have never gone down the CORBA route, when I read that Rio's JSBs, and hence COS' JSBs, contain such a Servant, I figured I should find out what one is. It turns out that a Servant is a bit like a stupid Jini proxy. Calls to the CORBA objects are routed to the Servant. Anyway, for those connecting to legacy systems, Rio does it. I am so glad I have been living in a Jini world for the last two years. I skipped Cobol at university too.)
Rio also provides a tunnelling service, called Lincoln, that enables dynamic discovery of Jini Lookup Services across networks that are out of multicast range, or do not forward multicast packets.
Announcement and Request packets are tunnelled to either a remote peer that re-multicasts the forwarded packet using the original multicast group, or to a remote subnet whose router supports directed broadcasts. They've also extended Ant and the Buildtool projects to address some of the most common issues with the development of services surrounding the assembly of JAR files, which include service implementation, download, and user interface functionality.
As if that weren't enough, Rio provides Dynamic Web Application Archive (WAR) Support. This means you can attach a WAR as an attribute, describing entry points corresponding to JSP responder types (HTML, XML, WML). This capability includes a controller servlet adhering to the MVC (Model-View-Controller) pattern, focusing on the provisioning of the WAR to the Web container, and providing the navigation to direct requests to the appropriate JSP. With this capability, JSBs can deliver Web capabilities on demand.
Rio provides the following tools:
- A viewer with integrated support for ServiceUIs that provides a user interface to any discovered Jini lookup services.
- An operational string monitor that provides the capability to load, deploy, and view the status of operational strings.
- A launcher that provides the user with a simple tool to control Jini service startup parameters and service instantiation.
- A Java WebStart Rio installer to bootstrap a computer resource with the software required to run individual Rio services.
- An accumulator viewer: a ServiceUI that provides the facility to view watches that have been created for services.
You can download and read up on Rio at jini.org (free registration required). The project is very active and gathering quite a bit of excitement.
Rio's underlying technology provided many of the building blocks we needed to realize our design goals, which by now had been formalized as follows:
- Create an inherently distributable application infrastructure that meets all enterprise criteria: security, interoperability, scalability, robustness, and simplicity.
- Make client/server/service programming as simple and scalable as possible. This means shielding the developer from all issues surrounding the coordination of services (discovery, lookup, failover, etc.) and creating a rapid development environment.
- Create an environment where services can execute commands asynchronously from each other and also from the client that issued the command. This allows time-consuming and computationally-heavy exercises to be queued and consumed by available and competing services.
- Develop an infrastructure where all applications and services are truly decoupled, and where each component can be shut down, modified, and restarted with zero impact on other running components.
- Develop a set of user interface tools to manage and administrate applications running across an ensemble of services.
That is, enable applications that keep running with no loss or corruption of data, even despite the physical loss of one or even several of the actual computers running the application; in other words, applications that are Sept. 11-proof.
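The asynchronous goal in particular (commands queued and consumed by available, competing services) can be illustrated with a short sketch. It uses an in-memory queue and threads from the standard library purely as an assumption for illustration; in the design discussed here the queue would be a JavaSpace and the workers separate Jini services. The class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Competing consumers draining a shared queue of commands.
class CompetingWorkersSketch {

    // Starts `workers` threads that race for queued commands.
    // Returns a map of command -> name of the worker that handled it.
    static Map<String, String> drain(List<String> commands, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(commands);
        Map<String, String> handledBy = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            String workerName = "service-" + i;
            Thread thread = new Thread(() -> {
                String command;
                // poll() removes a command atomically, so each one is handled exactly once.
                while ((command = queue.poll()) != null) {
                    handledBy.put(command, workerName);
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handledBy;
    }

    public static void main(String[] args) {
        Map<String, String> result = drain(Arrays.asList("quote-1", "quote-2", "quote-3"), 2);
        // Which worker wins each command varies from run to run.
        result.forEach((command, worker) -> System.out.println(command + " handled by " + worker));
    }
}
```

The client that queued the commands never names a worker, which is what lets a worker disappear or a new one join without the client noticing.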
We began development of a Rio-based version of the system we had sketched out on a napkin in Amsterdam many months earlier. We dubbed this system the Corporate Operating System (COS). COS builds on Rio to provide an environment where multiple ensemble applications can coexist using shared resources and operate securely in a distributed computing environment. A COS application is a named collection of clients and shared data, together with an ensemble of services, that can be submitted to a resource provision grid for execution.