How to Publish Multiple Websites Using a Single Tomcat Web Application
by Satya Komatineni
Knowledge Folders is a web application that holds and displays content for multiple users. I had been wondering if I could expose the content from this single web application as multiple websites with their own domain names. Could I use virtual hosts to do this? Or would I need to use reverse proxies? How and where would I register domain names? What entries would I need to make in Tomcat configuration files? How would I handle emails for these independent domains? What else would I need to do in my web application? What would the end result look like?
After a few weeks of effort, I was able to expose Knowledge Folders as multiple websites with their own domain names. It turned out I didn't need to go to reverse proxies for now, and could use virtual hosts instead. I was able to get my multiple domain names from GoDaddy.com. I was also able to use Tomcat host/alias settings to effectively route traffic from all of these domains to the same web app. Using the index.jsp of the web app, I was able to separate the content between different domains. After all of this effort, I ended up with a way to publish online websites very quickly and expose them as their own domains. The resulting websites have a number of features that static websites can't accomplish easily.
I wrote Knowledge Folders a few years ago as a workaround for keeping my notes online. I used to keep these "rapid notes" using Microsoft Outlook and a few macros. In particular, the ability to file these notes into classified folders appealed to me. When I took that application to the web, it was natural for me to make it a multiuser system that allowed a number of users to manage their own notes and perhaps share them as well.
Originally, these notes were various SQL scripts that ran on a database. The initial release even had an execution engine to run these notes against a target database and return the results. I abandoned this later to focus on the transformation to come.
At about the same time, I was in search of something to document my open source tool Aspire/J2EE. I was looking more along the lines of wikis and weblogs. Unhappy with what I found, I changed Knowledge Folders to focus on documenting open source software. At this point, Knowledge Folders was basically a collection of accounts (or users), files, and folders in which knowledge was created and classified (hence the name).
Later that year, I introduced the idea of master pages (background HTML, similar to tiles) to give a facelift and proper presentation to the content. This took Knowledge Folders toward a content-management system, where content could be portrayed with appropriate backgrounds.
Later, I added some collaboration features and task management for individual users.
During this time, there was a single domain through which users accessed their accounts. Although not difficult, it was awkward to pass the account URLs for individual users to their friends or any intended audience. I wanted to expose each user as his/her own domain.
My Original Thought Involving Reverse Proxy Servers
Initially, the problem sounded like a case where I could have the individual domains pointing to a "piece of code" on the server that would in turn read the content from a single web app. This intermediate piece of code would somehow associate the incoming domain name to an account in Knowledge Folders and use Knowledge Folders as a source/sink to read/write web pages. In essence, it would be working like a proxy to the actual Knowledge Folders.
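The association between an incoming domain name and an account can be sketched in a few lines of Java. This is only an illustration of the idea, not the code used in Knowledge Folders: the class name `DomainAccountResolver` and the account names are hypothetical, and the lookup assumes the host name arrives in the form a browser sends in the `Host` header (optionally with a port).

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: associates the host name of a request with an account.
final class DomainAccountResolver {
    // Illustrative data only; the account names are assumptions.
    private static final Map<String, String> ACCOUNT_BY_DOMAIN = Map.of(
            "satyakomatineni.com", "satya",
            "kavithakomatineni.com", "kavitha");

    // Returns the account bound to the given Host header value, if any.
    static Optional<String> accountFor(String hostHeader) {
        if (hostHeader == null || hostHeader.isBlank()) {
            return Optional.empty();
        }
        String host = hostHeader.trim().toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon); // drop an optional ":port" suffix
        }
        if (host.startsWith("www.")) {
            host = host.substring(4);        // treat www.x.com and x.com alike
        }
        return Optional.ofNullable(ACCOUNT_BY_DOMAIN.get(host));
    }

    public static void main(String[] args) {
        // Unknown domains fall back to a placeholder here.
        System.out.println(accountFor("www.satyakomatineni.com:8080").orElse("(no account)"));
        System.out.println(accountFor("unknown.example").orElse("(no account)"));
    }
}
```

A proxy-style intermediary would run a lookup like this on every request and then fetch the matching account's pages from the single web app.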
Wikipedia's definition of Reverse Proxy implied it could be used for this purpose. (In fact, this may turn out to be a good solution in the future if I were to segregate content further. Peter Sommerlad's paper Reverse Proxy Patterns [PDF, 328 KB] might throw some light on the possibilities.) I also hoped to use the reverse proxy facility of Apache to accomplish my goal.
I have listed these links on reverse proxies for further research; the key elements are summarized briefly here.
What Are Reverse Proxies?
Reverse proxies are web servers that sit in front of other web servers, possibly internal to a corporate network. This indirection is useful in a number of situations. These proxy servers typically read or intercept communications from a browser and relay them, rewritten as needed, to the back-end servers. Users are usually exposed only to the domain names of these reverse proxy servers and not to the back-end servers. The reverse proxy servers will, in turn, call some internal servers to fulfill the request. They typically break the incoming IP pipe and open a separate pipe to the target servers. As a result, implementing a proxy server reliably is not easy because it must behave like a genuine target server, while also truly intercepting all of the data and HTTP headers.
What Are Reverse Proxies Used For?
Reverse proxies are routinely used to offload SSL processing. In this scenario, https traffic is routed to a reverse proxy server. The reverse proxy server converts the traffic to http and then forwards the request to an internal HTTP server. In this approach, a single reverse proxy server can offload SSL (and hence save certificates) for multiple back-end servers. Nevertheless, this approach sometimes poses issues for sendRedirect on the target server. When sendRedirect is used, a relative URL is sometimes translated into an absolute URL using the wrong scheme (http instead of https), because the target server only ever sees the http request. Fortunately, this can be resolved by rewriting sendRedirect.
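One way to picture the fix is a small helper that builds the absolute redirect URL from the scheme the browser actually used rather than the scheme the back-end server saw. The sketch below is an assumption-laden illustration, not the article's implementation: `RedirectUrlBuilder` is a hypothetical name, and it assumes the reverse proxy passes the original scheme in an `X-Forwarded-Proto` request header, which many proxies can be configured to do.

```java
import java.util.Locale;

// Hypothetical helper for building redirect targets behind an SSL-offloading proxy.
final class RedirectUrlBuilder {

    // forwardedProto: value of the X-Forwarded-Proto header (assumed to be set
    //                 by the proxy), or null when the header is absent.
    // requestScheme:  scheme the back-end server itself saw, typically "http".
    // host:           external host name the browser used.
    // path:           absolute path of the redirect target, starting with "/".
    static String absoluteRedirect(String forwardedProto, String requestScheme,
                                   String host, String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("path must start with '/'");
        }
        boolean hasForwarded = forwardedProto != null && !forwardedProto.isBlank();
        String scheme = hasForwarded
                ? forwardedProto.trim().toLowerCase(Locale.ROOT)
                : requestScheme;
        return scheme + "://" + host + path;
    }

    public static void main(String[] args) {
        // Behind the proxy: the browser used https although the back end saw http.
        System.out.println(absoluteRedirect("https", "http", "www.example.com", "/akc/login"));
        // No proxy header: fall back to the scheme of the request itself.
        System.out.println(absoluteRedirect(null, "http", "www.example.com", "/akc/login"));
    }
}
```

In a servlet application, a response wrapper that overrides sendRedirect could call a helper like this before delegating to the real response.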
Reverse proxies can also be used to expose a single domain for multiple web applications on the back end. Each back-end server can be mapped to its own path under the main domain. There are also approaches that provide role-based security by using proxy servers as gatekeepers that monitor every URL.
Implications for Web Application Development in the Face of Reverse Proxies
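The path-based mapping can be sketched as a lookup from a path prefix to a back-end base URL. The class `PathRouter`, the prefixes, and the internal host names below are all made up for illustration; a real reverse proxy would express the same idea in its own configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical router: maps a path under the public domain to a back-end URL.
final class PathRouter {
    private final Map<String, String> backendByPrefix = new LinkedHashMap<>();

    // Registers a path prefix (for example "/shop") for a back-end base URL.
    PathRouter add(String prefix, String backendBase) {
        backendByPrefix.put(prefix, backendBase);
        return this;
    }

    // Returns the back-end URL for an external path, or null if no prefix matches.
    // The longest matching prefix wins, and a prefix matches only on a path
    // segment boundary, so "/shop" does not capture "/shopping".
    String route(String path) {
        String best = null;
        for (String prefix : backendByPrefix.keySet()) {
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            return null;
        }
        return backendByPrefix.get(best) + path.substring(best.length());
    }

    public static void main(String[] args) {
        PathRouter router = new PathRouter()
                .add("/shop", "http://shop.internal:8080")   // assumed back ends
                .add("/wiki", "http://wiki.internal:8080");
        System.out.println(router.route("/shop/cart"));
        System.out.println(router.route("/wiki"));
        System.out.println(router.route("/shopping"));
    }
}
```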
It is imperative for all of the URLs to be relative for reverse proxies to work well. This is because the reverse proxy serves the page under a different (and typically external) name, while the internal names are unknown to the outside world. So, URLs on web pages delivered by back-end servers should typically be relative, carrying no scheme or host name.
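The effect is easy to see with `java.net.URI`, which resolves a link against the address of the page it appears on, much as a browser does. In this sketch the external and internal host names are invented for illustration: a relative link stays on whichever host delivered the page, while an absolute link leaks the internal name.

```java
import java.net.URI;

// Demonstrates why relative links survive a reverse proxy and absolute ones do not.
final class RelativeLinkDemo {

    // Resolves a link found on a page against that page's own URL.
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        // The browser sees the page under the proxy's external name (assumed).
        String externalPage = "https://www.example.com/akc/index.jsp";

        // Relative links resolve against the external name.
        System.out.println(resolve(externalPage, "page2.html"));
        System.out.println(resolve(externalPage, "/akc/page2.html"));

        // An absolute link written by the back end keeps the internal name,
        // which the outside world cannot reach.
        System.out.println(resolve(externalPage, "http://internal-host:8080/akc/page2.html"));
    }
}
```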
What Are Virtual Hosts?
Although a solution involving reverse proxies seemed possible, I found out that the hosting facility at Indent that I use hosts my web app on Tomcat, not Apache. After some initial research, I couldn't figure out whether Tomcat supported reverse proxies, so my exploration led me to virtual hosts--maybe they could solve the problem.
A virtual host allows multiple domain names for a given IP address. In other words, a given IP address can have any number of host names. When requests are received on behalf of these host names, a web server can decide to deliver content from different root directories, or different web apps in the case of Tomcat.
For example, you could have an arrangement where
- www.host1.com points to /webapp1
- www.host2.com points to /webapp1
- www.host3.com points to /webapp2
In Tomcat, the host names and web apps are bound in a many-to-many relationship. There will be one host entry for each host. When multiple host names are bound to the same web app, one can use Tomcat's aliases facility.
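Conceptually, the server keeps a table from host names to web apps and consults it for every request. The sketch below models the example arrangement above, with www.host2.com registered as an alias of www.host1.com. It is a simplified model for illustration, not Tomcat's actual implementation, and `VirtualHostTable` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified model of name-based virtual hosting with aliases.
final class VirtualHostTable {
    private final Map<String, String> webAppByHostName = new HashMap<>();

    // Registers one host entry; every alias is bound to the same web app.
    VirtualHostTable addHost(String name, String webApp, String... aliases) {
        webAppByHostName.put(normalize(name), webApp);
        for (String alias : aliases) {
            webAppByHostName.put(normalize(alias), webApp);
        }
        return this;
    }

    // Returns the web app bound to the host name, or null if it is unknown.
    String webAppFor(String hostName) {
        return webAppByHostName.get(normalize(hostName));
    }

    // Host names are case-insensitive, so compare them in lower case.
    private static String normalize(String hostName) {
        return hostName.trim().toLowerCase(Locale.ROOT);
    }

    // The arrangement from the example list above.
    static VirtualHostTable example() {
        return new VirtualHostTable()
                .addHost("www.host1.com", "/webapp1", "www.host2.com")
                .addHost("www.host3.com", "/webapp2");
    }

    public static void main(String[] args) {
        VirtualHostTable table = example();
        System.out.println(table.webAppFor("www.host1.com"));
        System.out.println(table.webAppFor("www.host2.com"));
        System.out.println(table.webAppFor("www.host3.com"));
    }
}
```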
Examples of Virtual Hosts in Tomcat
Based on this, here is a sample setup for Knowledge Folders:
<Host name="www.knowledgefolders.com" appBase="D:/webpage_demos/akc"
      unpackWARs="true" autoDeploy="true"
      xmlValidation="false" xmlNamespaceAware="false">
  <Alias>knowledgefolders.com</Alias>
  <Alias>www.knowledgefolders.net</Alias>
  <Alias>knowledgefolders.net</Alias>
  <Alias>www.knowledgefolders.org</Alias>
  <Alias>knowledgefolders.org</Alias>
  <Alias>www.satyakomatineni.com</Alias>
  <Alias>www.kavithakomatineni.com</Alias>
  <Context path="" docBase="D:/webpage_demos/akc" debug="0" reloadable="false"/>
  <Context path="/akc" docBase="D:/webpage_demos/akc" debug="0" reloadable="false"/>
</Host>
Notice how all of the following host names point to the same web app, akc (which was the previous name for Knowledge Folders).