The Swarmcast Solution
by Richard Koman
In the Gospel According to P2P, peer-to-peer technology plunges the World Wide Wait into the dustbin of history. In the bad old client-server model, popular content becomes less available the more popular it is. That's because hot content -- say, the trailer for "The Fellowship of the Ring" -- is hosted only on the official site (1.7 million downloads on its first day of operation, boasted a press release), which quickly becomes overwhelmed by the sheer number of requests for the content. Eventually, the site stops responding altogether, and the only way to get the content is to wait for the demand to lessen, or to download it from a friend's unofficial FTP site. The mainstream web response is to throw lots of hardware and load-balancing software at the problem, but that takes money and expertise.
The Napster and Gnutella answer to this problem is that everyone who downloads the trailer can become a server of the trailer. In this view, P2P means lots of peers serving lots of content; one of them will probably be available and able to serve content at any given time. The problems here are well documented; once out on these networks, the content is out of the creator's control. Releasing a trailer or a song on these systems means the owner doesn't know how popular the content is, can't collect a fee per download, can't apply log analysis, can't expire the file, and so on. Worst of all, once you've found the source of your file, you're still dependent on their bandwidth, current load, and ISP reliability.
Freenet's answer is to make copies of the file as it bounces from node to node; the next time the file is requested, the closest node with the file -- rather than the content originator -- provides it to the requester. In addition, files are held in the recently-used cache, so more popular content is more available, while less popular content tends to disappear. The problems with Freenet mostly center on the system's anonymity features. Freenet creator Ian Clarke's startup, Uprizer, is trying to commercialize Freenet by dropping anonymity.
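The "recently-used cache" behavior can be illustrated with a few lines of Python. This is only a minimal sketch of a least-recently-used cache, not Freenet's actual datastore; the class name NodeCache and its capacity parameter are made up for the example.

```python
from collections import OrderedDict


class NodeCache:
    """Hypothetical per-node cache: keeps the most recently used files only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the cached file, marking it as recently used, or None."""
        if key not in self.files:
            return None
        self.files.move_to_end(key)
        return self.files[key]

    def put(self, key, data):
        """Store a file; evict the least recently used one if over capacity."""
        self.files[key] = data
        self.files.move_to_end(key)
        if len(self.files) > self.capacity:
            self.files.popitem(last=False)
```

A file that keeps getting requested keeps moving to the "fresh" end of the cache, while a file nobody asks for drifts to the other end and is eventually dropped, which is the popularity effect described above.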
Other companies, like Static and Allcast, apply P2P to streaming media, exploiting peers' unused bandwidth to redistribute streaming media.
Yesterday, OpenCola announced a public beta of Swarmcast -- a system that is strongly P2P but with a fascinating twist. Rather than dealing with whole files or with streaming solutions, Swarmcast uses the P2P network to divide very large files into small (32K) packets. Using a technique called forward error correction (FEC) encoding, Swarmcast randomly requests packets from the machines that are hosting the content.
The cards ain't worth a dime if you don't lay 'em down
In trying to explain FEC encoding, something he admits is difficult, Swarmcast inventor and lead developer Justin Chapweske draws the analogy of a poker game. "It's hard to get a straight flush out of a deck by just drawing random cards. But if every card in the deck is wild, it's easy -- any five cards will do." In other words, Swarmcast just starts asking for packets, randomly, from anyone that has them. When it gets enough packets to reconstruct the file, it decodes them and the file is ready to play. "It's kind of like a hologram, where each packet contains a representation of the whole image."
But when you're randomly grabbing packets, won't you come up with lots of repetition -- multiple copies of the same packet and no copies of other packets? No, explains Chapweske: "We create a set of packets that are much bigger than the original data. The space of unique packets is much larger than the file itself, so the probability of getting the same packet is very low, and we only need a subset of all packets to reconstruct the original file."
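To make the "every card is wild" idea concrete, here is a toy sketch in Python. The article doesn't say which FEC code Swarmcast actually uses, so this substitutes a simple random linear code over GF(2): each packet is the XOR of a random subset of the file's blocks, and the receiver solves for the blocks once it holds enough independent packets. All function and class names are invented for the example, and the tiny block size is for illustration only (Swarmcast's packets are 32K).

```python
import random


def split_blocks(data, block_size):
    """Split data into fixed-size blocks, each stored as a big integer."""
    padded = data + b"\x00" * (-len(data) % block_size)
    return [int.from_bytes(padded[i:i + block_size], "big")
            for i in range(0, len(padded), block_size)]


def encode_packet(blocks, rng):
    """Return (mask, payload): the XOR of the blocks selected by the mask bits."""
    mask = rng.randrange(1, 1 << len(blocks))  # random non-empty subset
    payload = 0
    for i, block in enumerate(blocks):
        if (mask >> i) & 1:
            payload ^= block
    return mask, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2)."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # highest set bit of mask -> (mask, payload)

    def add(self, mask, payload):
        """Reduce a packet against stored rows; True if it carried new information."""
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in self.rows:
                self.rows[pivot] = (mask, payload)
                return True
            row_mask, row_payload = self.rows[pivot]
            mask ^= row_mask
            payload ^= row_payload
        return False  # packet was redundant

    def done(self):
        return len(self.rows) == self.k

    def blocks(self):
        """Back-substitute, lowest block first, to recover the original blocks."""
        out = [0] * self.k
        for pivot in range(self.k):
            mask, payload = self.rows[pivot]
            for i in range(pivot):
                if (mask >> i) & 1:
                    payload ^= out[i]
            out[pivot] = payload
        return out


def roundtrip(data, block_size=8, seed=0):
    """Draw random packets until the file decodes; return (file, packets used)."""
    rng = random.Random(seed)
    blocks = split_blocks(data, block_size)
    decoder = Decoder(len(blocks))
    used = 0
    while not decoder.done():
        decoder.add(*encode_packet(blocks, rng))
        used += 1
    raw = b"".join(b.to_bytes(block_size, "big") for b in decoder.blocks())
    return raw[:len(data)], used
```

Calling roundtrip(b"some file contents") returns the original bytes plus the number of packets it took. In this toy code a file of k blocks has 2^k - 1 possible packets, so exact repeats are rare for any realistic k, and the decoder never cares which particular packets arrive, only that it collects k independent ones.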
Under Swarmcast, the first clients "initially contact the content provider's machine and then look for other machines hosting the content," explains Chapweske. "It contacts those, then crawls across the P2P network looking for others." Once there are "enough peers," clients stop bothering the host altogether.
How many peers are "enough?" Chapweske explains, "It depends on the speed of the content provider -- if the original provider's site is already slow, it will switch immediately to the network; if the site is fast, it won't switch over as quickly. Eventually, the provider will be able to control this."
But here's the most intriguing part: since users are likely receiving different packets, and they can start reencoding packets as soon as they receive them, one user can send packets to other users even while still receiving the file. That means that the effective download rate is not diminished, even though any individual machine might be serving data at a relatively low rate.
Think of Swarmcast as a pyramid of computers. At the top of the pyramid is the original content provider. As Chapweske explains it, the top machine has a throughput rate of 100K/second, and there are two clients receiving packets at 50K/second each. But each of those clients is also sending packets to the other (remember, there's a high likelihood that they each have non-redundant packets) at 50K/second. So each client still gets the file at 100K/second.
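The arithmetic in Chapweske's pyramid generalizes to more than two clients. The Python sketch below is an idealized model, not anything taken from Swarmcast itself: it assumes the provider splits its bandwidth evenly, that every packet a peer relays is non-redundant, and that each client's upload capacity (a parameter invented for the example) is shared evenly among the other clients.

```python
def effective_rate(provider_rate, n_clients, client_upload):
    """Per-client download rate in an idealized swarm (same units as the inputs).

    Each client gets an equal share straight from the provider, plus whatever
    each of the other clients can relay of its own share.
    """
    share = provider_rate / n_clients
    if n_clients == 1:
        return share  # nobody else to trade packets with
    # A peer cannot relay more than its own share, nor more than its upload
    # capacity divided among the other n - 1 clients.
    relay_per_peer = min(share, client_upload / (n_clients - 1))
    return share + (n_clients - 1) * relay_per_peer
```

With the article's numbers, effective_rate(100, 2, 50) gives 100: 50K/second from the provider plus 50K/second from the other client. If the clients' upload capacity can't keep up, for example effective_rate(100, 4, 30), the rate falls to 55, which is still more than double the 25 each would get from the provider alone.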
Swarmcast currently runs on Java Web Start, a cross-platform plug-in system. That's a 5 MB download and it's the part of the process OpenCola isn't happy with right now. "The Swarmcast technology is ready; we're improving a lot of the deployment capabilities right now," Chapweske says. After you've installed Java Web Start, Swarmcast itself is about a 700K download. The installation process on the Swarmcast site is pretty straightforward. Once installed, you can click on a link to start downloading a sample 20 MB movie trailer using Swarmcast.
The site also features a form for content creators to make their content available on Swarmcast. The site generates a URL that the content provider can put on their site. "We're not publishing any of this information about registered files. We're trying to avoid this turning into a bunch of kiddies turning this into a file-sharing thing."
OpenCola's long-awaited Folders (latest release date: before end of Q2) will use Swarmcast as a content distribution mechanism. For more on Folders, see Andy Oram's profile on OpenCola.
Swarmcast is distributed under the GNU General Public License.