Suggestions for new infrastructure
Enough introduction. The meat of my talk is a list of issues that I'd like free software developers to work on.
A low-overhead identification system
Let's start at the first stage of a typical peer-to-peer application. The various actors have to find each other, which requires identification, routing, and resource discovery. We know that instant messaging systems and Napster had weak solutions to this problem, because they depended on one mega-database and a single or replicated server. Better solutions can be seen in standard email and Jabber, because they rely on servers that are normally close to the user. Each user could even run his or her personal server, theoretically. This is not a full solution, though, because each user still depends on a single server.
When I ask researchers about providing distributed identification, they all say "Chord!", referring to a research project at M.I.T. It does look very elegant. The research papers stress the speed of a look-up (it takes O(log n) time, where n is the number of nodes) and how robustly the system responds to host failures, but I haven't seen anything about its memory usage or the overhead of maintaining the system as peers enter and leave.
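To make the look-up claim concrete, here is a minimal sketch of a Chord-style ring in Python. It is a toy, not the M.I.T. code: the ring is static (no peers join or leave), the identifier size M and the names ident, Ring, and lookup are assumptions chosen for illustration, and everything lives in one process. Each node keeps M "finger" entries, and a look-up jumps to the closest finger that precedes the key.

```python
import hashlib
from bisect import bisect_left

M = 16            # identifier bits (assumption); the ring has 2**M slots
RING = 2 ** M

def ident(name):
    """Hash a name onto the identifier ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]; a == b means the whole ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b), wrapping past zero if needed."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Ring:
    def __init__(self, node_ids):
        self.nodes = sorted(set(node_ids))
        # finger[n][i] is the first node at or after n + 2**i: M entries per node.
        self.finger = {
            n: [self.successor((n + 2 ** i) % RING) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, key):
        """First node clockwise from key (global view, used to build fingers)."""
        i = bisect_left(self.nodes, key)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start, key):
        """Return (owner, nodes_consulted), using only each node's own fingers."""
        node, hops = start, 0
        while True:
            succ = self.finger[node][0]
            hops += 1
            if in_half_open(key, node, succ):
                return succ, hops
            # Forward to the closest finger that still precedes the key.
            node = next(
                (f for f in reversed(self.finger[node]) if in_open(f, node, key)),
                succ,
            )
```

Calling Ring(ident(f"peer-{i}") for i in range(64)).lookup(start, ident("alice")) returns the node responsible for "alice" and the number of nodes consulted along the way. The M entries per node hint at the memory cost, but the sketch says nothing about the traffic needed to keep fingers correct as peers enter and leave, which is exactly the open question.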
The IETF (Internet Engineering Task Force) is also working on mobile computing, but its solution is oriented toward mobile phones--a very particular solution designed for a particular industry. It embodies all kinds of assumptions tied to cellular phone systems: assumptions about the distribution of servers, the authentication between server and client, and so on.
What I'd like to see is a flexible identification system for mobile and intermittently connected users that doesn't tie the user to one server, but allows a user to specify multiple alternative base systems. That is, you can register with several systems that have fixed IP addresses or DNS names, and then tell people to check with those systems to find you. If one system is offline or out of date, one of the other systems should know.
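A minimal sketch of that idea in Python, under stated assumptions: the BaseSystem class, its register and query methods, and the timestamp-based tie-break are all hypothetical, and the "network" is simulated in memory. A client asks every base system the user has named, skips the ones that are unreachable, and trusts the most recently updated record.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Presence:
    address: str     # where the user can currently be reached
    updated: float   # time of the registration, in seconds

class BaseSystem:
    """Hypothetical registrar with a fixed DNS name (simulated in memory)."""
    def __init__(self, name: str):
        self.name = name
        self.online = True
        self.records: Dict[str, Presence] = {}

    def register(self, user: str, address: str, now: float) -> None:
        self.records[user] = Presence(address, now)

    def query(self, user: str) -> Optional[Presence]:
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.records.get(user)

def locate(user: str, bases: List[BaseSystem]) -> Optional[Presence]:
    """Ask every base system and keep the freshest answer."""
    best = None
    for base in bases:
        try:
            record = base.query(user)
        except ConnectionError:
            continue  # offline base system: try the next one
        if record is not None and (best is None or record.updated > best.updated):
            best = record
    return best
```

If one base system is offline and another holds a stale address, locate still returns the newest record any reachable system knows about. A real design would also need authentication and some defense against a base system that lies about timestamps, which this sketch ignores.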
This system would be layered on top of DNS. DNS is great at what it does, and my suggestion would add an extra hop, or several hops, to support the idea of presence for the mobile user. Perhaps a system involving lots of hops would help to hide the communication from surveillance, as onion routing does. But that's not its main goal.
A new, network-aware routing layer
Routing is another area stretched by a lot of peer-to-peer systems. The designers have found that routing at Layer 3 is not sophisticated enough for their kinds of applications, even once they've found someone's IP address.
For instance, take the common application of file sharing, and assume that one person who has the file you want is connected to the same router through the same ISP, while another person is on a different continent. Which would provide the file more efficiently?
Peer-to-peer applications demand more network-aware routing. It would be nice to know what the response time and throughput on various routes were before choosing how to send a large chunk of data. Large businesses and ISPs do this now through Route Optimization, but decisions are made at the granularity of the interface. If the business wants different Quality-of-Service for different traffic, it has to route traffic through different interfaces. Peer-to-peer routing should take place at a much finer granularity, such as the individual transfer.
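As a rough illustration of a per-transfer decision, the Python sketch below picks a source for a file from measured response time and throughput. The PeerStats fields, the sample figures, and the cost model (one round trip plus size divided by throughput) are assumptions made for the example; a real system would have to measure these values and cope with their changing over time.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class PeerStats:
    name: str
    rtt_s: float            # measured round-trip time, in seconds
    throughput_bps: float   # measured throughput, in bytes per second

def estimated_transfer_time(peer: PeerStats, size_bytes: int) -> float:
    """Crude cost model: one round trip, then the payload at the measured rate."""
    return peer.rtt_s + size_bytes / peer.throughput_bps

def choose_peer(peers: Iterable[PeerStats], size_bytes: int) -> PeerStats:
    """Pick the peer with the lowest estimated time for this particular transfer."""
    return min(peers, key=lambda p: estimated_transfer_time(p, size_bytes))
```

The choice depends on the size of the transfer: a nearby peer with a slow uplink can win for a tiny request and lose for a large file, which is one reason a decision made once per interface is too coarse.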
This project could benefit from standards, and even a new networking layer right below the application layer. Some intelligence may be specific to each application, so there may be a limit to how much we can standardize application routing. It certainly deserves research.
Routing requires an identification system, so you can specify where you want traffic to go. Correspondingly, an identification system isn't any good unless you have a way to route traffic from one identified party to another. So identification and routing are actually the same problem.
Protocols for data exchange and developing structure
In the next step of peer-to-peer computing, the peers need to exchange data, and that involves protocols on many levels. The main protocol trying to establish itself as a standard right now is JXTA. I don't know whether it will succeed. What Sun did is release a reference platform, and you know what those are like. JXTA seems to be even cruder, less efficient, and buggier than most reference platforms because Sun rushed it out. So whenever I talk to a company working on a peer-to-peer product, I ask them about JXTA, and they always say, "We're looking at it."
A few days before coming here, I saw an announcement for "the first comprehensive framework based on Project JXTA technology" (VistaRepository from VistaPortal Software). That's a good sign.
Web services are hot right now, but their infrastructure has turned out big and scary, which is not what the Web was meant to be. XML-RPC was elegant, but SOAP tries to cover every eventuality. You hardly get a chance to say what you were going to say by the time you've said everything you have to say about what you're going to say. And in trying to solve the presence and discovery problems, Microsoft and IBM and others have created an enormous superstructure resembling CORBA or COM, all implemented between angle brackets. Well, it may serve a need right now; it's nice to be able to add some automation to your Web site. But I can't believe that Web services are the path peer-to-peer will take in general. Besides, I don't believe in port 80 pollution.
Some of the necessary protocols are meant for structuring content. Because XML provides a universal way to structure content, it's about the most pervasive technology in computers right now. Structure lets computers be intelligent about content: they can sort it by various criteria, extract what a person cares about while ignoring crap they don't care about, find commonalities among people by checking public profiles, and do a million other things.
What's special about metadata in peer-to-peer is that users should be able to invent their own tags; these tags should emerge through a grassroots, bottom-up process. Let's say you have a political discussion and people would like to rate politicians. They may come up with tags to express the politician's commitment to the environment, his or her attitude toward immigration, and so on. I would love to see systems that help people negotiate what tags they should use for any occasion.
When we think about supporting peer-to-peer applications, we can't stop with standard computer systems; this is going to become a world of small devices. I find the idea of having an Internet-connected sensor attached to my furnace a little frightening. If I screw up on my computer, I can reload my back-up disk, but if I screw up on my furnace, my neighbors may not live to forgive me. As the American humorist Dave Barry wrote, "I don't want my appliances to be smarter than me. They should be stupider than me, like my politicians and my children."
But people will insist on using devices and sensors, so we should be there to support them. There are some fairly well known issues you have to address to write applications appropriate for devices, such as adopting a Model-View-Controller design pattern so elements can be reused in very different visual or non-visual environments.
What I think will make or break the device market is ease of customization. People should be able to program their devices; they should be able to write a script and download it to the device. So I think devices provide an exciting new platform for scripting languages. And the best scripting languages are free software; somebody should port them to new devices as they come along. Or develop new languages.
The reason I talk about new languages is that applications for devices have different needs from standard computers. On the furnace device, for instance, I imagine that date and time are critical concepts and should be easy to manage. I use the Date/Time module in Perl all the time for my own applications, but it's rather bloated for a device. Maybe it's time for new languages or at least new libraries for devices.