
Next-Generation File Sharing with Social Networks

by Robert Kaye

Editor's note: At the recent O'Reilly Emerging Technology Conference in San Diego, CA, Robert Kaye led a talk on Next Generation File Sharing with Social Software. For those who were able to attend, this essay builds upon that session. And if you missed the talk altogether, you can now get up to speed.

Open file sharing systems like Kazaa welcome everyone on the net and enjoy a broad selection of content. The selection is so vast that Cory Doctorow calls it "The largest library ever created." (Personally, I'd call it the "largest and messiest library ever created," but that is another essay entirely.) However, this vast selection comes with a significant risk attached -- outside attackers who want to stop you from sharing files and would like to throw you in jail or pilfer your college fund.

The natural reaction is to run away and hide from the bad guys and play in your own sandbox that the bad guys cannot even see. Due to the recent massive lawsuit waves from the bad guys, there is more talk than ever about Darknets, which are networks that hide themselves and their members from public view.

Combining file sharing applications with social networks enables people to create a trusted network of their friends to keep out the bad guys. The definition of bad guys is up to the user to determine -- in a lot of cases, the bad guys would be the lovely folks slinging lawsuits. But these networks can easily be used for legitimate non-infringing uses, such as sharing personal information with a network of friends while keeping it out of reach of marketers and identity thieves.

Social networks designed for file sharing should focus on three goals: share your files with others in your network, discover new files from other members, and protect the network from outside attackers. To achieve these goals, the social network needs to be founded on a well-defined social model.

Social Models

To find social models that can be employed for these next generation networks, we can look toward human evolution. Jared Diamond's perspective on human evolution, as told in Guns, Germs, and Steel, points out that humans first formed hunter-gatherer tribes in order to share the burden of food production. As tribes grew in size, they combined to create chiefdoms, and from there they created states like those in which we live now.

To apply this concept, the network starts with a group of trusted people forming a tribe. Starting a tribe as a friendnet, where each connection is backed up by a meatspace connection, is an excellent starting point. However, sharing files inside of a small tribe is only interesting for a short while because it presents a limited search horizon. If tribes connect with other tribes to form chiefdoms, the search horizon expands with each new connection in the chiefdom. Finally, connect chiefdoms to other chiefdoms to form states, and the search horizon may start to look similar to the search horizons in open file-trading systems.
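The way linking tribes widens the search horizon can be sketched as a graph walk. This is only an illustration: the tribe names, member names, and the `search_horizon` helper are hypothetical, and a real network would also apply the social policies discussed below before following a link.

```python
from collections import deque

def search_horizon(start_tribe, links, members):
    """Return every member reachable from start_tribe via tribe-to-tribe links."""
    seen = {start_tribe}
    queue = deque([start_tribe])
    while queue:
        tribe = queue.popleft()
        for neighbor in links.get(tribe, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    reachable = set()
    for tribe in seen:
        reachable.update(members.get(tribe, ()))
    return reachable

# Hypothetical data: two linked tribes form a chiefdom, one tribe stands alone.
members = {"jazz": {"ann", "bob"}, "film": {"cat", "dan"}, "code": {"eve"}}
links = {"jazz": {"film"}, "film": {"jazz"}}

chiefdom = search_horizon("jazz", links, members)  # members of jazz and film
alone = search_horizon("code", links, members)     # only the code tribe itself
```

Each new link between tribes adds the other tribe's members (and whatever they share) to the set a search can reach.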

Each tribe should carefully select tribal elders who will set the tone of the network and determine social policies for the network. The elders should be aware of the tribal members and their strengths and weaknesses in order to set policies that are effective for the group. The elders should focus the tribe on its primary goals and continually evaluate the state of the tribe to ensure that its members are well educated on the tribal policies.

Tribal elders must be aware that outside attackers can use social attacks on the network. For instance, if a number of members of a movie-swapping tribe are hanging out at their local coffee shop, they should be aware that attackers may appear as smooth-talkers with lots of knowledge about movies and claims of having a large collection of relevant movies. If one tribal member falls for the attack and invites the attacker into the network, the entire network is at risk. We'll go into the risks from attacks in more detail later, but tribal elders need to understand these risks and educate their tribe to act accordingly.

The tribal elders are the guardians of the network who should use their awareness of the network and its members to continually reevaluate the relationships between members and other tribes. These elders should select or design the appropriate social policies for their tribes and oversee privileges of their members as members establish (or destroy) their reputations.

Social Policies

Social policies dictate who can be invited to the network. How must the reputation of a potential member be verified, if at all? What other tribes can this tribe link to and trade with? Is it OK for the tribe to end their questions in prepositions? What structure is appropriate for the tribe: a loose collaboration or a rank-and-file hierarchy?

All of these questions will influence the social policies of the network, and unlike open file-trading systems, care must be exercised when creating and expanding networks that are designed to keep out attackers. The social policies of these networks have a direct impact on the security of the network. A loose network with few rules and lax reputation verification is more susceptible to compromise. A tight network with many access controls will be more secure, but it will have more restricted search horizons. The key for the tribal elders is to pick a set of policies that balances security with the utility of the network.

The social policies also determine what sort of social network will be created. Loose connection policies will yield more chaotic systems that look like Friendster, and more refined policies will yield systems that resemble LinkedIn. Social policies will need to address the most pressing social issues before they arise. For instance, Friendster should have anticipated Fakester accounts and set a policy for these accounts before it ever opened its doors. Changing terms of service and social policies radically after a network has been formed only serves to alienate its users.

Search Horizons

One of the drawbacks to using a social network to enable file sharing is that the search horizons will be more limited than those of Kazaa, Napster, et al. There will be fewer people in the network and you will not have terabytes upon terabytes of data. But is having the world's largest, messiest, and most duplicated library going to help you discover new items of interest?

Not likely -- I think that file sharing through social networks enables users to explore their strong and weak ties. Random connections in P2P networks are not even weak ties -- they are random ties. Exploring the weak ties in your network is likely to give you access to more relevant information/content than a random tie. People tend to associate with friends with whom they share some common bond, and this common bond is likely going to result in some shared tastes.

Perhaps these social networks can shift users away from an "I'm looking for this track!" mentality to a "What are my friends listening to?" mentality. Napster exemplified the focus on quantity; it is time to consider quality above quantity and use the network for discovery as well as sharing.

Architecture: Central Server

At the heart of this system lies a central server that implements the social network features. This server would implement a generic social network system via web services that could be used to create open social networks like Friendster, or Darknet applications like underground apple-pie recipe trading. This central server would be used for identification, authentication, availability, and network relationships of users. The server should not know what the social network is being used for -- a legitimate application should look exactly the same as an infringing application to the outside world.

P2P advocates will be quick to point out that a central server is a weak link in the system -- both from a technical and an outside attacker perspective. Granted, the server is a central point of failure, but so far, algorithms that implement a distributed web-of-trust have not come of age. As far as I can see, there isn't a solid solution for implementing a distributed social network that is resistant to outside attacks -- yet.

From a legal attacker's perspective, the central server presents no useful information. Should a server be compromised, the legal attacker would find no proof that any illegal activities were happening. In fact, the central server should contain no incriminating or otherwise useful information about the social network. The most useful thing gleaned from the central server would be the IP addresses of other members of the network -- that's all.

This approach has two other benefits. First, legal attackers cannot use direct or vicarious infringement attacks on the server, since the server cannot know whether the networks are used for infringing purposes. Second, the central server solves the pesky P2P bootstrapping problem of finding the network to join: it can give clients the IP addresses of other members who are currently online.
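To make the "blind" server concrete, here is a rough sketch of a presence directory that records only opaque network IDs, member IDs, and addresses. The class name, method names, and timeout value are assumptions for illustration, and authentication is deliberately left out.

```python
import time

class PresenceDirectory:
    """Tracks which members of which network are online, and nothing else."""

    def __init__(self, timeout_seconds=300):
        self.timeout_seconds = timeout_seconds
        # (network_id, member_id) -> (address, time of last announcement)
        self._last_seen = {}

    def announce(self, network_id, member_id, address, now=None):
        """Called periodically by a client to say it is still online."""
        now = time.time() if now is None else now
        self._last_seen[(network_id, member_id)] = (address, now)

    def online_peers(self, network_id, asking_member, now=None):
        """Return addresses of other recently seen members of the same network."""
        now = time.time() if now is None else now
        peers = []
        for (net, member), (address, seen) in self._last_seen.items():
            fresh = now - seen <= self.timeout_seconds
            if net == network_id and member != asking_member and fresh:
                peers.append(address)
        return sorted(peers)

# Hypothetical usage with fixed timestamps so the behavior is easy to follow.
directory = PresenceDirectory(timeout_seconds=300)
directory.announce("net-1", "ann", "10.0.0.5:2222", now=1000)
directory.announce("net-1", "bob", "10.0.0.9:2222", now=1100)
directory.announce("net-2", "cat", "10.0.0.7:2222", now=1100)
peers_for_ann = directory.online_peers("net-1", "ann", now=1200)
```

Nothing stored here says what a network trades, so a seized directory would reveal addresses and little more.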

Architecture: P2P Client

To build a P2P client for this network, an existing client could be employed or a new one could be developed. Everything learned from P2P research in the last few years could be applied to creating a high-tech client that uses best-of-class software like BitTorrent and Kademlia. Given how many good P2P systems are floating about the world today, it is clear that this is not a difficult problem.

The P2P client could employ a Gnutella-like query-routing protocol or use external identifiers like Bitzi's Bitprints, MusicBrainz identifiers, or IMDB identifiers, coupled with a distributed hash table like Kademlia.
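One way to read the identifier-plus-DHT option: hash the external identifier into the key space and keep the list of sharers on the nodes whose IDs are closest under Kademlia's XOR metric. The sketch below shows only that metric; it omits k-buckets and iterative lookups, and the peer names and identifier string are made up.

```python
import hashlib

def key_for(identifier: str) -> int:
    """Map a node name or content identifier to a 160-bit integer key."""
    digest = hashlib.sha1(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def closest_nodes(content_id, node_names, count=2):
    """Pick the nodes whose keys are nearest to the content key by XOR distance."""
    target = key_for(content_id)
    ranked = sorted(node_names, key=lambda name: key_for(name) ^ target)
    return ranked[:count]

# Hypothetical peers and a placeholder identifier.
nodes = ["peer-a", "peer-b", "peer-c", "peer-d"]
holders = closest_nodes("example-track-identifier", nodes)
```

Because every client computes the same ranking, any member can work out which peers to ask about a given identifier without flooding the network with queries.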

The client should undoubtedly create BitTorrent trackers automatically to maximize the bandwidth utilization of the file-sharing clients.

No rocket science here, move along.

Invitations and Detection Avoidance

To join a social file-sharing network, you will need an invitation from an existing member. Invitations are simply small XML files that contain the right keys for joining the network. The invitations may also specify the right parameters for finding the network, since Darknets do their best to not operate out in the open.
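As a rough illustration of such an invitation file, the sketch below builds and reads one with Python's standard XML library. The element and attribute names (`invitation`, `network`, `server`, `key`) and the sample values are invented here; the article does not specify a schema.

```python
import xml.etree.ElementTree as ET

def build_invitation(network_id, server_host, server_port, invite_key):
    """Serialize a hypothetical invitation to an XML string."""
    root = ET.Element("invitation")
    ET.SubElement(root, "network").text = network_id
    server = ET.SubElement(root, "server")
    server.set("host", server_host)
    server.set("port", str(server_port))
    ET.SubElement(root, "key").text = invite_key
    return ET.tostring(root, encoding="unicode")

def parse_invitation(xml_text):
    """Read the fields back out of an invitation string."""
    root = ET.fromstring(xml_text)
    server = root.find("server")
    return {
        "network": root.findtext("network"),
        "host": server.get("host"),
        "port": int(server.get("port")),
        "key": root.findtext("key"),
    }

# Placeholder values only.
invitation_xml = build_invitation("net-1", "server.example.org", 2222, "c2VjcmV0")
invitation = parse_invitation(invitation_xml)
```

A real client would presumably also check that the invitation came from an existing member before trusting its contents.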

First off, all traffic flowing through the social network, including file transfers, should be tunneled via SSH, so that someone sniffing your network connection cannot tell the difference between a legitimate VPN connection and an infringing trade of the hottest apple-pie recipe.
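A minimal sketch of the tunneling idea, assuming the peer runs a standard OpenSSH server and its file-sharing service listens only on the peer's loopback interface. The helper name, host, user, and port numbers are hypothetical; `-N` (no remote command), `-p` (SSH port), and `-L` (local port forward) are ordinary OpenSSH client options.

```python
def tunnel_argv(peer_host, service_port, local_port, user, ssh_port=22):
    """Build the ssh command line for a local port forward to a peer's service."""
    forward = f"{local_port}:127.0.0.1:{service_port}"
    return ["ssh", "-N", "-p", str(ssh_port), "-L", forward, f"{user}@{peer_host}"]

# The client would hand this list to subprocess.Popen and then talk to
# 127.0.0.1:9000; on the wire, an observer sees only an SSH session.
argv = tunnel_argv("peer.example.org", service_port=6881, local_port=9000, user="ann")
```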

Second, the applications that form the social network should attempt to blend into the landscape and either be invisible or indistinguishable from normal network infrastructure, such as an SSH server. The easiest way to do this is to operate on the same port as the SSH server itself. A more complicated approach, port knocking, was recently proposed on Slashdot -- it requires a series of predetermined failed connection attempts to the server before the server opens the real port for the client.
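The bookkeeping behind port knocking can be sketched as a small state machine. This is an illustrative simplification: the class name and the knock sequence are made up, timeouts are ignored, and a real deployment would watch firewall logs or packet captures and then open the port with a firewall rule.

```python
class KnockTracker:
    """Tracks, per client, how much of the secret knock sequence has been seen."""

    def __init__(self, sequence):
        if not sequence:
            raise ValueError("knock sequence must not be empty")
        self.sequence = tuple(sequence)
        self._progress = {}  # client address -> number of correct knocks so far

    def knock(self, client, port):
        """Record one failed connection attempt; True once the sequence completes."""
        position = self._progress.get(client, 0)
        if port == self.sequence[position]:
            position += 1
        elif port == self.sequence[0]:
            position = 1  # a wrong knock that happens to restart the sequence
        else:
            position = 0
        if position == len(self.sequence):
            self._progress.pop(client, None)
            return True  # the server would now open the real port for this client
        self._progress[client] = position
        return False

# Hypothetical sequence shared with members, for example via the invitation.
tracker = KnockTracker([7000, 8000, 9000])
right_order = [tracker.knock("1.2.3.4", port) for port in (7000, 8000, 9000)]
wrong_order = [tracker.knock("5.6.7.8", port) for port in (8000, 7000, 9000)]
```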

Another approach is port changing, where the server and the client frequently switch the ports on which they listen for connections. The invitation could include the parameters needed to calculate which port a server would be listening on at any given time.
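One possible way to calculate the port, assuming the invitation carries a shared secret and both sides have roughly synchronized clocks: derive the port from an HMAC of the current time window. The function name, the ten-minute interval, and the port range are assumptions, not something the article specifies.

```python
import hashlib
import hmac
import struct

def current_port(secret: bytes, unix_time: float, interval=600, low=20000, high=60000):
    """Derive the listening port for the time window containing unix_time."""
    window = int(unix_time // interval)
    message = struct.pack(">Q", window)  # window number as 8 big-endian bytes
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big") % (high - low)
    return low + offset

# Client and server call this with the same secret; within one window they agree.
secret = b"secret-from-invitation"  # placeholder value
port_a = current_port(secret, 1200)
port_b = current_port(secret, 1799)  # same ten-minute window as 1200
```

Without the secret, an observer cannot predict the next port, and clients whose clocks drift slightly could also try the ports for the neighboring windows.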

Regardless of which technique is employed, the goal is the same: outside attackers see nothing but SSH connections.

Security

The applications that make up the social network should employ standard off-the-shelf tools like SSH, PGP, and BitTorrent. After all, these tools specialize in their respective areas, and it is not wise to reinvent the wheel -- especially when it comes to security. Any network connection made by the file-sharing software should be tunneled via an SSH connection.

The baseline security model of this software should be to fall back to the same security as an open system in the case of a system compromise. If the system is busted wide open for some reason, only the IP addresses of the participants should be exposed. In today's legal climate, having solely an IP address forces the attackers to file anonymous John Doe lawsuits. This is exactly the same procedure reserved for people who use open systems like Kazaa.

This fact gives users of social software file-sharing applications a leg up on file sharers using Kazaa. Mounting an attack on Kazaa users requires only freely available and easy-to-use network tools. Mounting an attack on a network fortified with SSH requires vastly different tools, and a brute-force attack is out of the question. Thus, the attackers are more likely to stick to pursuing the users of open file-trading applications.

The most vulnerable part of a social network is the users themselves. As security experts have been saying for a long time, most successful attacks are not technical attacks, but attacks that exploit the weaknesses of the users. Passwords jotted down in insecure locations, or smooth-talking attackers convincing users of their benign nature, present a far greater weakness than the SSH protocol.

Ultimately, the security of the network lies in the hands of the users. This is why the social policies set by the tribal elders are so important -- the policies affect the mindset of the user, which in turn affects their behavior. Social policies that permit promiscuous behavior can lead to security breaches.

Attack Model

Analyzing a social file-sharing network gives us three possible attacks:

  1. Server attack: The central server gets hacked, raided by legal attackers, or otherwise compromised. Since the server operates blindly with respect to what the clients are doing, the server contains no incriminating evidence. The attacker cannot tell a recipe-trading network from a movie-trading network. At worst, the IP addresses of the members can be exposed and those must be pursued with a John Doe lawsuit.
  2. Client attack: A client gets hacked, raided by legal attackers, or otherwise compromised. The compromised client could potentially continue operating and collect the IP addresses of everyone in the network. Incriminating behavior could be observed.
  3. Social client attack: An attacker gets invited to the network and starts participating in the network. Over time, the attacker can collect all of the IP addresses of the members and possibly observe incriminating behavior.

At worst, the server attack yields the IP addresses of users who may not have committed any infringement. Client attacks expose the IP addresses and possibly allow the attacker to observe infringing activities. While this model may seem catastrophic, it's better than the open P2P system model that Kazaa uses. Given that attackers are likely to attack the easy targets first, using a social network to share files presents an increased level of security -- for now.

Should a time come when all open systems have been eradicated, this system will need extra fortification. As the much-discussed web-of-trust and anonymization algorithms come of age, they should be adapted for use with social file sharing to continually improve the attack resistance of these networks.

Conclusion

Over the past few years, we've learned a number of legal and technical lessons that allow us to build more secure and effective file-sharing systems today. Using detection-avoidance schemes and common security tools like SSH and PGP forces the attackers to take a different tack when attacking next-generation file-sharing systems. Attackers must now employ social attacks to take down file-sharing systems, and social attacks don't scale as well as online attacks that can be assisted with computer tools.

The security model presented here is only sufficient for a limited time -- over time, more advanced web-of-trust algorithms should be used to further mitigate the damage of a compromised network.

Finally, it should be stressed again that the security of a social network grows out of the social policies set for the network. Tribal elders and members of the network need to be continually vigilant to keep the network safe from outside attackers.

Recommended Reading

Guns, Germs, and Steel: The Fates of Human Societies by Jared Diamond

Smart Mobs: The Next Social Revolution by Howard Rheingold

Urban Tribes: A Generation Redefines Friendship, Family, and Commitment by Ethan Watters

File Sharing goes Social by Clay Shirky

Robert Kaye is the Mayhem & Chaos Coordinator and creator of MusicBrainz, the music metadata commons.
