Tapping the Matrix, Part 2
by Carlos Justiniano
Editor's note: In part one of this two-part series on harnessing the idle processing power of distributed machines, Carlos Justiniano explained the current trends in this exciting technology area and drilled down into specifics such as client/server communication, protocols, server design, databases, and testing. In today's conclusion, he covers network failures, security, software updates, and backup.
Network Failures: Expecting the Unexpected
Building network software can seem deceptively simple, especially to less experienced developers. In network programming, the complex interplay of humans, computers, and networks can produce unexpected behavior with catastrophic results, such as a server crash or, worse, a network-wide outage. The key to surviving catastrophes is to plan for them.
Consider these issues:
- The SuperNode server may send a task to a PeerNode client, which gladly accepts the task. Later, its user stops the client software before a result can be returned.
- A PeerNode client may be in the process of returning a result when it loses its network connection.
- The SuperNode server may collapse under the pressure of PeerNode connections, resulting in a crash, which leaves thousands (and possibly, millions) of PeerNodes unable to connect.
Unless carefully planned for, such problems can ultimately destroy a distributed computing project.
It's essential to distribute the same work units to many PeerNodes. If one PeerNode does not return a result, perhaps some other PeerNode will. Should all of the PeerNodes working on the same task fail to return a result, then the work unit is simply marked as incomplete and will be sent to another batch of PeerNodes at a later time. The use of redundancy creates a robust system, where the project is not dependent on whether or not an individual PeerNode completes a task.
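The redundancy scheme above can be sketched in a few lines. This is an illustrative sketch, not the project's actual code: the `WorkScheduler` class, the `REDUNDANCY` constant, and the method names are all assumptions introduced here.

```python
import random

# Hypothetical sketch of redundant work-unit dispatch. Each work unit is
# sent to several PeerNodes; if none of them returns a result, the unit
# is reported as incomplete so it can be re-queued for another batch.
REDUNDANCY = 3  # number of PeerNodes that receive each work unit

class WorkScheduler:
    def __init__(self, work_units):
        self.assigned = {unit: set() for unit in work_units}  # unit -> peers
        self.results = {}                                     # unit -> result

    def assign(self, unit, peers):
        """Pick up to REDUNDANCY peers at random for one work unit."""
        chosen = random.sample(peers, min(REDUNDANCY, len(peers)))
        self.assigned[unit].update(chosen)
        return chosen

    def report(self, unit, peer, result):
        """A PeerNode returned a result; the first answer is kept."""
        self.results.setdefault(unit, result)

    def incomplete_units(self):
        """Units no peer has answered yet; these are sent out again later."""
        return [u for u in self.assigned if u not in self.results]
```

Because any single peer may vanish without returning a result, the scheduler never depends on a particular PeerNode: it only cares whether *some* assignee of each unit reported back.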
You must also take into account the potential failure of a SuperNode server. How will PeerNode clients respond if a SuperNode is no longer available? PeerNode clients must be designed to handle the case of unreachable SuperNodes. If a PeerNode client simply stopped working (or crashed) the moment a SuperNode server became unreachable, your entire network might fall apart.
One way to address this problem is to design your PeerNode so that it accepts connection failures and gracefully retries at a later time. Additionally, build your PeerNode client so that it's able to connect to Internet addresses by name, such as node01.distributedchess.net or node02.distributedchess.net. If a connection to node01.distributedchess.net fails, then the PeerNode will try node02. PeerNodes can also maintain a list of SuperNode servers and migrate to the next available server as needed. Should a SuperNode server fail, PeerNodes will behave like a swarm of bees changing direction on their way to another destination.
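A minimal sketch of that failover behavior might look like the following. The port, timeouts, and retry counts are assumptions, and the hostnames are the illustrative ones from the scenario above, not a real service.

```python
import socket
import time

# Illustrative SuperNode failover: try each server by name, and if all
# are unreachable, back off and gracefully retry later.
SUPERNODES = ["node01.distributedchess.net", "node02.distributedchess.net"]

def connect_to_supernode(hosts, port=9000, retry_delay=30, attempts=2):
    """Return a socket to the first reachable SuperNode, or None."""
    for _ in range(attempts):
        for host in hosts:
            try:
                # Connected: the swarm settles on this server.
                return socket.create_connection((host, port), timeout=10)
            except OSError:
                continue          # unreachable: migrate to the next SuperNode
        time.sleep(retry_delay)   # all servers down; retry after a pause
    return None
```

Callers treat a `None` result as "try again later" rather than a fatal error, which is what keeps a temporary SuperNode outage from taking the whole swarm down with it.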
Security
Security plays a vital role in many aspects of a DC project. Both project organizers and participants have valid concerns regarding security. Participants have concerns about their privacy and their machine's susceptibility to viruses, and many wonder if using DC software makes their machines easier targets for attackers. On the other side of the fence, project developers have concerns that attackers may find ways to tamper with results, invalidating received work units.
Developers are faced with securing a project from a number of vantage points. A first order of business is to examine points of vulnerability. Where can an attacker cause harm? Which aspects of the project can be exploited and otherwise abused by participants?
Project participants download PeerNode client software from a project's web site, so it makes sense that the project's web server is an obvious target. If an attacker can penetrate a site and replace the downloadable PeerNode clients with compromised versions, then many machines will become infected.
Fortunately, a considerable amount of research has been done to address server security. Intrusion detection systems (IDSes) use sophisticated monitoring techniques to detect potential security issues. An IDS can monitor TCP packets to identify when an attacker is performing a port scan or when a denial-of-service attack is underway. IDSes can also monitor system files and user patterns for unexpected behavior (such as a normal user acquiring root level access), and when core system configuration files are modified. You can configure an IDS to send you an email when a problem occurs. Think of it as an early warning system.
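One of the IDS techniques mentioned above, watching core system files for modification, can be sketched simply: record a baseline of cryptographic hashes for the files you care about, then periodically compare. This is a toy illustration of the idea, not a substitute for a real IDS, and every name in it is an assumption.

```python
import hashlib
import os

def fingerprint(path):
    """Return the SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def baseline(paths):
    """Record known-good hashes for a set of watched files."""
    return {p: fingerprint(p) for p in paths}

def modified_files(known_good):
    """Report watched files whose hash no longer matches the baseline."""
    return [p for p, digest in known_good.items()
            if not os.path.exists(p) or fingerprint(p) != digest]
```

A cron job running `modified_files` and mailing any non-empty result would act as the early-warning email the text describes.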
There are thousands (OK, maybe just hundreds) of tools that can be used to monitor network traffic. One such tool is the freely available Ethereal, which uses a packet-sniffing library in order to perform its higher-level functions, such as filtering and display. The same types of underlying tools are available to attackers who can use them to intercept and modify data while in transit.
Take, for example, an attacker wishing to disrupt a DC project. The attacker builds a Trojan software product that masquerades as a useful monitoring and statistics tracking system for end users. The malicious tool performs useful functions while slightly modifying the transmitted results prior to sending them on their way. The tampered data has the potential of completely invalidating the DC project -- resulting in a complete waste of time for all involved. We won't get into the many psychological reasons why some people consider this sort of behavior exciting, but suffice it to say that disrupting a high-profile DC project might offer an attacker icon status in certain circles.
The only hope of protecting your project is to make it difficult for an attacker to modify transmitted data. As with most things, there are easier and harder ways of doing so.
Software developers are sometimes faced with the classic trade-off of space versus performance. The need to protect data may be sufficiently clear; however, the cost of doing so may be prohibitive. Hiding data, rather than fully encrypting it, and using strong validation techniques on both the server and client ends, may offer a suitable compromise.
Data compression can effectively reduce bandwidth requirements, and has a positive side effect of masking the original contents. Applying byte transformations, such as XOR operations and weak reversible data ciphers, will further aid in data hiding. Clearly, data hiding is by no means as secure as data encryption, but may be suitable for use in certain settings.
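The compress-then-transform idea can be sketched as follows. To be clear, this is obfuscation, not encryption; the single-byte XOR key is an illustrative assumption, and a determined attacker would defeat it quickly.

```python
import zlib

XOR_KEY = 0x5A  # illustrative mask value; any reversible transform works

def hide(payload: bytes) -> bytes:
    """Compress (reducing bandwidth), then XOR each byte to mask contents."""
    compressed = zlib.compress(payload)
    return bytes(b ^ XOR_KEY for b in compressed)

def reveal(masked: bytes) -> bytes:
    """Reverse the XOR, then decompress to recover the original payload."""
    return zlib.decompress(bytes(b ^ XOR_KEY for b in masked))
```

Note that compression alone already destroys the recognizable structure of the plaintext, which is why it pairs well with a cheap byte transformation in settings where full encryption is too costly.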
Widely available implementations of popular data-encryption algorithms leave project developers with little reason not to apply some form of data protection. One popular algorithm is the Advanced Encryption Standard (AES), based on the Rijndael cipher (pronounced "rain doll") developed by Belgian cryptographers Joan Daemen and Vincent Rijmen. AES is a symmetric block cipher chosen as a replacement for the aging DES and Triple DES standards that are still commonly used to secure e-commerce. AES is currently used in hundreds of high-end encryption products and is a favorite among developers. Additionally, AES implementations can be found online.
For maximum security, where performance may be less of an issue, the use of public key cryptography is highly recommended. Public Key Infrastructure (PKI) systems use public key encryption to create digital certificates, which are managed by certificate authorities. Certificate Authorities (CAs) establish a trust hierarchy, which can be used to validate authenticity through association. The use of PKI would allow a SuperNode server to authenticate PeerNodes, and PeerNodes to validate that they are indeed communicating with an authentic SuperNode.
Detecting Software Tampering
Project contributors may acquire PeerNode client software from one of many locations. For example, a project team leader might download and place the client software on the team's web site along with specialized instructions for team members.
There is a certain degree of trust associated when the software is downloaded directly from the project's main web site. However, when project software is made available from different locations, project contributors may not be able to trust the origins and validity of the software. For all they know, the software could be a Trojan program. To address this concern, DC software is often posted on a project's site along with a cryptographic hash string, such as:
The string of numbers and letters is the output of a program called md5sum, which generates the string of alphanumeric characters when given a filename as input.
End users can download a project's client software and type:
The output is a string that should match the one posted on the download site.
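The same check can be performed programmatically. The sketch below, with illustrative names, computes the digest md5sum would print and compares it against the string posted on the download site.

```python
import hashlib

def md5_of_file(path):
    """Return the hex digest md5sum would print for this file."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path, posted_digest):
    """True only when the local file matches the posted hash string."""
    return md5_of_file(path) == posted_digest.strip().lower()
```

Reading in chunks keeps memory use flat even for large client installers, and normalizing the posted string tolerates copy-and-paste whitespace and case differences.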
For higher levels of security, some projects sign their files using a private key. Users wanting to validate a file's digital signature can retrieve the project's public key (available online via public key servers) and use it to verify the downloaded file. The signature below is an example of what a contributor might see posted on the project's site.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQA/ye8su9d1K+MjI6sRAtQXAKCgcXahYj1ZcptsXR10WCnSbKs2ggCeK/Qv
4THuyfGOeDEyHiHnHX9pkZw=
=Nj+a
-----END PGP SIGNATURE-----
The important point is that both a cryptographic hash and a digital signature can be used to determine whether a file has been tampered with after it was posted. This allows the software to be widely distributed while still letting end users validate its integrity. By far the most secure method involves the use of digital keys, and that technique is being used in an increasing number of projects.
There is a wealth of security information available on the Internet, and many open source projects demonstrate working implementations. As a DC project developer, you have the responsibility to explore security and protect both your project and your members.
Combating Aging: Software Updates
As computer users, we've grown accustomed to automatic software updates. Companies such as Microsoft, Apple, and Red Hat now offer their customers software updates, and they're not alone: thousands of other software companies do the same.
Updates are released for any number of valid reasons, ranging from program fixes and new features to the latest security Band-Aids. In all fairness, software complexity has increased while time to market has decreased, causing products to be released to an unsuspecting public in a less-than-perfect state. In addition, network-connected software is under siege as attackers attempt to discover and later exploit security flaws. Rapidly distributing software updates has become companies' only real defense against the unexpected.
A real issue facing developers is how to make software updates quick and painless for their customers. Long gone are the days when posting an update on your company's FTP or web site sufficed. Companies are seeing their products used by a wider demographic, where even a "one-click install" is one click too many.
The challenge affects all DC projects to varying degrees. Long-running, static projects often do not require software updates. For highly dynamic projects, however, where the client code is updated in response to ongoing research and development, the need to release continuous updates is far more critical.
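For such projects, the update check itself is simple. The sketch below assumes a hypothetical scheme in which the SuperNode advertises the latest client version alongside its task responses; the version numbers and function names are illustrative.

```python
# Version compiled into this PeerNode client (hypothetical).
CURRENT_VERSION = (1, 4, 2)

def parse_version(text):
    """Turn a dotted version string like '1.5.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))

def needs_update(advertised):
    """True when the server advertises a newer client than we are running."""
    return parse_version(advertised) > CURRENT_VERSION
```

Comparing tuples rather than raw strings avoids the classic trap where "1.4.10" sorts before "1.4.2" lexicographically; when `needs_update` is true, the client can fetch the new binary and, crucially, verify its hash or signature as described earlier before installing it.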