David Anderson: Inside SETI@Home
by Richard Koman
The numbers that the SETI@Home project has racked up are truly amazing. As David Anderson, the SETI@Home lead who now works at United Devices, revealed in his speech at the O'Reilly P2P Conference yesterday, the project has 2.7 million users in 226 countries. "We've accumulated 500,000 years of CPU time. Our rate of computing is 25 teraflops, which is twice the speed of IBM's ASCI White, the fastest supercomputer in the world. We've analyzed 45 terabytes of data." They handle all this with a staff of three to five people.
SETI@Home is the most popular and well-known distributed computing project on the Net, but it was not initially embraced with open arms by the larger Search for Extraterrestrial Intelligence project. "When we asked the SETI Institute for money, not only didn't they give us money, they didn't even like the idea of the project. They didn't think it would work with computers connected with modems. They thought we would get hacked and be used to download viruses, which would discredit the project. They were afraid some overzealous user would write their own analysis program and claim alien contact, thus discrediting the project."
Anderson warns other distributed computing developers to expect a similar fate at the hands of whatever establishment they're dealing with. "IT managers feel threatened by paradigm shifts. The only way to counteract this is to have success stories. Don't set expectations too high. Make sure no disasters happen."
Anderson emphasized outreach programs to users as critical to SETI@Home's success. "We've gone to some lengths to educate people as to what we're doing. We go to a lot of trouble to show the intermediate results ... the screensaver shows results, as does the website. The role of graphics on the screensaver is critical. One of the reasons it spread so fast is someone walks past a PC running the screensaver and sees these red and blue things and asks what it is. The next thing you know everyone in the company is running it.
"Users want to feel acknowledged in some way. They want to get some kind of feedback. Even cheap, meaningless things like a certificate of participation that you download and print out and write in your name -- that generated a lot of goodwill. We set up milestones and put up on the web site the names of people who passed the milestones."
Anderson advocates an open-book strategy toward users. "Be open with your users -- tell them what's going on even if it's not always pleasant. Our servers were often overloaded. We responded by creating a tech news section with a realtime server status page. ... On the other hand, you don't want to tell users too much. There was a virus going around that affected SETI@Home when the screensaver was running. We put that on the site and users interpreted that as meaning SETI was the virus."
SETI@Home is not an open source program. "We didn't have any experience developing software that way. We had concerns about security. We weren't sure how to keep our data private if source is open."
Security is a huge issue, Anderson said. "Fifty percent of resources are devoted to security problems. Should the client be self-updating? We decided no; people are more comfortable if installation of software is under their control. The hard part is when we get data back, how do we know it was computed by our program? People have tried to modify the program to run faster, some have modified it to create incorrect answers, some to make it look like they're doing a lot of work, even though they're not, so they could climb up the leader boards.
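One common way to approach the question Anderson raises (how to know a returned result came from an unmodified client) is redundancy: hand the same work unit to several clients and accept a result only when enough of them agree. The article does not say that SETI@Home works this way; the sketch below is an illustration only, and the function name, the quorum of two, and the tie rule are all assumptions.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_result(results: Sequence[Hashable], quorum: int = 2) -> Optional[Hashable]:
    """Return the value reported by at least `quorum` clients, or None.

    Hypothetical helper: `results` holds the answers that different
    clients returned for the same work unit.
    """
    if not results:
        return None
    ranked = Counter(results).most_common(2)
    top_value, top_count = ranked[0]
    if top_count < quorum:
        return None  # not enough agreement; reissue the work unit
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # two answers tie; treat the unit as unresolved
    return top_value
```

A server using this rule would simply reissue any work unit for which the function returns None, which fits Anderson's point that a missing or wrong result from one client is not disastrous.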
"It's a fortunate part of the problem we're working on that failure to get a result from a client is not disastrous, getting wrong data is not disastrous."
The SETI@Home client is written in C and C++ on Unix. "Our philosophy was, don't rely on other people's software. The program works on 80 different platforms. The code base is more than 90% platform-independent. The main complexity was with Windows -- we wanted it to run as a screensaver or an application or a command-line program. The screensaver and the application talk through shared memory, keeping the computing in one process."
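To illustrate the shared-memory arrangement Anderson describes, here is a minimal Python sketch. The real client is C and C++, and the article gives no details of its memory layout, so the single-float progress record and every name below are assumptions made for illustration. One side (the computing process) writes its progress into a named shared block; the other side (the display) attaches by name and reads it.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: one 8-byte float holding the fraction of the
# current work unit that has been completed.
PROGRESS_FORMAT = "d"
PROGRESS_SIZE = struct.calcsize(PROGRESS_FORMAT)


def publish_progress(shm: shared_memory.SharedMemory, fraction: float) -> None:
    """Called by the computing process to record its progress."""
    struct.pack_into(PROGRESS_FORMAT, shm.buf, 0, fraction)


def read_progress(name: str) -> float:
    """Called by the display side: attach by name, read, detach."""
    viewer = shared_memory.SharedMemory(name=name)
    try:
        (fraction,) = struct.unpack_from(PROGRESS_FORMAT, viewer.buf, 0)
        return fraction
    finally:
        viewer.close()


def demo() -> float:
    """Run both roles in one process to show the round trip."""
    worker = shared_memory.SharedMemory(create=True, size=PROGRESS_SIZE)
    try:
        publish_progress(worker, 0.25)
        return read_progress(worker.name)
    finally:
        worker.close()
        worker.unlink()
```

In a real deployment the two roles would live in separate processes, and only the block's name would need to be shared between them.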
The project started out with three SPARCserver 10s but was quickly overwhelmed. "We're in a continual process of beefing that up. My advice is to figure out the size limits of things ahead of time. The database server had undocumented limits we ran up against. Think about the mechanisms for dealing with overwhelm conditions. We had 2 million programs bombarding the system with requests. We needed to have a way to throttle them."
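Anderson does not say how the project throttled its clients. One widely used technique for this kind of problem is a token bucket, sketched below in Python as an illustration only: requests are admitted while tokens remain, and tokens refill at a fixed rate. The class name, parameters, and injectable clock are assumptions for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Admit at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if a request may proceed, False if it should be refused."""
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would call allow() once per incoming request and tell refused clients to retry later; keeping one bucket per client, rather than one global bucket, is a design choice the sketch leaves open.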