The science of SETI@home
SETI (Search for Extraterrestrial Intelligence) is a scientific area whose goal is to detect intelligent life outside Earth. One approach, known as radio SETI, uses radio telescopes to listen for narrow-bandwidth radio signals from space. Such signals are not known to occur naturally, so a detection would provide evidence of extraterrestrial technology.
Radio telescope signals consist primarily of noise (from celestial sources and the receiver's electronics) and man-made signals such as TV stations, radar, and satellites. Modern radio SETI projects analyze the data digitally. More computing power enables searches to cover greater frequency ranges with more sensitivity. Radio SETI, therefore, has an insatiable appetite for computing power.
Previous radio SETI projects have used special-purpose supercomputers, located at the telescope, to do the bulk of the data analysis. In 1995, David Gedye proposed doing radio SETI using a virtual supercomputer composed of large numbers of Internet-connected computers, and he organized the SETI@home project to explore this idea. SETI@home was originally launched in May 1999.
The Problem — Mountains of Data
Most of the SETI programs in existence today, including those at UC Berkeley, build large computers that analyze the data from the telescope in real time. None of these computers look very deeply at the data for weak signals, nor do they look for a large class of signal types (which we'll discuss further on...). The reason is that they are limited by the amount of computer power available for data analysis. To tease out the weakest signals, a great amount of computer power is necessary. It would take a monstrous supercomputer to get the job done. SETI programs could never afford to build or buy that computing power. There is a trade-off that they can make. Rather than a huge computer to do the job, they could use a smaller computer but just take longer to do it. But then there would be lots of data piling up. What if they used LOTS of small computers, all working simultaneously on different parts of the analysis? Where can the SETI team possibly find the thousands of computers they'd need to analyze the data continuously streaming from Arecibo?
The UC Berkeley SETI team has discovered that there are already thousands of computers that might be available for use. Most of these computers sit around most of the time with toasters flying across their screens, accomplishing absolutely nothing and wasting electricity to boot. This is where SETI@home (and you!) come into the picture. The SETI@home project hopes to convince you to allow us to borrow your computer when you aren't using it and to help us "…search out new life and new civilizations." We'll do this with a screen saver that can go get a chunk of data from us over the internet, analyze that data, and then report the results back to us. When you need your computer back, our screen saver instantly gets out of the way and only continues its analysis when you are finished with your work.
It's an interesting and difficult task. There's so much data to analyze that it seems impossible! Fortunately, the data analysis task can be easily broken up into little pieces that can all be worked on separately and in parallel. None of the pieces depends on the other pieces. Also, there is only a finite amount of sky that can be seen from Arecibo. In the next two years the entire sky as seen from the telescope will be scanned three times. We feel that this will be enough for this project. By the time we've looked at the sky three times, there will be new telescopes, new experiments, and new approaches to SETI. We hope that you will be able to participate in them too!
Data will be recorded on high-density tapes at the Arecibo telescope in Puerto Rico, filling about one 35 Gbyte DLT tape per day. Because Arecibo does not have a high-bandwidth Internet connection, the data tape must go by snail-mail to Berkeley. The data is then divided into 0.25 Mbyte chunks (which we call "work-units"). These are sent from the SETI@home server over the Internet to people around the world to analyze.
Extra Credit Section: How the data is broken up
SETI@home looks at 2.5 MHz of data, centered at 1420 MHz. This is still too broad a spectrum to send to you for analysis, so we break this spectrum space up into 256 pieces, each 10 kHz wide (more like 9766 Hz, but we'll simplify the numbers to make calculations easier to see). This is done with a software program called the "splitter". These 10 kHz pieces are now more manageable in size. To record signals up to 10 kHz you have to record the bits at 20,000 bits per second (20 kbps). (This is called the Nyquist rate.) We send you about 107 seconds of this 10 kHz (20 kbps) data. Rounding down to 100 seconds, 100 seconds times 20,000 bits per second equals 2,000,000 bits, or about 0.25 megabyte given that there are 8 bits per byte. Again, we call this 0.25 megabyte chunk a "work-unit." We also send you lots of additional info about the work-unit, so the total comes out to about 340 kbytes of data.
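The arithmetic above can be checked in a few lines. This is only a sketch using the rounded numbers from the text; the variable names are ours, chosen for illustration.

```python
# Work-unit size arithmetic, using the rounded numbers from the text.
TOTAL_BANDWIDTH_HZ = 2_500_000   # 2.5 MHz of recorded data
NUM_PIECES = 256                 # pieces produced by the splitter

# Exact width of each piece (the text rounds this to 10 kHz).
piece_width_hz = TOTAL_BANDWIDTH_HZ / NUM_PIECES

ROUNDED_WIDTH_HZ = 10_000
bits_per_second = 2 * ROUNDED_WIDTH_HZ          # two bits per Hz of bandwidth

ROUNDED_DURATION_S = 100                        # the text rounds 107 s down to 100 s
work_unit_bits = bits_per_second * ROUNDED_DURATION_S
work_unit_bytes = work_unit_bits // 8           # 8 bits per byte

print(f"piece width: {piece_width_hz} Hz")
print(f"data rate:   {bits_per_second} bits per second")
print(f"work-unit:   {work_unit_bits} bits = {work_unit_bytes} bytes")
```

The result, 250,000 bytes, is the "about 0.25 megabyte" quoted above, before the additional information that travels with each work-unit.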
Sending You the Data
SETI@home connects only when transferring data. This occurs only when the screen saver has finished analyzing the work-unit and wants to send back the results (and get another work-unit). We'll only do this with your permission: you can control when your computer connects to us, or you can set the screen saver to transfer data automatically as soon as it's done with the current work-unit. The data transmission lasts for less than 5 minutes with most common modems and we'll disconnect immediately after all data is transferred.
We keep track of the work-units here in Berkeley with a large database. When the work-units are returned to us, they are merged back into the database and marked "done." Our computers look for a new work-unit for you to process and send it out, marking it "in progress" in the database. We send out each work unit multiple times in order to make sure that the data is processed correctly. Even properly operating computers can make mistakes from time to time. If you are unable to complete a work unit, or your computer crashes and you lose your result, don't worry. The data isn't lost.
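To make the bookkeeping concrete, here is a minimal sketch of how such tracking could work. It is not the actual Berkeley database: the class names, the number of copies sent out (`REPLICAS`), and the rule that two matching results are enough (`QUORUM`) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

REPLICAS = 3   # assumed: how many copies of each work-unit are sent out
QUORUM = 2     # assumed: how many matching results count as "processed correctly"


@dataclass
class WorkUnit:
    unit_id: str
    state: str = "unsent"                 # "unsent" -> "in progress" -> "done"
    times_sent: int = 0
    results: list = field(default_factory=list)


class WorkUnitDatabase:
    """Hypothetical in-memory stand-in for the work-unit database."""

    def __init__(self, unit_ids):
        self.units = {uid: WorkUnit(uid) for uid in unit_ids}

    def next_unit(self):
        """Hand out a work-unit that still needs copies sent, or None if there is none."""
        for unit in self.units.values():
            if unit.state != "done" and unit.times_sent < REPLICAS:
                unit.times_sent += 1
                unit.state = "in progress"
                return unit.unit_id
        return None

    def report_result(self, unit_id, result):
        """Merge a returned result; mark the unit done once QUORUM results agree."""
        unit = self.units[unit_id]
        unit.results.append(result)
        _, count = Counter(unit.results).most_common(1)[0]
        if count >= QUORUM:
            unit.state = "done"
        return unit.state


db = WorkUnitDatabase(["wu-001"])
first_copy = db.next_unit()
second_copy = db.next_unit()
state_after_one = db.report_result(first_copy, "no signal")
state_after_two = db.report_result(second_copy, "no signal")
print(state_after_one, "->", state_after_two)
```

Because several copies are in flight, one lost or crashed copy does not lose the data. A real system would also need to re-send copies that never come back, which this sketch leaves out.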
What is SETI@home Looking For?
So, what will you be doing for us? What exactly will you be looking for in the data? The easiest way to answer that question is to ask what we expect extraterrestrials to send. We expect that they would want to send us a signal in the most efficient manner for THEM that would allow US to easily detect the message. Now, it turns out that sending a message on many frequencies is not efficient. It takes lots of power. If one concentrates the power of the message into a very narrow frequency range (narrow bandwidth) the signal is easier to weed out from the background noise. This is especially important since we assume that they are far enough away that their signal will be very weak by the time it gets to us. So, we're not looking for a broadband signal (spread over many frequencies), we're looking for a very specific frequency message. The SETI@home screen saver acts like tuning your radio set to various channels, and looking at the signal strength meter. If the strength meter goes up, that gets our attention.
Another factor that helps reject local (earth-based and satellite-based) signals is that local sources are more or less constant. They maintain their intensity over time. On the other hand, the Arecibo telescope is fixed in position. When SETI@home is in operation, the telescope does not track the stars. Because of this, the sky "drifts" past the focus of the telescope. It typically takes about 12 seconds for a target to cross the focus (or "target beam") of the dish. We therefore expect an extraterrestrial signal to get louder and then softer over a 12 second period. Since we are looking for this 12 second "Gaussian" signal, we send you about 100 seconds of data. Also, we allow the data in the work-units to overlap a little so we won't miss an important signal by cutting it off early in the analysis.
Let's look at some examples!
This graph (typical of the others below) shows time progressing along the horizontal X-axis. The vertical Y-axis represents the frequency, or pitch, of the signal. Here you see a broadband signal: many frequencies all mixed together. Note that the signal starts out weak (dim) at the left, and gets louder (brighter), reaching a maximum in the center of the graph 6 seconds later, fading out over the next 6 seconds. This is what we would expect from an extra-terrestrial signal as it drifts past the telescope. Unfortunately, we are not looking for broadband sources. This is probably what a star or other natural astronomical source would look like. Broadband sources are rejected.
This is more what we're looking for. Here you can see the signal is much narrower in frequency range. It also gets stronger and weaker over a 12 second period. We don't know how narrow the bandwidth will be, so we'll check for signals at several bandwidths.
If our stellar friends are trying to put actual information on their signal (very likely), the signal will almost certainly be pulsed. We'll be looking for this too.
Because planets rotate, both the extraterrestrial transmitter and our telescope are moving in circular paths around the axis of their planet. This motion shows up as a changing relative speed during the course of the observation. As a result, there is likely to be a "Doppler shift," or change in frequency, of the signal. This might cause the signal to rise or fall in frequency slightly over the 12 seconds. These are called "chirped" signals. We'll check for this too.
Of course, we'll also be checking for a Doppler-shifted (chirped) signal that contains pulses too!
Extra Credit Section: More detail on the analysis
The SETI@home software searches for signals about 10 times weaker than the SERENDIP IV search at Arecibo, because it makes use of a computationally intensive algorithm called "coherent integration." No one else (including the SERENDIP program) has had the computing power to implement this method. Your computer performs fast Fourier transforms on the data, looking for strong signals at various combinations of frequency, bandwidth, and chirp rate. The following steps are taken on each of the work-units you get from us.
Let's look first at the most computationally intensive portion of the calculation. The first job is to "de-chirp" the data - that is, to remove all the effects of the Doppler acceleration. At the finest resolution, we have to do this a total of 20,000 times, from -10 Hz/sec to +10 Hz/sec in steps of .002 Hz/sec. At each chirp rate, the 107 seconds of data is de-chirped and then divided into 8 blocks of 13.375 seconds each. Each 13.375 second block is then examined with a bandwidth of .07 Hz for peaks (that's 131,072 tests (frequencies) per block per chirp rate!). This is a LOT of calculation! In this first step, your computer does about 200 billion calculations! We drop a few tests between 10 Hz/sec and 50 Hz/sec.
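As a toy illustration of de-chirping (not the actual SETI@home code), the sketch below builds a synthetic tone that drifts in frequency, removes a handful of candidate drift rates, and picks the rate that gives the sharpest spectral peak. The sample rate, tone frequency, drift rate, and function names are all made-up values for illustration, and a slow direct Fourier transform stands in for the fast Fourier transform to keep the example dependency-free.

```python
import cmath
import math

SAMPLE_RATE_HZ = 64.0       # toy value, far below the real data rate
NUM_SAMPLES = 128           # 2 seconds of toy data
TONE_HZ = 8.0               # starting frequency of the synthetic signal
TRUE_CHIRP_HZ_PER_S = 4.0   # drift rate built into the synthetic signal


def chirped_tone(n_samples, sample_rate_hz, tone_hz, chirp_hz_per_s):
    """Complex tone whose frequency rises linearly at chirp_hz_per_s."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        phase = 2.0 * math.pi * (tone_hz * t + 0.5 * chirp_hz_per_s * t * t)
        out.append(cmath.exp(1j * phase))
    return out


def dechirp(samples, sample_rate_hz, chirp_hz_per_s):
    """Cancel a linear drift by multiplying by the opposite quadratic phase."""
    out = []
    for n, x in enumerate(samples):
        t = n / sample_rate_hz
        out.append(x * cmath.exp(-1j * math.pi * chirp_hz_per_s * t * t))
    return out


def peak_power(samples):
    """Largest bin power of a direct DFT (O(N^2), fine for a toy example)."""
    n = len(samples)
    best = 0.0
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        best = max(best, abs(acc) ** 2)
    return best


signal = chirped_tone(NUM_SAMPLES, SAMPLE_RATE_HZ, TONE_HZ, TRUE_CHIRP_HZ_PER_S)
candidates = [-8.0, -4.0, 0.0, 4.0, 8.0]   # trial chirp rates in Hz/sec
best_chirp = max(
    candidates,
    key=lambda c: peak_power(dechirp(signal, SAMPLE_RATE_HZ, c)),
)
print("best chirp rate:", best_chirp, "Hz/sec")
```

When the trial rate matches the true drift, all of the signal power lands in one frequency bin; at any other trial rate the power is smeared over many bins and the peak is lower. The real search does the same thing over vastly more trial rates and frequencies.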
We're not finished; we still have to test other bandwidths too. The next step doubles the bandwidth to 0.15 Hz. We only have to examine 1/4 the number of rates due to the increase in bandwidth. We end up doing about 1/4 the amount of work we did above at the highest resolution narrow bandwidth, or about 50 billion calculations. Piece of cake...
The next step doubles the bandwidth again (from 0.15 to 0.3 Hz) and again reduces the chirps by 1/4. This step (and all successive steps) takes 1/4 the calculation of the previous step. In this case only 12.5 billion calculations. This continues for a total of 14 doublings of bandwidth (0.07, 0.15, 0.3, 0.6, 1.2, 2.5, 5, 10, 20, 40, 75, 150, 300, 600, and 1200 Hz) to bring you to a grand total of slightly more than 275 billion operations on the 107 seconds of data. As you can see we actually do most of our work at the narrowest bandwidth (about 70% of the work).
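The "each step is 1/4 of the previous one" pattern is a geometric series, which can be summed directly. This is a back-of-the-envelope sketch using only the rounded 200 billion starting figure from the text; the variable names are ours.

```python
FIRST_STEP_OPS = 200e9   # about 200 billion calculations at the narrowest bandwidth
NUM_BANDWIDTHS = 15      # 0.07 Hz up to 1200 Hz, i.e. 14 doublings

# Each step costs one quarter of the step before it.
ops_per_step = [FIRST_STEP_OPS / 4 ** i for i in range(NUM_BANDWIDTHS)]
total_ops = sum(ops_per_step)
narrowest_share = ops_per_step[0] / total_ops

print(f"total:           {total_ops / 1e9:.1f} billion")
print(f"narrowest share: {narrowest_share:.0%}")
```

With these rounded inputs the series comes to roughly 267 billion operations, with about three quarters of the work at the narrowest bandwidth. That is in the same range as the figures quoted above but does not reproduce them exactly, so treat the series as an approximation of the real workload.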
Finally, signals that show a strong power at some particular combination of frequency, bandwidth and chirp are subjected to a test for terrestrial interference. Only if the power rises and then falls over a 12 second period (the time it takes the telescope to pass a spot in the sky) can the signal be tentatively considered extra-terrestrial in nature. Spikes (short radio bursts) above a threshold value are also recorded.
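One simple way to picture the rise-and-fall test is to compare the measured power over time with a bell-shaped template. The sketch below uses a plain correlation score as a stand-in; it is not the fitting method the real software uses. Treating the 12 seconds as the full width at half maximum of the bell curve, the 0.8 threshold, and all function names are assumptions made for illustration.

```python
import math


def gaussian_template(times_s, center_s, width_s=12.0):
    """Bell-shaped power envelope; width_s is assumed to be the full width at half maximum."""
    sigma = width_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((t - center_s) / sigma) ** 2) for t in times_s]


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def looks_like_drifting_source(times_s, powers, threshold=0.8):
    """True if the power series matches a bell curve centered at some sample time."""
    return any(
        correlation(powers, gaussian_template(times_s, center)) >= threshold
        for center in times_s
    )


times = [float(t) for t in range(40)]                              # one sample per second
drifting = [1.0 + 5.0 * g for g in gaussian_template(times, 20.0)]  # rises, then falls
constant = [3.0] * len(times)                                       # steady local source

print(looks_like_drifting_source(times, drifting))   # True
print(looks_like_drifting_source(times, constant))   # False
```

A steady local transmitter has no rise and fall, so it scores zero against every template, while a source drifting through the beam matches the template centered on its peak.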
Finding pulsed signals
All the above work is done just to find a continuous extraterrestrial signal (one that's "always on") or an intense pulse. What if our alien friends are trying to signal us with a series of regularly spaced pulses? That means that we must also look for a signal that regularly varies in power over time. SETI@home applies two different tests. One of these tests looks for pulse triplets that are relatively strong. The other test looks for lots of equally spaced, but weak pulses.
The triplet test is pretty simple. For every frequency slice in the spectrum, the computer looks for pulses above a certain threshold value. This threshold is set so as to reveal a reasonable number of pulses without overwhelming us with noise that would cause useless calculation. For every pair of pulses above the threshold, the screen saver looks for a pulse exactly in between the two. If it finds one, the screen saver logs it and sends the data back to Berkeley. The screen saver does this intelligently, so that it does not repeat itself in trying ALL pairs. Remember that it must do this for EVERY frequency slice in your 10 kHz sample!
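The triplet idea can be sketched in a few lines for a single frequency slice. This is an illustration rather than the screen saver's actual code; the function name and the sample data are made up, and pulses are identified simply by their sample index.

```python
def find_triplets(powers, threshold):
    """Return (first, middle, last) sample indices of evenly spaced pulses above threshold."""
    hits = [i for i, p in enumerate(powers) if p > threshold]
    hit_set = set(hits)
    triplets = []
    # Visit each pair of pulses once (first index always earlier than last).
    for a in range(len(hits)):
        for b in range(a + 1, len(hits)):
            first, last = hits[a], hits[b]
            if (last - first) % 2 == 0:        # an exact midpoint sample exists
                middle = (first + last) // 2
                if middle in hit_set:
                    triplets.append((first, middle, last))
    return triplets


# Power vs. time for one frequency slice, with pulses at samples 1, 4 and 7.
slice_powers = [0, 9, 0, 0, 9, 0, 0, 9, 0, 0]
found = find_triplets(slice_powers, threshold=5)
print(found)   # [(1, 4, 7)]
```

Only samples above the threshold are ever paired up, which is why the threshold matters so much: a lower threshold means many more pairs to check.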
The second pulse detection method is a little more complicated. It was developed specially by the SETI@home team just for this purpose. It's called a "fast folding algorithm." Again we do this analysis on every frequency slice in your 10 kHz piece of data. This method was designed to detect lots of small repeating pulses in the data. These small pulses may be weak enough to seem lost in the noise and be undetectable. We start by selecting a frequency slice from the data set to examine. We now look at this power vs. time data for the pulses. The screen saver slices up the data into uniform sized time chunks and adds the chunks all together. If the size of the time chunk is the same as (or a multiple of) the period of the pulses, all the pulses will add one on top of the other and we will see the pulses grow out of the noise. The hard part here is that we must guess the correct size for the time slice. Since we have no idea what the frequency of the pulses might be, we have to try ALL the various time periods. Again, the algorithm does this in such a way that it will not repeat work already done. If repeating pulses are found, they are logged and sent back to Berkeley. How long should all these computations take? An average, current model home computer should take between 10 and 50 hours to complete one work-unit. This assumes that the computer ONLY works on SETI@home.
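The folding idea described above (slice the data into equal chunks and add them up) can be sketched as follows. Note that this is plain brute-force folding: it tries every period from scratch and does not include the work-saving tricks that make the real algorithm "fast." The function names, the peak-to-mean score, and the sample data are assumptions for illustration.

```python
def fold(powers, period):
    """Add successive chunks of `period` samples on top of each other."""
    profile = [0.0] * period
    for i, p in enumerate(powers):
        profile[i % period] += p
    return profile


def best_period(powers, min_period, max_period):
    """Try every period in the range and return the one whose fold has the sharpest peak."""
    def score(period):
        profile = fold(powers, period)
        mean = sum(profile) / period
        return max(profile) / mean if mean > 0 else 0.0
    return max(range(min_period, max_period + 1), key=score)


# Flat background of 1.0 with an extra 3.0 added every 7 samples.
PULSE_PERIOD = 7
slice_powers = [1.0] * 70
for i in range(0, len(slice_powers), PULSE_PERIOD):
    slice_powers[i] += 3.0

print(fold(slice_powers, PULSE_PERIOD))
print("best period:", best_period(slice_powers, 2, 10))
```

At the right period every pulse lands in the same bin of the folded profile and stands well above the rest; at the wrong period the pulses are scattered across the bins and the profile stays nearly flat.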
Depending on how the telescope was moving when the work unit was recorded, your computer will do between 2.4 trillion and 3.8 trillion mathematical operations (flops or floating point operations in technical jargon) to complete its work.
Now you know why we need your help!
What If My Computer Discovers E.T.? What happens?
Before we can get to the "what happens" part, we should let you know more about the "what if" part. One of the most important things to know about this data and the results of your analysis is that there are LOTS of sources of radio signals. Many are produced here on earth: TV stations, radar, and various other microwave transmitters. Satellites and many astronomical objects are also sources. There are also "test signals" that are intentionally injected into the system so the SETI@home team can confirm that the hardware and software are working properly at all points through the system. The Arecibo radio telescope will pick up all these signals and happily send them along to your screen saver. The radio telescope doesn't care about any of these signals just as your ear doesn't care about what sounds it collects. Your screen saver is going to sift through the signals looking for any source that is "louder" than the background and also rises and falls in 12 seconds - the time the telescope takes to pass over a spot in the sky.
Any signals that qualify will be sent back to the Berkeley SETI@home team for further analysis. The SETI@home team maintains a large database of known radio-frequency interference (RFI) sources. This database is constantly updated. At this point 99.9999% of all the signals that your screen saver detects will be thrown out as RFI. Test signals are also removed at this point.
Remaining unresolved signals are then checked against another observation from the same part of the sky. This could take up to 6 months since the SETI@home team does not have control of the telescope. If the signal is confirmed, the SETI@home team will request dedicated telescope time and will re-observe the most interesting candidates.
If a signal is observed two or more times, and it's not RFI or a test signal, the SETI@home team will ask another group to take a look. This other group will be using different telescopes, receivers, computers, etc. This will hopefully rule out a bug in our equipment or our computer code (or a clever student playing a prank...). Together with the other team, SETI@home will do interferometry measurements (it takes two observations separated by a big distance). This can confirm that the source of the signal is at interstellar distances.
If this is confirmed, SETI@home will make an announcement in the form of an IAU (International Astronomical Union) telegram. This is a standard way of informing the astronomical community of important discoveries. The telegram contains all of the important information (frequencies, bandwidth, location in the sky, etc.) that would be necessary for other astronomical groups to confirm the observation. The person(s) who found the signal with their screen saver would be named as one of the co-discoverers along with the others on the SETI@home team. At this point we would still be unsure if the signal was generated by an intelligent civilization or maybe some new astronomical phenomenon.
All information about the discovery will be made public, probably via the web. No country or individual would be allowed to jam the frequency the signal is observed on. Since the object will rise and set as seen from any given location, observations from radio observatories around the world will be necessary. This will, by necessity, be a multi-national effort. All this information will be made public.
Because of this protocol, it is important that participants in the SETI@home project do not get excited when they see signals on their screen and go off on their own making announcements and calling the press. This could be very damaging to the project. It's important that we keep our heads cool and our computers hot while they grind away at the data. We can all hope that we will be the one that helps receive the signal of some extraterrestrial civilization trying to "phone home."
More information about the SETI@home project can be found on the SETI@home website: http://setiathome.berkeley.edu/