Unix Pseudorandom Functions
The standard Unix C library provides two random number generators:
rand( ) and
random( ). A third random number generator,
drand48( ), is
available on some versions of Unix. Although you
won't want to use any of these routines to produce
cryptographic random numbers, we'll briefly explain
each. Then, if you need to use one of them for something else,
you'll know something about its strengths and
The original Unix random number generator,
rand( ), is not a very good random number generator. It uses a 32-bit seed and maintains a 32bit internal state. The output of the
function is also 32 bits in length, making it a simple matter to
determine the function's internal state by examining
the output. As a result,
rand( ) is not very
random. Furthermore, the low-order bits of some implementations are
not random at all, but flip back and forth between 0 and 1 according
to a regular pattern. The
rand( ) random number
generator is seeded with the function
On some versions of Unix, a third function is provided,
rand_r( ), for multithreaded applications. (The
rand( ) itself is not safe for
multithreading, as it maintains internal state.)
Do not use
rand( ), even for "trivial" programs.
random( ) function is a more sophisticated
random number generator that uses nonlinear feedback and an internal
table that is 124 bytes (992 bits) long. The function returns random
values that are 32 bits in length. All of the bits generated by
random( ) are usable.
random( ) function is adequate for
simulations and games, but should not be used for security-related
applications such as picking cryptographic keys or simulating
drand48( ), lrand48( ), and mrand48( )
drand48( ) function is one of many functions
that make up the System V random number generator. According to the
Solaris documentation, the algorithm uses "the
well-known linear congruential algorithm and 48-bit integer
arithmetic." The function
) returns a double-precision number that is greater than
or equal to 0.0 and less than 1.0, while the
mrand48( ) functions return
random numbers within a specified integer range. As with
random( ), these functions provide excellent
random numbers for simulations and games, but should not be used for
security-related applications such as picking cryptographic keys or
simulating one-time pads; linear congruential algorithms are too easy
Picking a Random Seed
Using a good random number generator is easy. Picking a random seed, on the other hand, can be quite difficult. Conceptually, picking a random number should be easy: pick something that is always different. But in practice, picking a random number--especially one that will be used as the basis of a cryptographic key--is quite difficult. The practice is difficult because many things that change all the time actually change in predictable ways.
A stunning example of a poorly chosen seed for a random number generator was revealed on the front page of the New York Times in September 1995. The problem was in Netscape Navigator, a popular program for browsing the World Wide Web. Instead of using truly random information for seeding the random number generator, Netscape's programmers used a combination of the current time of day, the process ID (PID) of the running Netscape program, and the parent process ID (PPID). Researchers at the University of California at Berkeley discovered that they could, through a process of trial and error, discover the numbers that any copy of Netscape was using and crack the encrypted messages with relative ease.
Another example of a badly chosen seed generation routine was used in Kerberos Version 4. This routine was based on the time of day XORed with other information. The XOR effectively masked out the other information and resulted in a seed of only 20 bits of unpredictable value. This reduced the key space from more than 72 quadrillion possible keys to slightly more than 1 million, thus allowing keys to be guessed in a matter of seconds. When this weakness was discovered at Purdue's COAST Laboratory, conversations with personnel at MIT revealed that they had known for years that this problem existed, but the patch had somehow never been released.
In the book Network Security, Private Communication in a Public World, Kaufman et al. identify three typical mistakes when picking random-number seeds:
- Seeding a random number generator from a limited space
If you seed your random number generator with an 8-bit number, your generator has only one of 256 possible initial seeds. You will have only 256 possible sequences of random numbers coming from the function (even if your generator has 128 bytes of internal state).
- Using a hash value of only the current time as a random seed
This practice was the problem with the Netscape security bug. The problem was that even though the Unix operating system API appears to return the current time to the nearest microsecond, most operating systems have a resolution considerably coarser--usually within one 1/60th of a second or less. As Kaufman et al. point out, if a clock has only 1/60th of a second granularity, and the intruder knows to the nearest hour when the current time was sampled, then there are only 60 × 60 × 60 = 216,000 possible values for the supposedly random seed.
- Divulging the seed value itself
In one case reported by Charlie Kaufman et al., and originally discovered by Jeff Schiller of MIT, a program used the time of day to choose a per-message encryption key. The problem in this case was that the application included the time that the message was generated in its unencrypted header of the message.
How do you pick a good random send? Here are some ideas:
Use a genuine source of randomness, such as a radioactive source, static on the FM dial, thermal noise, or something similar. Measuring the timing of hard disk drives can be another source of randomness, provided that you can access the hardware at a sufficiently low level.
Ask the user to type a set of text, and sample the time between keystrokes. If you get the same amount of time between two keystrokes, throw out the second value; the user is probably holding down a key, and the key is repeating. (This technique is used by PGP as a source of randomness for its random number generator.)
Monitor the user. Each time the user presses a key, take the time between the current keypress and the last keypress, add it to the current random number seed, and hash the result with a cryptographic hash function. You can also use mouse movements to add still more randomness.
Monitor the computer. Use readily available, constantly changing information, such as the number of virtual memory pages that have been paged in, the status of the network, and so forth. This is how /dev/random works.
RFC 1705, by Donald Eastlake, Steve Crocker, and Jeffrey Schiller, makes many observations about picking seeds for random number generators. Among them are the following:
Avoid relying on the system clock. Many system clocks are surprisingly non-random. Many clocks that claim to provide accuracy actually don't, or they don't provide good accuracy all the time.
Don't use Ethernet addresses or hardware serial numbers. Such numbers are usually "heavily structured" and have "heavily structured subfields." As a result, attackers could easily try all of the possible combinations, or guess the value based on the date of manufacture.
Beware of using information such as the time of the arrival of network packets. Such external sources of randomness could be manipulated by an adversary.
Don't use random selections from a large database (such as a CD-ROM) as a source of randomness. The reason, according to RFC 1750, is that your adversary may have access to the same database. The database may also contain unnoticed structure.
Consider using analog input devices already present on your system. For example, RFC 1750 suggests using the /dev/audio device present on some Unix workstations as a source of random numbers. The stream is further compressed to remove systematic skew. For example:
$ cat /dev/audio | compress - >random-bit-stream
RFC 1750 further advises that the microphone not be connected to the audio input jack for fear that the /dev/audio device will pick up random electrical noise. This rule may not be true on all hardware platforms. You should check your hardware with the microphone turned on and with no microphone connected to see which way gives a "better" source of random numbers. If you decide to use it without a microphone connected, you should then label the jack so that somebody does not accidentally plug a microphone into it.