by Dave Phillips
Sound is all around you. You are literally immersed in it; you live in a multidimensional 360-degree audio field. That amazing combination of ear and mind enables you to separate and identify sounds in that field with great discrimination and precision. You can hear sounds in a frequency range between 20 and 20,000 cycles per second, and you can discern those frequencies at loudness levels from a whisper to the roar of a jet engine. You can tell the difference between an oboe and a guitar, and you can identify the same note played by either instrument.
Musicians have traditionally referred to a sound's frequency (pitch), amplitude (loudness), and timbre (tone color, the characteristic sound of an instrument). The traditional context of those distinctions has been called "musical" sound, but even in the domain of sounds often classed as "noise," we are capable of keen discernment. Yet there is an essential element missing from this frequency/amplitude/timbre model of sound. That element is the very space itself, the environment in which the sound occurs.
Figure 1. Monophonic listening.
The technology of audio recording and playback began with monophonic (mono) systems (see Figure 1). Mono recording captures sounds mixed into a single audio channel, and mono playback systems render a very limited single-channel representation of sound in an acoustic space. The advent of stereo recording and playback was the first step in more accurately recreating the real audio field for listeners.
Stereo recording/playback mixes and separates sound sources into two channels, expanding the audio field and creating a more realistic placement of sounds in that field (see Figure 2). Stereo playback locates sonic events in a relatively flat field between left and right speakers, and audio trajectories and locations are plotted along the x and y axes of a two-dimensional acoustic space. Combining a sound's intensity with its x/y coordinates creates a more realistic sense of its location in the auditory field conjured by your stereo speakers. However, as noted earlier, the real auditory field is a 360-degree space, with sound not only in your front left and right areas, but above, below, and behind you as well. Common stereo lacks fully realistic acoustic presence. The height, depth, trajectory, and velocity of sound in the 360-degree auditory space are represented in stereo either poorly (by using another pair of stereo speakers) or not at all. The use of a subwoofer helps dissociate bass frequencies from a fixed location, but the movement of the sound is still relatively static.
Figure 2. Stereophonic listening.
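To make the idea of placing a sound between two speakers by intensity more concrete, here is a minimal Python sketch of one common approach, the constant-power pan law. The pan law itself, the function names, and the convention that a position of -1.0 means hard left and 1.0 means hard right are assumptions made for illustration; they do not come from any particular soundcard or API, and the sketch covers only the left-right axis.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    -1.0 is hard left, 0.0 is center, 1.0 is hard right (an assumed
    convention). The gains follow a quarter sine/cosine curve so that
    left_gain**2 + right_gain**2 stays equal to 1.0 at every position.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto an angle between 0 and pi/2.
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def pan_sample(sample, position):
    """Split one mono sample value into a (left, right) stereo pair."""
    left_gain, right_gain = constant_power_pan(position)
    return sample * left_gain, sample * right_gain
```

Sweeping the position from -1.0 to 1.0 over successive samples moves the sound from the left speaker to the right one. Nothing in this calculation can place a sound above, below, or behind the listener, which is the limitation of stereo described above.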
The stereo space can be widened by a process called 3D stereo enhancement (3DSE), an effect found on the SoundBlaster 16 and AWE32/64 soundcards. 3DSE is a signal-processing effect designed to create the illusion of an extended stereo space. It is not true 3D audio, but it can have a striking effect on common stereo systems.
Quadraphonic and surround-sound systems support four or more channels, requiring playback systems of four or more speakers (see Figure 3). These technologies would seem to resolve the issue of creating realistic sound spaces, but they have disadvantages even compared to stereo. The actual physical systems are difficult to set up, and multi-speaker systems are very sensitive to the listener's head movement (the aural illusion is destroyed when you move your head towards one or another of the speakers).
Figure 3. Quadraphonic listening.
For casual listeners, stereo works just fine, and stereo is the common audio spatialization for media as diverse as broadcast radio and television, VHS videotape, MP3 audio files, CDs and DVDs, and commercial cassette tape. However, lovers of higher quality audio and players of sophisticated contemporary games are no longer satisfied with stereo desktop digital sound. With the advent of soundcards such as Aureal's Vortex2 and the Creative Labs SBLive, the dual-speaker stereo paradigm became obsolete. These and other new cards provide independent outputs and controls for front and rear speaker systems, creating a far more realistic sense of the 360-degree audio field, and new application programming interfaces (APIs) provide ready access to features for coders who wish to exploit the new possibilities of 3D sound in games and other audio applications.
3D audio (or, to use its proper name, "positional 3D audio") is the latest technology designed to render sound in the entire audio field. It extends the four-speaker model of quad and surround sound, and it attempts to resolve the issues of cross-talk and phase cancellation that weaken those systems. A description of the means by which 3D audio works its magic is beyond the scope of this article (it involves such arcana as inter-aural difference and the head-related transfer function), and I refer the interested reader to the excellent article by Monkfish for more information regarding the technical details of positional 3D audio.
Positional 3D audio plots a sound's velocity and trajectory in full three-dimensional space. Effects can be achieved with 3D audio that are difficult or impossible with simple stereo output. Panning is no longer restricted to moving from side to side in a two-speaker system, and sounds can now go up, down, or literally around anywhere in the three-dimensional auditory space. With the appropriate soundcard and sound system, 3D audio presents some very exciting sonic possibilities. A game could include aural cues to alert you to the presence of an invisible opponent coming from your left rear corner, a composer might experiment with a sound spiralling towards the listener in a slowly-tightening loop, or the movement of images in an animation may be coordinated with sounds in unusual and novel ways such as a reverse Doppler effect with the sound decreasing in intensity and frequency as it nears the listener, reversing the effect as it leaves.
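For readers who want to see the arithmetic behind the Doppler example, the following Python sketch computes the conventional Doppler shift for a source moving toward or away from a stationary listener, along with a simple inverse-distance gain. It is only an illustration under stated assumptions: the function names, the speed-of-sound constant, and the inverse-distance model are choices made here, not the behavior of any specific 3D audio API.

```python
SPEED_OF_SOUND = 343.0  # meters per second in air at roughly 20 degrees C


def doppler_frequency(source_hz, radial_speed, speed_of_sound=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener.

    radial_speed is the source's speed along the line to the listener,
    in meters per second: positive when approaching, negative when
    receding (an assumed sign convention).
    """
    if abs(radial_speed) >= speed_of_sound:
        raise ValueError("radial_speed must be below the speed of sound")
    return source_hz * speed_of_sound / (speed_of_sound - radial_speed)


def inverse_distance_gain(distance, reference_distance=1.0):
    """Gain that falls off as 1/distance beyond the reference distance."""
    return reference_distance / max(distance, reference_distance)
```

With these definitions an approaching source sounds higher in pitch and louder, and a receding one lower and quieter. Passing the negated radial speed while the source approaches, and inverting the gain curve, would produce the reversed effect described above.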
Until recently the only 3D audio APIs have been proprietary closed-source solutions such as QSound and Microsoft's DirectSound3D, or hardware-specific interfaces such as Aureal's A3D and Creative Labs' EAX. Unfortunately for Linux users, Aureal's proprietary Linux drivers for the 8810/20/30 series do not support A3D, and at this time the company has filed for bankruptcy. Creative Labs has generously opened their driver code for the SBLive and that code is now covered by the GPL; however, their EAX 3D audio interface is a chipset-dependent API. Clearly, there is a need for a more general way to access and provide 3D audio services to applications using these and other cards. The programmers behind the OpenAL project plan to satisfy that need.