Until audio compression was invented, high-quality digital audio data took a lot of hard disk space to store. Let's go through a short example. Suppose you want to sample your favorite 1-minute song and store it on your hard disk. Because you want CD quality, you sample at 44.1 kHz, stereo, with 16 bits per sample.

A sampling rate of 44,100 Hz means that you have 44,100 values per second, per channel, coming in from your CD-ROM (or input file). Multiply that by two, because you have two channels, left and right. Multiply by two again, because you have two bytes per sample (i.e. 16 bits per sample). The song will take up 44,100 samples/s x 2 channels x 2 bytes/sample x 60 s/min = 10,584,000 bytes, or about 10 megabytes of storage space on your hard disk, for each minute!
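The storage arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `pcm_size_bytes` and its parameter names are made up for illustration, and it assumes plain uncompressed PCM with no file header.

```python
def pcm_size_bytes(sample_rate_hz, channels, bytes_per_sample, seconds):
    """Size of raw, uncompressed PCM audio in bytes (no header, no padding)."""
    return sample_rate_hz * channels * bytes_per_sample * seconds


# One minute of CD-quality audio: 44.1 kHz, stereo, 16 bits (2 bytes) per sample.
size = pcm_size_bytes(44_100, 2, 2, 60)
print(size)                          # 10584000
print(f"{size / 1_000_000:.2f} MB")  # 10.58 MB
```

Changing any one argument shows how the size scales: a mono recording, for instance, needs half the space.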

If you want to download that over the Internet with a typical 28.8 kbps modem, it would take you about 10,000,000 bytes x 8 bits/byte / (28,800 bits/s) / (60 s/min), or about 46 minutes. Just to download one minute of music!
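The download estimate can be reproduced the same way. The sketch below uses a hypothetical helper, `download_minutes`, and assumes the modem sustains its full nominal rate with no protocol overhead, so real transfers would take somewhat longer.

```python
def download_minutes(size_bytes, link_bits_per_second):
    """Idealized transfer time in minutes (assumes no protocol overhead)."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 60


# The rounded 10,000,000-byte figure over a 28.8 kbps modem.
print(round(download_minutes(10_000_000, 28_800), 1))  # 46.3

# The exact size of one minute of CD-quality audio (10,584,000 bytes).
print(round(download_minutes(10_584_000, 28_800), 1))  # 49.0
```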

Digital audio coding (also called "digital audio compression") is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding techniques (like MPEG audio, Ogg Vorbis, and AAC) take into account how we actually perceive sound: they give higher fidelity to the parts of the sound that we hear most clearly, and simplify the parts where we are less likely to notice any difference. Compression based on these techniques can represent uncompressed audio in about a twelfth of the original disk space, without any noticeable loss in fidelity. It can compress the audio even further if you are willing to put up with lower fidelity.
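To put the "about a twelfth" figure in perspective, the sketch below works out the raw data rate of CD-quality audio and what one twelfth of it comes to. The helper name `pcm_bitrate_bps` is an assumption made for this example, and the factor of 12 is simply the ratio quoted above, not a property of any particular codec.

```python
def pcm_bitrate_bps(sample_rate_hz, channels, bits_per_sample):
    """Data rate of raw, uncompressed PCM audio in bits per second."""
    return sample_rate_hz * channels * bits_per_sample


cd_bps = pcm_bitrate_bps(44_100, 2, 16)
print(cd_bps)       # 1411200
print(cd_bps / 12)  # 117600.0
```

So CD-quality audio runs at roughly 1,411 kbps uncompressed, and a twelfth of that is a little under 120 kbps.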

Such schemes are therefore the main technology for high-quality, low-bitrate applications, like soundtracks for CD-ROM games, solid-state sound memories, Internet audio, digital audio broadcasting systems, and the like.

When you deal with compression technology, you often have to specify the "bitrate". The bitrate is the amount of data used to represent each second of sound, usually given in kilobits per second (kbps). If less data represents each second of sound, the sound quality will be lower; if more data represents each second, the quality will be better.

For example, an MP3 encoded "at a bitrate of 128 kbps" (a very common setting) represents each second of sound using 128,000 bits, which sounds roughly like FM radio with good reception. The same sound encoded "at a bitrate of 64 kbps" will represent each second using half as much data, so the whole file will be half the size of the 128 kbps version, and the audio quality will be lower. It's not quite right to say that it's "half as good", since it's hard to put numbers on how good we think something sounds; but 64 kbps could be compared to the quality of a copied cassette tape.
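The relationship between bitrate and file size is simple enough to sketch in code. The function `encoded_size_bytes` below is a hypothetical helper. It assumes a constant bitrate, takes 1 kbps to mean 1,000 bits per second (as in the 128,000-bit figure above), and ignores any headers or tags a real file would carry.

```python
def encoded_size_bytes(bitrate_bps, seconds):
    """Approximate size of constant-bitrate audio, ignoring headers and tags."""
    return bitrate_bps * seconds // 8


one_minute_128 = encoded_size_bytes(128_000, 60)
one_minute_64 = encoded_size_bytes(64_000, 60)

print(one_minute_128)  # 960000
print(one_minute_64)   # 480000

# Uncompressed CD-quality audio runs at 44,100 x 2 x 16 = 1,411,200 bits/s.
print(round(1_411_200 / 128_000, 1))  # 11.0
```

One minute at 128 kbps comes to a little under 1 megabyte, about an eleventh of the uncompressed size, and the 64 kbps version is exactly half of that.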