
Internet Audio Technologies
As we learned, several companies have developed methods of reducing
a 16-bit 44.1 kHz audio file to a more manageable size for internet
distribution. A method of reducing audio file size is known as a
codec, which is short for compression/decompression.
Remember that WAVs and AIFFs are uncompressed audio.
Applying a codec to an uncompressed audio file will yield a compressed
file that is smaller in size, yet (hopefully) maintains the sonic
integrity of the original file. You might be familiar with WinZip
or StuffIt. These programs compress computer data into smaller
files that can be emailed or distributed in less time over the internet.
Unlike those programs, which restore the data exactly, the codecs
discussed here determine what information is unnecessary and throw it
away. As a result, the file size is smaller. We learned that stereo
audio is about 10.5 megs per minute.
Mono audio files will be half the size of stereo audio files since
stereo is actually a combination of two mono files! A codec shrinks
files in a similar spirit, but it does its magic by reducing bit
resolution (16 to 8 to 4 bits) and reducing sample rates (44.1 k to
32 k to 22 k to 11 k).
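To see where the 10.5 megs per minute figure comes from, here is a small Python sketch. It assumes plain uncompressed audio with no file header, and the function name is made up for this example.

```python
def bytes_per_minute(sample_rate_hz, bit_depth, channels):
    """Size of one minute of uncompressed audio, in bytes (header ignored)."""
    bits = sample_rate_hz * bit_depth * channels * 60
    return bits // 8  # 8 bits per byte


cd_stereo = bytes_per_minute(44_100, 16, 2)
cd_mono = bytes_per_minute(44_100, 16, 1)
print(f"CD stereo: {cd_stereo / 1_000_000:.2f} megs per minute")
print(f"CD mono:   {cd_mono / 1_000_000:.2f} megs per minute")

# Walk down the bit resolutions and sample rates mentioned above (mono).
for rate in (44_100, 32_000, 22_050, 11_025):
    for bits in (16, 8, 4):
        size = bytes_per_minute(rate, bits, 1) / 1_000_000
        print(f"{rate:>6} Hz, {bits:>2}-bit mono: {size:.2f} megs per minute")
```

Halving any one of the three inputs (sample rate, bit depth, or channel count) halves the result, which is why mono comes out at half the size of stereo.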
Bit resolution is an important component for the fidelity of an
audio file. Each reduction in bit resolution results in a less accurate
description of the amplitude of each sample. For example, if I asked
you to measure a wall using only full sheets of 8.5” x 11”
paper, you would be able to give me a number (say 10 sheets) that
will represent the height of the wall.
When you get to that last sheet of paper, you might find that the
wall is actually 9.5 sheets high, but the criterion is to describe
the height using whole sheets of paper, so you opt for saying 10
sheets. This is comparable to 8-bit resolution. Now remeasure that
wall with index cards. You will find that you can get much closer
to describing the actual height of the wall because your measuring
unit is smaller. This is comparable to 16-bit resolution.
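The sheets-versus-index-cards idea can be sketched in a few lines of Python. This is only an illustration, assuming a simple uniform rounding scheme on samples between -1.0 and 1.0; the function names are invented for this example, and real converters and codecs are more sophisticated.

```python
import math


def quantize(value, bits):
    """Round a sample in [-1.0, 1.0] to the nearest of 2**bits even steps."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)  # size of one "sheet of paper"
    return round((value + 1.0) / step) * step - 1.0


def max_error(bits, points=1000):
    """Worst rounding error over one cycle of a full-scale sine wave."""
    worst = 0.0
    for i in range(points):
        sample = math.sin(2 * math.pi * i / points)
        worst = max(worst, abs(sample - quantize(sample, bits)))
    return worst


for bits in (4, 8, 16):
    print(f"{bits:>2}-bit: worst error {max_error(bits):.6f}")
```

The worst-case error is never more than half a step, so the smaller the step (more bits), the closer each measurement lands to the real amplitude.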
Sample rate reduction affects the frequency response of your audio
file. Remember Nyquist? The sample rate needs to be at least twice the
highest frequency you plan to encode, and 44.1 kHz is the standard for
CD-quality audio. This means that the upper limit on the high end is
22.05 kHz, which is beyond what most people can hear.
32 k will give you a high end limit of 16 k, which is just below
what the average person can hear (of course, we lose high end response
abilities as we age). This sort of reduction in high end is almost
undetectable to the average listener. A 22 k sampling rate will
limit the high end to about 11 k. Cymbals on a drum set live in the
10 k range, so you can see that we are still at an acceptable frequency
response (maybe slightly dull), and this will be perceived as good
quality by the majority of listeners. Also notice that the sampling
frequency is now half of its original 44.1 k; therefore, the file size
is also half as large. Each reduction of these parameters yields
a smaller file size but at the cost of fidelity. The race in this
field is to provide a small file size with excellent audio quality,
which is no small task, indeed.
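The trade-off between high end and file size can be tabulated with a short Python sketch. It assumes the bit depth and channel count stay fixed, so size scales directly with sample rate; the names are made up for this example.

```python
CD_RATE_HZ = 44_100


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


for rate in (44_100, 32_000, 22_050, 11_025):
    limit_khz = nyquist_limit_hz(rate) / 1000
    relative_size = rate / CD_RATE_HZ  # same bit depth and channels assumed
    print(f"{rate:>6} Hz: high end up to {limit_khz:.3f} kHz, "
          f"file is {relative_size:.0%} of the CD-rate size")
```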
There are two types of delivery modes for the internet: download
and streaming. Any format can be “downloaded”: you can post or send
a WAV or AIFF to anyone via email. Of course, the result of downloading
a WAV or AIFF is a long connect time because the files are so big.
The person you send such a file to may not be too happy about it, but
it can be done.
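To get a feel for those connect times, here is a rough Python estimate. The 56 kbps link speed is purely an assumption for illustration (it is not from this article), and the sketch ignores protocol overhead and the file header.

```python
def download_seconds(size_bytes, link_bits_per_second):
    """Idealized transfer time: total bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second


# Three minutes of 16-bit 44.1 kHz stereo, uncompressed.
wav_bytes = 3 * 60 * 44_100 * 2 * 2
ASSUMED_LINK_BPS = 56_000  # hypothetical dial-up connection

minutes = download_seconds(wav_bytes, ASSUMED_LINK_BPS) / 60
print(f"About {minutes:.1f} minutes to download a 3 minute WAV")
```

Plug in your own connection speed to see how the wait changes.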
Some genius somewhere realized that they would be crowned the King/Queen
of Internet Delivery if audio files could be reduced in size while
the audio quality was left intact. The most common form of downloadable
audio delivery is mp3. This codec analyzes audio information and
translates it using a compression scheme of about 5:1, with almost no
detectable loss of fidelity.
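The 5:1 ratio is easy to turn into numbers. This small Python sketch uses the 10.5 megs per minute figure from earlier; the function name is made up for this example, and actual mp3 sizes depend on the encoder settings.

```python
MEGS_PER_MINUTE = 10.5  # uncompressed CD-quality stereo, from earlier


def compressed_megs(minutes, ratio):
    """Approximate size after compressing at ratio:1."""
    return minutes * MEGS_PER_MINUTE / ratio


for minutes in (1, 3, 5):
    original = minutes * MEGS_PER_MINUTE
    print(f"{minutes} min: {original:.1f} megs uncompressed, "
          f"about {compressed_megs(minutes, 5):.1f} megs at 5:1")
```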
This means, for instance, that a 3 minute music sample that was
originally about 31.5 megs could become 6.3 megs or smaller. This is
accomplished through the use of variable sample rates, variable bit
rates, and perceptual coding. To explain perceptual coding, let’s look
at a typical song: the music begins, the vocals come in, and possibly
the music continues by itself at the end of the piece.
When the voice comes in, the music drops down in level and is at
times masked by the voice itself. Codecs analyze these waveforms
and give the most bits to the voice (which is up front) and fewer
bits to the music (in the background). There is no need to encode
the music in full fidelity since it is covered by the voice most
of the time.
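Here is a toy Python sketch of that idea: a fixed budget of bits is split between the voice and the music according to how prominent each one is. This is not how an mp3 encoder is actually built (real encoders apply a psychoacoustic model to frequency bands, not to labeled "voice" and "music" parts); the function, the loudness numbers, and the bit budget are all invented for illustration.

```python
def allocate_bits(loudness, total_bits):
    """Split total_bits across parts in proportion to relative loudness."""
    total_loudness = sum(loudness.values())
    return {
        name: round(total_bits * level / total_loudness)
        for name, level in loudness.items()
    }


# Hypothetical moment in a song: the voice is up front, the music behind it.
moment = {"voice": 8, "music": 2}
budget = allocate_bits(moment, 1000)

for name, bits in budget.items():
    print(f"{name}: {bits} bits")
```

When the voice drops out and the music is alone, the same budget would go entirely to the music, which is the sense in which the bits follow whatever the listener can actually hear.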
Streaming media is the ability to see or hear content on demand
from a web site. This is a hot field in the internet world! The
main players of streaming technology are Apple’s QuickTime,
RealNetworks’ RealMedia, and Microsoft’s Windows Media Player.
These three companies have led the march to provide high quality
media streams at the lowest bit rate possible.
At one time, each company’s player would only play its own
files, but these days most players are able to decode the other
formats. Isn’t direct competition grand?
This article was made possible by Spot Taxi. www.spottaxi.com
uses the internet to traffic radio commercials using mp2 technology.