Perception Is Reality: Psychoacoustics From An Audio Engineer’s Perspective

Outer Ear
Acoustic waves enter via the ear canal, which is effectively a tube resonator. Its resonance decays so quickly that we perceive it as EQ rather than reverb, adding a hefty bump (up to 20 dB) to our hearing response in the 2 kHz to 6 kHz range.
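
The tube-resonator picture can be sanity-checked with a quick calculation. The sketch below treats the ear canal as an idealized tube closed at one end by the eardrum, so its first resonance falls at a quarter wavelength. The 2.5 cm canal length and the 343 m/s speed of sound are assumed typical values, not figures from this article, and the function name is purely illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature


def quarter_wave_resonance_hz(tube_length_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """First resonance of an idealized tube open at one end and closed at the other."""
    if tube_length_m <= 0:
        raise ValueError("tube length must be positive")
    return speed_of_sound / (4.0 * tube_length_m)


canal_length_m = 0.025  # assumed typical adult ear canal length (about 2.5 cm)
f0 = quarter_wave_resonance_hz(canal_length_m)
print(f"Estimated ear canal resonance: {f0:.0f} Hz")
```

Under these assumptions the estimate lands at about 3.4 kHz, inside the 2 kHz to 6 kHz range described above. A real ear canal is neither straight nor rigidly terminated, so treat this as a ballpark figure only.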

As sound travels toward the end of the auditory canal, it excites the tympanic membrane (eardrum). This naturally occurring transducer converts acoustic pressure into mechanical motion. The Eustachian tube acts as a port to prevent backpressure from building up behind the eardrum.

At this point, we also find our first stage of compression/limiting: the tensor tympani, a muscle attached to the malleus (the first of the middle-ear bones), tenses the tympanic membrane and dampens the transducer during high-level vibrations. This primary stage of compression makes up the first part of what audiologists refer to as the acoustic reflex, which we’ll jump back into a little later.

Middle Ear
As sound waves (now in the form of mechanical vibration) exit the eardrum, they pass through the malleus and incus, also known as the hammer and anvil, and on to the stapes. The primary purpose of these three small bones, referred to as the ossicles, is to convert mechanical energy into pressure variations in the cochlear fluid. This is a difficult task, to be sure, since we know that liquids have a much higher acoustic impedance than air.

To accomplish the necessary impedance matching, the ossicles act as a complex series of levers to convert low-pressure variations across a wide area (the eardrum) into high-pressure variations across a small area at the stapes, where the ossicles connect to the cochlea. The result is roughly 30 dB of gain compensation, ensuring that sound is delivered to the inner ear at a usable level.
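
To get a feel for where a figure of that order comes from, the sketch below models the middle ear as an idealized transformer: the same force concentrated onto a smaller area, multiplied by a modest lever advantage. The areas (55 mm² for the working part of the eardrum, 3.2 mm² for the stapes footplate) and the 1.3 lever ratio are assumed textbook-style approximations rather than numbers from this article, and the function name is illustrative.

```python
import math


def middle_ear_pressure_gain_db(eardrum_area_mm2, footplate_area_mm2, lever_ratio):
    """Pressure gain of an idealized area-plus-lever transformer, in dB."""
    if min(eardrum_area_mm2, footplate_area_mm2, lever_ratio) <= 0:
        raise ValueError("all inputs must be positive")
    # Same force on a smaller area means proportionally higher pressure.
    area_ratio = eardrum_area_mm2 / footplate_area_mm2
    # The lever action of the ossicles multiplies the force a little further.
    pressure_ratio = area_ratio * lever_ratio
    # Pressure is a field quantity, so use 20 * log10.
    return 20.0 * math.log10(pressure_ratio)


# Assumed approximate values, not taken from this article.
gain_db = middle_ear_pressure_gain_db(
    eardrum_area_mm2=55.0, footplate_area_mm2=3.2, lever_ratio=1.3
)
print(f"Estimated middle-ear pressure gain: {gain_db:.1f} dB")
```

With these inputs the estimate comes out at roughly 27 dB, in the same ballpark as the figure quoted above. Most of the gain comes from the area ratio, with the lever action adding only a couple of dB.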

The second stage of compression/limiting in our “acoustic reflex system” consists of the stapedius, a small muscle that stabilizes the ossicles during high SPL movements. Due to the stiffening action of the stapedius, which only limits larger (lower frequency) displacements, this limiting is only effective for frequencies below 2 kHz.

Cross-section of the cochlea revealing the basilar membrane and organ of Corti. (Credit: michaelsoud.wikispaces.com)

It’s also important to note that this second stage of limiting is triggered involuntarily, while the first (tensor tympani) is voluntary. The “threshold” for these two compressors can be anywhere between 70 and 105 dB SPL, and the “attack” or reaction time can range between 10 and 100 ms. Together, the two stages of the acoustic reflex system help the ear withstand SPL of up to 140 dB. Measured from the threshold of hearing, that range is equivalent to an intensity ratio of 100 trillion to one.
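
The 100 trillion figure follows directly from the definition of the decibel. As a minimal sketch (the function names are illustrative), 140 dB can be converted into both an intensity ratio and a pressure ratio:

```python
def db_to_intensity_ratio(level_db):
    """Power/intensity ratio for a level difference in dB (10 * log10 convention)."""
    return 10.0 ** (level_db / 10.0)


def db_to_pressure_ratio(level_db):
    """Sound pressure ratio for a level difference in dB (20 * log10 convention)."""
    return 10.0 ** (level_db / 20.0)


range_db = 140.0
print(f"Intensity ratio: {db_to_intensity_ratio(range_db):.0e}")
print(f"Pressure ratio:  {db_to_pressure_ratio(range_db):.0e}")
```

The intensity ratio works out to 10^14, which is 100 trillion to one, while the corresponding sound pressure ratio is 10^7, or ten million to one.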

Inner Ear
At the end of the ossicular chain we find the stapes, often referred to as the “footplate” or “stirrup” of the cochlea, which acts as a piston driving the fluid inside its two outer chambers back and forth. Sound waves travel down the length of the upper chamber toward its apex, then turn around and travel back down the lower chamber toward the base. The vibration transfers energy to the fluid-filled scala media (middle chamber), which actually contains the “A/D converters” of the signal chain.

Resting along the floor of this chamber is the organ of Corti, containing roughly 25,000 hair cells, as well as the tectorial membrane, which covers the hair cells like a flap. The basilar membrane, which forms the floor of the chamber, is specifically “tuned” to resonate at different frequencies along its length: wide and flexible at the apex for lower frequency response, and narrow and stiff at the base for higher frequencies. The hair cells also vary in size and rigidity according to the frequencies they are dedicated to.
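
The article gives no formula for this place-to-frequency tuning. One widely used empirical fit is the Greenwood function, borrowed here purely for illustration. The sketch below uses the commonly cited human constants (165.4, 2.1 and 0.88), with position expressed as a fraction of the cochlea's length measured from the apex. Treat the constants and the function name as assumptions rather than part of the original discussion.

```python
def greenwood_frequency_hz(x_from_apex):
    """Approximate characteristic frequency at a relative cochlear position.

    x_from_apex: 0.0 at the apex (low frequencies), 1.0 at the base (high).
    Uses the commonly cited human fit: f = 165.4 * (10 ** (2.1 * x) - 0.88).
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("position must be between 0.0 (apex) and 1.0 (base)")
    return 165.4 * (10.0 ** (2.1 * x_from_apex) - 0.88)


for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"position {x:.2f} from apex -> {greenwood_frequency_hz(x):8.0f} Hz")
```

The mapping runs from about 20 Hz at the apex to a little over 20 kHz at the base, and it is roughly logarithmic over most of its length, so each octave occupies a similar stretch of the cochlea.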

This is where things start to get crazy. As vibrations in the fluid set the membranes into motion, the single inner row of hair cells responds to the vibrations and transmits the information to the auditory nerve.

Organ of Corti revealing tectorial membrane and hair cells. (Credit: ssc.education.ed.ac.uk)

However, the outer rows of hair cells (usually three) have an entirely different role: they reach up to the tectorial membrane and actively adjust its motion, boosting quiet sounds and backing off during high levels. Because the hair cells are divided up into 32 frequency-specific bands, the outer cells actually function as a 32-band compressor (our third stage) to protect the inner cells during high levels in a specific range.
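
In audio terms, the multiband-compressor analogy above amounts to a static gain curve applied independently to each frequency band. The sketch below shows only that idea. The 85 dB threshold, the 4:1 ratio and the band levels are hypothetical values chosen for illustration, not measurements of the ear.

```python
def compressed_level_db(level_db, threshold_db=85.0, ratio=4.0):
    """Static compressor curve: above threshold, output rises at 1/ratio."""
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1.0")
    if level_db <= threshold_db:
        return level_db  # below threshold the band passes unchanged
    return threshold_db + (level_db - threshold_db) / ratio


def compress_bands(band_levels_db, threshold_db=85.0, ratio=4.0):
    """Apply the same static curve to each band independently."""
    return [compressed_level_db(level, threshold_db, ratio) for level in band_levels_db]


# Hypothetical per-band input levels in dB SPL; only the third band is "hot".
band_levels = [60.0, 70.0, 105.0, 80.0]
print(compress_bands(band_levels))
```

Only the band that exceeds the threshold is turned down, while the quieter bands pass through untouched. That is the behavior the analogy describes: protection in one frequency range without ducking the rest of the spectrum.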