Measurement

Tuesday, March 13, 2012

Extremes: Utilizing DSP On The “Low Lows” And “High Highs” Of Loudspeakers

The informed usage of DSP to optimize loudspeakers should be high on the list of tools that can improve your craft

Be sure to read the first two parts of this series:

Making It Flat: Analyzing Loudspeakers & DSP
Harnessing The Power Of Digital Signal Processing

————————————————-

Years ago, a couple of smart guys named Thiele and Small characterized the behavior of cone drivers.

One of their many excellent contributions was identifying how an electrical filter (passive or active) can be combined with the electro-mechanical-acoustical parameters of a cone driver, in a properly ported enclosure, to extend and optimize low-frequency output capability.  In their work they refer to this as an “alignment.”

Here we’ll show you how to create a sixth-order alignment that is similar to the original Thiele-Small B6 alignment, but with an improvement that is now possible due to the extended capabilities of modern digital signal processing – specifically the ease with which a well-designed DSP can provide high-order filter shapes.

At the root of this is that most cone driver/enclosure combinations can be “tricked” into decreasing the frequency of their LF –3 dB cutoff point (read: minus 3 dB) by about a third of an octave, with very little loss in total output capability. We say “tricked” but that’s not really true. There’s solid math behind this process.

Before proceeding, it’s important to note that the -3 dB points in the LF and HF extremes are the de facto industry standard that determines the stated frequency response of a given system (e.g., 55 Hz – 17 kHz ± 3 dB). In Thiele-Small language, the -3 dB cutoff points are referred to as F3. Be careful, though, when comparing manufacturer spec sheets! Some footnote a wider tolerance, such as ± 4 dB (or even greater), in order to make their specs look better.
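As a rough illustration of how an F3 point can be read off measured data, here is a minimal Python sketch. The data points are hypothetical (not taken from the measurements in this article), and the function name and the log-frequency interpolation are assumptions made for the example.

```python
def find_f3(points, ref_db=0.0, drop_db=3.0):
    """Return the lowest frequency where the response rises through ref_db - drop_db.

    points: list of (frequency_hz, level_db), sorted by ascending frequency.
    Interpolates linearly in log-frequency between the two bracketing points.
    """
    target = ref_db - drop_db
    for (f_lo, db_lo), (f_hi, db_hi) in zip(points, points[1:]):
        if db_lo < target <= db_hi:
            frac = (target - db_lo) / (db_hi - db_lo)
            return f_lo * (f_hi / f_lo) ** frac
    return None  # the response never crosses the target level


# Hypothetical LF measurement, in dB relative to the passband level.
measured = [(26, -27.0), (40, -9.0), (50, -3.6), (55, -2.4), (80, -0.5), (200, 0.0)]

f3 = find_f3(measured)
print(f"LF F3 is roughly {f3:.1f} Hz")
```

Changing drop_db to 4.0 shows how a footnoted ± 4 dB tolerance moves the stated cutoff lower without any change to the loudspeaker.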

The devices utilized to acquire the data we’ll be using and showing in this discussion include a Community Professional iBOX-1294 loudspeaker (12-inch cone driver and 1.4-inch compression driver) in a bi-amplified configuration, an XTA DP548 digital processor, a Bruel & Kjaer 4007 microphone, a Jensen Twin-Servo mic preamp, and a Hewlett-Packard 35665A dynamic signal analyzer (Figure 1, above left).

First Things First

Let’s quickly look at the differences in sonic quality between a gradual LF roll-off (very typical of the systems of yesteryear) and a sharper “square-shouldered” roll-off. Many older ported loudspeaker systems exhibit a long, slow LF roll-off curve due to the cone driver’s characteristics, the porting method, and the box volume (the latter is known as Vb in Thiele-Small vernacular). Sonically, the result is often a lack of a clearly defined spectrum in which the loudspeaker is intended to perform.

Put enough units of this type of loudspeaker together, and it might achieve scintillating bass from the LF acoustical addition, even well below the F3 point. However, this only occurs when there are many units present. With only one or just a few units, the LF response is likely to be muddy, and will overlap – not necessarily in a positive manner – with any subwoofers that might be present in the system (Figure 2).

Figure 2: The top trace depicts the gradual acoustical LF roll-off of an un-equalized cone driver in a moderate-sized, ported enclosure, while the bottom trace shows the direct output of the XTA DP548, verifying that no filters have yet been introduced. The horizontal scale of each grid is 0 Hz to 200 Hz, while the vertical scale is 6 dB per vertical division. Though the F3 point is 52 Hz, significant energy of about -27 dB is still present at 26 Hz, an octave lower than the F3 point.

An alternate approach to merely building and porting an enclosure is to critically tune the enclosure in conjunction with a second-order (or higher) filter, thereby creating a B6 alignment. The EQ filter shape will look like it’s producing an LF bump in the response (lower trace in Figure 3), but once it’s properly adjusted to align with the response of the driver, the enclosure, and the port tuning, the measured acoustic response (upper trace in Figure 3) reveals extended bass that’s nominally flat. 

Moreover, the LF response will fall within a more tightly defined range than the un-equalized response shown in Figure 2. The rapid roll-off below the new F3 frequency of 43 Hz results in the aforementioned “square shouldered” response curve.

Figure 3: The top trace depicts the acoustic response of a nominally flat B6 aligned 12-inch LF driver in a ported enclosure. The lower trace shows the +6 dB peaking filter integrated with a 36 dB/octave high-pass filter generated by the DP548. The two filters interact, producing >42 dB attenuation an octave below the peaking filter’s center frequency. Note: the trace scales are the same as Figure 2.

Hearing The Difference

Does 10 Hz of extended LF response make an audible difference? Yes it does. At these low frequencies, gaining another 10 Hz of response is no small thing. The sonic difference between an F3 of 52 Hz versus an F3 of 43 Hz is approximately equal to missing the lowest string on a 4-string bass guitar.
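To put the 52 Hz versus 43 Hz comparison in musical terms, the interval can be expressed in octaves. This is a quick sketch. The 41.2 Hz figure for the low E string of a bass guitar in standard tuning is an assumption added here, not a value from the measurements.

```python
import math


def octaves_between(f_high_hz, f_low_hz):
    """Interval between two frequencies, in octaves."""
    return math.log2(f_high_hz / f_low_hz)


OLD_F3_HZ = 52.0
NEW_F3_HZ = 43.0
LOW_E_HZ = 41.2  # assumed: low E string, standard tuning

extension = octaves_between(OLD_F3_HZ, NEW_F3_HZ)
gap_to_low_e = octaves_between(NEW_F3_HZ, LOW_E_HZ)

print(f"F3 extension: {extension:.3f} octave")
print(f"New F3 sits {gap_to_low_e * 12:.2f} semitone above low E")
```

The extension works out to a little over a quarter of an octave, in line with the "about a third of an octave" figure quoted earlier.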

To create the filter shape as seen in the lower trace of Figure 3, you’ll need to combine a peaking filter that’s immediately followed with a high-pass filter below the peaking frequency.

However, it’s important to note that port tuning is also a critical component for optimizing the end result. This is one of several good reasons for using PVC ports if you’re building enclosures yourself.

The port depth can easily be adjusted merely by cutting different lengths of PVC tubing. (Do this before committing to gluing the tubes into the enclosure’s baffle board.)

Not all ported loudspeakers react well to a B6 alignment, depending on the interaction of the cone driver’s parameters, the enclosure volume, and the port area and port length.

In all cases, you’ll want to measure the actual acoustic output of the system with a high-resolution FFT analyzer –  a 1/3-octave analyzer cannot provide the resolution required for this work.

And though B6 alignments can be predicted by means of various software packages, predictions almost never agree with measured results.  Elements such as enclosure shape, driver manufacturing tolerances, barometric pressure, and humidity all can affect the outcome. Even the amount and type of insulation in the enclosure can cause an iso-thermal transfer function that gives the enclosure more apparent interior volume than it actually has (more insulation = more apparent volume). Therefore, actual measurement is the only way to truly know what’s happening.

A small number of DSP devices, such as the recently introduced Community dSPEC, offer “under-damped” filters. Properly adjusted, an under-damped filter can perform both functions: that of providing the LF peak and the high-pass roll-off. But if your DSP doesn’t offer an under-damped filter, then you’ll need to use a parametric equalizer (PEQ) to generate the peak, and a high-pass filter immediately below the peak. The high-pass filter should be at least 12 dB/octave, but 24 dB/octave (technically a B8 alignment) or a higher order is recommended. 

Because the high-pass filter will have a strong effect on the actual (and measured) amplitude of the peaking filter, you’ll find that the peaking filter may have to be set to +12 dB, or even to the full extent of its range (which is usually +15 dB) in order to produce the desired +6 dB peak that forms the basis of the B6 alignment. In fact, the only reason not to use a higher order high-pass filter, such as 36 dB/octave or even 48 dB/octave, is that the peaking filter probably can’t provide enough amplitude to compensate for the effect of the adjacent high-pass filter whose upper “skirt” will be overlapping the lower skirt of the peaking filter.
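The interaction between the two filters can be estimated before reaching for the analyzer. The sketch below is only an approximation: it assumes an analog-prototype peaking filter (the form used in the widely circulated "Audio EQ Cookbook") and an ideal Butterworth high-pass, and all of the settings are hypothetical. A real DSP may define Q and filter shapes differently, so measurement remains the final word.

```python
import math


def peaking_db(f, f0, gain_db, q):
    """Gain in dB of an analog-prototype peaking filter at frequency f."""
    a = 10 ** (gain_db / 40)
    s = 1j * f / f0
    h = (s * s + s * (a / q) + 1) / (s * s + s / (a * q) + 1)
    return 20 * math.log10(abs(h))


def butterworth_hp_db(f, fc, order):
    """Gain in dB of an ideal Butterworth high-pass (order 6 = 36 dB/octave)."""
    return -10 * math.log10(1 + (fc / f) ** (2 * order))


# Hypothetical settings, chosen only for illustration.
PEAK_HZ, PEAK_GAIN_DB, PEAK_Q = 45.0, 12.0, 2.0
HP_HZ, HP_ORDER = 42.0, 6


def combined_db(f):
    """Net electrical response of the peaking filter plus the high-pass."""
    return (peaking_db(f, PEAK_HZ, PEAK_GAIN_DB, PEAK_Q)
            + butterworth_hp_db(f, HP_HZ, HP_ORDER))


for freq in (22.5, 35.0, 45.0, 60.0, 90.0):
    print(f"{freq:6.1f} Hz  PEQ {peaking_db(freq, PEAK_HZ, PEAK_GAIN_DB, PEAK_Q):+6.2f} dB"
          f"  net {combined_db(freq):+7.2f} dB")
```

Raising HP_ORDER or moving HP_HZ closer to PEAK_HZ pulls the net peak further below the nominal PEQ gain, which is why the peaking filter may need far more than +6 dB dialed in.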

Two peaking filters of the same center frequency and Q can be “ganged together” to provide greater overall amplitude, but you’ll need to be careful not to exceed internal DSP limitations, which can result in distortion or other undesirable behavior.

In most loudspeakers, a B6 (or higher order) alignment usually can be implemented with stable driver behavior and improved sonic quality. However, you’ll want to thoroughly test the results under low levels, moderate levels, and very high levels to be sure you’re obtaining a net improvement. If the filters are in the right place, the result will always reflect a lower F3 cutoff point, and usually a marked improvement in LF definition. The newly aligned loudspeaker will give you a well-defined bandwidth, making it easier to integrate it into a multi-zone system, or into an array formation.

Filters & Dynamics

While protective limiters are an important element in both avoiding catastrophic failure and enhancing the longevity of loudspeakers, the first line of defense should always be the use of low-pass and high-pass filters. Using a B6 (or higher) alignment inherently applies a sharp high-pass filter. While the high-pass may only be second-order (12 dB/octave), because the high-pass is stacked on the low side of the adjacent peaking filter, it will measure at a rate of about 18 dB/octave. If you use a higher order high-pass filter, the rejection of low frequency energy to the driver will be even greater.

The net effect is that the cone driver will be working well within its range of efficient reproduction, rather than trying to reproduce frequencies that are lower than practical. This also means that the amplifier is not wasting power trying to reproduce LF information that’s not going to transfer efficiently into acoustic energy, or even be perceptible. Everyone wins.

An additional protection possibility is to assign a “dynamic protective limiter” to the B6 alignment LF peaking filter. If your DSP unit permits this (and the well-designed XTA DP548 indeed does), then, as the driver begins to approach its excursion limit, the protective limiter will reduce the amplitude of the B6 peaking filter.

In actual practice, it’s hard to determine if this form of driver protection is engaged or not. When the threshold is properly set, the cone driver will be on the verge of becoming non-linear anyway, so given the lesser of two evils, a soft reduction in the peaking filter amplitude will provide a very subtle means of keeping the driver healthy, while not being obvious to the listener.

Dynamic Amplitude Control

In addition to utilizing the high-pass and low-pass filters that we’ve discussed above, most modern DSP devices can be adjusted in fine increments to provide one (or more) types of amplitude compression and/or limiting, thereby providing the user with an additional means of protecting their loudspeakers from abusive conditions.

While on paper this looks promising, the truth is that it’s extremely difficult to set the thresholds, time constants (attack and release), and the gain-reduction ratios.

You can try to relate an amplifier’s maximum output voltage to your driver’s power handling capability based solely on numbers; however, this involves numerous complexities that go far beyond the scope of this article.

Given the many topologies of today’s modern power amplifiers (e.g., Class A, Class A/B, Class H, Class D, Hybrid Class D, and so on), it would be challenging indeed to relate the various time constants that a specific manufacturer used to state the measured power output of a given amplifier to the stated power handling of a given driver.

In short, today’s amplifiers may behave very, very differently from one another when their burst power, short-term power, mid-term power, and steady-state long-term power capabilities are carefully measured.

A few loudspeaker manufacturers have conscientiously addressed these issues by developing their protection schemes through extensive empirical work, such as destruction testing of many drivers. In the final analysis, you’ve got to crack a few eggs (at least!), in order to arrive at protection parameters that are neither too loose nor too tight, under the wide range of usages that a general purpose system might encounter.

The simplest advice is to use DSP limiter functions to keep amplifiers out of clipping (at least most of the time), and avoid any obvious transition points in the applied power level that will cause the system to bottom out, become obviously distorted, or otherwise generate unpleasant audible artifacts. 

For long-term driver protection, it can be useful to simply set a long-term limiter (if your DSP is so equipped) to an arbitrary -3 dB gain reduction when the system is driven to a high percentage of its stated maximum level for a long enough period of time (30 minutes or longer is a good rule of thumb). While grossly imprecise when compared to the methodical destructive testing of a batch of drivers, it’s at least a starting point.

Moving To The High End

At the opposite end of the spectrum – high frequencies – it makes nothing but good sense to look at the HF response of the horn and driver combination on a high-resolution spectrum analyzer, and then sharply roll off the energy that’s feeding the compression driver as it enters its break-up mode.

Here, a 24 dB/octave (or higher) low-pass filter will be a good choice. While the filter will introduce some phase shift in the upper end of the audible spectrum, the driver has already deviated from a flat phase response as it transitions from its pistonic mode to its breakup mode. (For more about this, see Harnessing The Power Of Digital Signal Processing.)

Implementing a low-pass filter just before HF driver break-up actually reduces HF distortion. This is because the energy that’s emitting from the driver above its breakup frequency consists mainly of harmonic distortion and, worse, non-harmonically related distortion (almost like bees buzzing), neither of which is desirable for accurate sonic reproduction.

When measuring the phase shift caused by the 24 dB/octave (or higher) low-pass filter, most likely set in the 15 kHz to 18 kHz range, you might well be concerned that the phase shift introduced by the filter could cause less than optimal results.

There are two things to consider here:

1) The phase shift that would be introduced by the breakup mode of the driver will almost always be far worse.

2) The effect of the phase shift can be decreased, possibly to a very small level, by simply employing the adjustable phase filters and/or all-pass filters on the DSP.
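To get a feel for how much phase shift such a low-pass filter introduces, the sketch below computes the phase of an ideal Butterworth low-pass from its poles. The Butterworth shape and the 16 kHz corner (picked from within the 15 kHz to 18 kHz range mentioned above) are assumptions for illustration. The filter in a given DSP may be a different type.

```python
import cmath
import math


def butterworth_lp_phase_deg(f, fc, order):
    """Unwrapped phase (degrees) of an ideal Butterworth low-pass at frequency f."""
    omega = f / fc
    total = 0.0
    for k in range(1, order + 1):
        # Normalized left-half-plane pole on the unit circle.
        pole = cmath.exp(1j * math.pi * (2 * k + order - 1) / (2 * order))
        total -= cmath.phase(1j * omega - pole)
    return math.degrees(total)


def phase_to_delay_us(phase_deg, f):
    """Equivalent delay in microseconds for a phase lag at frequency f."""
    return -phase_deg / 360.0 / f * 1e6


CORNER_HZ = 16000.0  # hypothetical corner frequency
ORDER = 4            # 24 dB/octave

for freq in (2000.0, 4000.0, 8000.0, 16000.0):
    phase = butterworth_lp_phase_deg(freq, CORNER_HZ, ORDER)
    print(f"{freq:7.0f} Hz  {phase:8.1f} deg  {phase_to_delay_us(phase, freq):6.1f} us")
```

At the corner frequency a fourth-order Butterworth lags by 180 degrees (45 degrees per pole), and the lag shrinks steadily at lower frequencies.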

Is the result of using all-pass and phase filters audible? Yes, in quiet, controlled conditions while listening to acoustic instruments, albeit subtly.

In large-scale sound reinforcement conditions with very loud rock or electronic music, improvements will most likely be inaudible to engineers and patrons alike.

Phasing Out Anomalies

Let’s take a closer look at how equalizing a frequency response anomaly can actually improve the phase response, in addition to fixing the amplitude response. 

In Figure 4 we see how the loudspeaker’s amplitude-versus-frequency response (upper trace) is largely flat from 43 Hz to 400 Hz. However, the loudspeaker exhibits one prominent dip centered at 175 Hz, followed by a reciprocal peak that starts at 210 Hz and doesn’t settle out until about 290 Hz. This is very typical of 12-inch cone drivers in moderate-sized enclosures.

Figure 4: The top trace depicts the acoustic response of the cone driver showing a -4 dB dip at 175 Hz followed by a broader +4 dB boost centered at about 210 Hz. The lower trace shows the corresponding phase response. The horizontal scale ranges from 0 Hz to 400 Hz, while the vertical is scaled at 6 dB per major division.

It’s also an important part of the frequency spectrum that covers baritone vocals and the fundamentals of many instruments. Notice how the disruption in the phase versus frequency response (lower trace) perfectly mirrors the anomalies in the amplitude versus frequency response (upper trace).

By the way, note that the numerous phase “wraps” that occur at about 60 Hz in the LF section and dissolve into something akin to “phase noise floor” as the frequency decreases are the result of the phase-inversion that occurs in any ported enclosure. This is not to be feared! Ported enclosures can sound very, very good when ported properly and “steered” with wise choices on the front-end signal processing. It’s hard to imagine a non-ported pro audio high-power loudspeaker design, though some have been tried in the past, such as the Meyer-influenced “System 80” used by McCune and FM Productions in the late 1970s.

In Figure 5, we’ve fixed the amplitude response anomaly with a simple pair of parametric EQ filters (one cut and one boost). Lo and behold, the phase response has been fixed as well! When one listens to compare the sonic output, with and without these two filters, the difference is profoundly slanted towards the much better sounding “fixed” version. 

Figure 5: The top trace depicts the acoustic response after the application of the two parametric EQ filters mentioned in the text, while the bottom trace reflects the improved phase response. The H and V scaling is the same as in Figure 4.

Although the phase response could still benefit from additional filters to flatten it throughout its operating range (Figure 6), it now follows a gradual trajectory instead of exhibiting a distinct ripple, as it did prior to the introduction of the parametric EQ filters. 

Figure 6: The top trace depicts the acoustic response after the application of the two PEQ filters as in Figure 5, while the bottom trace shows the electrical response of the two PEQ filters, right out of the DP548. The overall system response can be further flattened by adding additional filters throughout the loudspeaker’s operating range.

Though a great many factors determine how a given system will sound, the informed usage of DSP to optimize loudspeakers should be high on the list of tools that can improve your craft.

Ken DeLoria is senior technical editor for Live Sound International and has had a diverse career in pro audio over more than 30 years, including being the founder and owner of Apogee Sound.

Posted by Keith Clark on 03/13 at 04:36 PM

Rational Acoustics Appoints Chris Tsanjoures As Application Support Specialist

Rational Acoustics has announced the addition of Chris Tsanjoures in the role of application support specialist, where he will oversee all application support functions, including the creation and dissemination of customer support materials regarding application and use of the company’s products.

Tsanjoures will also provide application-based customer phone, e-mail and web support, work with partner companies to create integrated Smaart Measurement Technology-based solutions, and provide tradeshow support and external training support.

Tsanjoures holds a BA in Music Production & Technology from the Hartt School of Music at the University of Hartford. He also works as a freelance sound engineer throughout New England, including at Foxwoods Casino and Resort. 

In his free time he records and produces local musicians, including a current project of engineering, producing and mixing an LP for the Amherst, MA-based alt rock band, Brotherhood of Thieves.  Before settling down and deciding to get a day-job, Tsanjoures spent the past two years on tour throughout the US and Canada as the lead guitar player for the Portland, OR-based melodic metalcore band, It Prevails.

Apart from being “stoked to be part of Rational Acoustics,” Tsanjoures would like the industry to know that “Star Trek: The Next Generation is, in my opinion, the best incarnation of the franchise. Jean-Luc Picard is the ultimate captain.”

“This is a really exciting time for Rational Acoustics,” says Karen Anderson, Rational Acoustics COO.  “We are expanding our product offerings and our technology solutions and with that growth comes the ability to add incredibly talented people like Chris to our team. He’s not only a highly skilled live sound technician and recording engineer, but also a passionate musician who brings a new perspective and skill-set to the support of the Smaart user community. And he’s totally right about the Star Trek thing.”

Tsanjoures is based at Rational Acoustics’ Putnam, CT headquarters.

Rational Acoustics

Posted by Keith Clark on 03/13 at 11:19 AM

Monday, March 12, 2012

Proper Signal (Time) Alignment - How To Get Loudspeaker Drivers In Sync

Understanding both filters and phase to properly signal-align HF and LF drivers

Since the early 1980s, the term “time alignment” has been tossed around pretty freely, and with dubious degrees of accuracy.

Note, however, that this was far from the first time the concept was noticed. Indeed, the engineers who provided the loudspeakers for the first “talkie” film, Al Jolson’s The Jazz Singer, noticed that the “taps” of the tap dancing scenes came out of the high-frequency (HF) horn and folded-horn low-frequency (LF) woofer at different times.

Ever since, designers have been trying to time-align loudspeakers. The phrases “time align,” “time aligned,” and “time alignment” are trademarks of E. M. Long, the inventor of the famous UREI 813 monitor loudspeaker used in recording studios. Thus for purposes of this discussion, we’ll use the generic term “signal alignment” to avoid having to use those ®’s and ™’s.

Most folks believe that signal alignment between drivers in a loudspeaker cabinet is a matter of measuring the difference in distance to the front of the cabinet from each of the drivers’ voice coils. Then, by adding delay to the driver closest to the cabinet front (delay that corresponds to the difference), the signals of all drivers will be aligned properly.

However, this is not correct! We must understand both filters and phase to properly signal-align HF and LF drivers.

All filters “rotate” phase, causing a positive “phase-shift” to the frequencies that pass through them. Because 360 degrees of phase-shift equals one wavelength, and wavelength can be described in terms of distance or time, any phase-shift at a given frequency can be described as a signal delay of an exact length of time.

For example, because 1000 Hz is 1000 cycles per second, one cycle lasts 1/1000th of a second, or 1 millisecond (ms). Therefore, 360 degrees of phase-shift at 1 kHz is 1 ms of delay. Then 180 degrees of phase-shift (1/2 wavelength) is 0.5 ms of delay and 90 degrees (1/4 wavelength) is 0.25 ms of delay at 1 kHz. At 2 kHz, each cycle is half as long, so the same phase-shifts correspond to half as much delay. At 20 Hz, 180 degrees of phase-shift (1/2 wavelength) is 25 ms, or 28.25 feet of delay at the speed of sound.
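The arithmetic above can be wrapped in a couple of small helper functions. This is just a sketch. The function names are made up for the example, and the speed of sound of 1130 feet per second is an assumption implied by the 25 ms and 28.25 feet figures.

```python
SPEED_OF_SOUND_FT_PER_S = 1130.0  # assumed; implied by 25 ms = 28.25 feet


def phase_shift_delay_ms(freq_hz, phase_deg):
    """Delay in milliseconds equivalent to phase_deg of shift at freq_hz."""
    period_ms = 1000.0 / freq_hz
    return period_ms * phase_deg / 360.0


def delay_to_feet(delay_ms):
    """Distance sound travels in delay_ms, in feet."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_FT_PER_S


for freq_hz, phase_deg in ((1000, 360), (1000, 180), (1000, 90), (2000, 180), (20, 180)):
    delay_ms = phase_shift_delay_ms(freq_hz, phase_deg)
    print(f"{phase_deg:3d} deg at {freq_hz:5d} Hz = {delay_ms:6.3f} ms"
          f" = {delay_to_feet(delay_ms):6.2f} ft")
```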

GETTING TO THE POINT

Where am I going with this? All crossover and equalization filters are electronic filters that impart phase shift/delay to any signal passing through them. Likewise, all loudspeakers are acoustic filters that also impart signal delay.

So in order to signal-align a woofer and horn driver (or tweeter), we must offset not only the physical difference in distance from the drivers to the cabinet front, but we must also offset the filter phase-shift delay of the crossover, the post-crossover equalization filters exclusive to each driver, and the loudspeakers as acoustic filters. Pre-crossover equalization filters are not considered because they impart the same delay to both drivers.

So let’s put all this newfound knowledge to work and signal-align a two-way loudspeaker system comprised of a 12-inch woofer (LF section) and a 90-degree by 40-degree horn/compression driver (HF section).

Before beginning, however, make sure that both drivers are in absolute polarity, or at least in relative polarity to each other. This can be done by checking the wiring, or using a polarity checker without any EQ or crossover filtering on either driver, or by checking the impulse response for a first positive swing with a measurement system.

Figure 1 shows the individual frequency responses of both the LF and HF sections with the measurement mic directly on-axis, halfway between the drivers’ centers at a distance five times the woofer’s diameter. Note that I have equalized each section flat, past the intended crossover frequency, before adding 24 dB/octave (4-pole) Linkwitz-Riley (L-R) crossover filters.

Figure 1: Individual LF & HF frequency responses with 24 dB/octave L-R crossover filters at 1 kHz.

I find that equalizing the drivers first, using post-crossover filters exclusive to each driver, provides the smoothest frequency response through the crossover region once their responses are combined. It also enables the crossover filters to combine much closer to their theoretical ideal.

Also note that where the response curves intersect is the acoustical crossover frequency and, for signal alignment, this point should be 6 dB down for a 4-pole filter. To get this, one must make sure the levels of each driver are the same and then play with the electronic crossover frequencies until the acoustic results are what is desired.
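The 6 dB figure follows from the shape of the filter itself. A 4-pole Linkwitz-Riley filter is two cascaded 2-pole Butterworth filters, each 3 dB down at the corner. The sketch below covers only the ideal electrical filters, so it ignores the drivers and EQ that shift the acoustic result, and the function names are made up for the example.

```python
import math


def lr4_lowpass_db(f, fc):
    """Magnitude in dB of an ideal 24 dB/octave Linkwitz-Riley low-pass."""
    return -20 * math.log10(1 + (f / fc) ** 4)


def lr4_highpass_db(f, fc):
    """Magnitude in dB of an ideal 24 dB/octave Linkwitz-Riley high-pass."""
    return -20 * math.log10(1 + (fc / f) ** 4)


def lr4_sum_db(f, fc):
    """Level of the summed outputs, which are in phase for ideal L-R filters."""
    low = 10 ** (lr4_lowpass_db(f, fc) / 20)
    high = 10 ** (lr4_highpass_db(f, fc) / 20)
    return 20 * math.log10(low + high)


CROSSOVER_HZ = 1000.0

for freq in (500.0, 1000.0, 2000.0):
    print(f"{freq:6.0f} Hz  LP {lr4_lowpass_db(freq, CROSSOVER_HZ):7.2f} dB"
          f"  HP {lr4_highpass_db(freq, CROSSOVER_HZ):7.2f} dB"
          f"  sum {lr4_sum_db(freq, CROSSOVER_HZ):+5.2f} dB")
```

Both sections sit about 6 dB down at the crossover, and because their outputs are in phase, the two half-amplitude signals add back to a flat sum.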

In this case, I wanted a 1 kHz crossover, and to get it, both drivers ended up with a 950 Hz electronic crossover. Remember, the electronic cross-over frequency is in series with, and modified by, the EQ filters and acoustic filters (read loudspeakers) that produce the acoustic result that really counts.

GOING FURTHER

Figure 2 shows the combined response of both drivers superimposed over the individual responses. Note the cancellation at crossover with a slight addition at 600 Hz. The 11 dB dip indicates the need for signal alignment of these drivers because they are reproducing the same frequency out-of-phase, and thus cancel each other’s output. Equalization cannot fix this because it will affect both drivers equally and the cancellation will still occur.

Figure 2: Combined response of both drivers with an 11 dB dip at the crossover frequency.
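How far out of phase do two drivers have to be to produce a dip like this? The sketch below gives a rough answer under strong assumptions: both drivers are taken to be at exactly the same level at crossover, and the dip is taken relative to a perfect in-phase sum. Neither is guaranteed for the measurement shown, so treat the result as an order-of-magnitude guide only.

```python
import cmath
import math


def summed_level_db(phase_offset_deg):
    """Level of two equal signals summed with a phase offset, in dB re the in-phase sum."""
    total = 1 + cmath.exp(1j * math.radians(phase_offset_deg))
    magnitude = abs(total) / 2
    return 20 * math.log10(magnitude) if magnitude > 0 else float("-inf")


def offset_for_dip_deg(dip_db):
    """Phase offset (degrees) that produces a dip of dip_db under the same assumptions."""
    return 2 * math.degrees(math.acos(10 ** (-dip_db / 20)))


for offset in (0, 90, 120, 150, 170):
    print(f"{offset:3d} deg offset -> {summed_level_db(offset):7.2f} dB")

print(f"An 11 dB dip corresponds to roughly {offset_for_dip_deg(11.0):.0f} degrees")
```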


Figure 3 adds the phase curve of the combined response. Note the abrupt change in the slope of the phase curve at the crossover. This also indicates the driver misalignment that causes the dip in the response.

Figure 3: Combined response of both drivers with the phase curve showing an abrupt slope change at the crossover misalignment.

At this point, most who perform signal alignment would simply begin adding delay to the closest driver and watching the phase curve until its slope is as straight (straight, not flat) as possible. If you only have an RTA and can’t measure phase, then you’re out of luck. This can also be a rather tedious task because the last several delay steps to either side of optimum alignment can look almost the same.

This might not matter from a frequency-response point of view, but this alignment also determines the aiming of the on-axis lobe at the crossover frequency. To get the lobe exactly perpendicular to the cabinet face, it’s best to attain the best alignment setting at the measurement microphone position.

The easiest method to find this exact alignment setting can also be used with an RTA (real-time analyzer).

Reverse the polarity of the HF driver (polarity, not phase). Then start increasing the delay to the closest driver; in this case, it’s the woofer.

Look for the maximum cancellation at crossover. Unlike the straight phase slope method, it will be very easy to determine the delay step with the maximum null. It will be a 30 to 40 dB deep dip. The dip, even one step under or over the optimum delay, will be smaller by several dB.

GETTING LUCKY

Figure 4 compares the combined response with the HF, both in and out of polarity. Just by luck, the reversed-polarity response looks very flat.

One might be tempted to stop here and use the system as is. And before the advent of DSP (digital signal processing), that is exactly what was often done. Passive crossover networks internal to loudspeaker systems are often 12 dB/octave (2-pole) crossovers.

Figure 4: Combined response with both drivers in polarity (dip) compared to combined response with HF driver in reversed polarity (flat).

A 2-pole crossover produces a 3 dB roll-off at crossover and the drivers are 180 degrees apart in phase. Reversing the polarity of the HF section puts them in phase with a 3 dB bump at crossover. Many loudspeakers with passive crossovers are designed this way.

The important question at this point: can you hear the difference between absolute polarity and reversed polarity signals? The short answer is that if the signal is a very asymmetric waveform, you can, and if it’s a very symmetric waveform, you can’t.

So unless you listen to nothing but flute solos, you’ll want to take advantage of modern DSP capability to provide optimized crossovers with both drivers in proper polarity. Note that Figure 5 shows the slope of the phase of the reversed HF combined response breaking subtly at the crossover frequency, indicating some misalignment.

Figure 5: Combined response with HF polarity reversed. Note the slight break of the phase curve slope at crossover.

Figure 6 graphically illustrates the process of finding the null at crossover with the HF driver polarity reversed. The dip at crossover is 37 dB deep at the optimum LF delay of 0.417 ms. Note that it is 10 dB deeper than the next closest delay step of 0.396 ms.

Figure 6: Finding the deepest null with the HF driver polarity reversed.
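The null-hunting procedure can be simulated with a very simple model. Everything in the sketch below is an assumption made for illustration: the required offset is set to the 0.417 ms optimum quoted above, the delay step is taken to be one sample at 48 kHz (which happens to match the spacing between 0.396 ms and 0.417 ms), and the HF level mismatch is invented. Because the model ignores the real filters and drivers, its null depths will not match the measured ones.

```python
import cmath
import math

CROSSOVER_HZ = 1000.0
STEP_S = 1 / 48000.0       # assumed delay resolution: one sample at 48 kHz
TRUE_OFFSET_S = 0.417e-3   # hypothetical offset the LF section needs
HF_LEVEL = 0.97            # hypothetical slight level mismatch


def null_depth_db(lf_delay_s):
    """Level at crossover with HF polarity reversed, in dB re the in-phase sum."""
    error_s = lf_delay_s - TRUE_OFFSET_S
    lf = cmath.exp(-2j * math.pi * CROSSOVER_HZ * error_s)
    hf = -HF_LEVEL  # polarity reversed
    return 20 * math.log10(abs(lf + hf) / (1 + HF_LEVEL))


steps = range(0, 41)
best_step = min(steps, key=lambda n: null_depth_db(n * STEP_S))

for n in (best_step - 1, best_step, best_step + 1):
    print(f"LF delay {n * STEP_S * 1000:.3f} ms -> {null_depth_db(n * STEP_S):6.1f} dB")
```

Even in this idealized model, the deepest null stands out from its neighbors by many dB, which is what makes the method so much easier to read than a phase trace.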

Figure 7 depicts the phase curve of the deepest null. It’s a perfectly vertical line, indicative of being right at 180 degrees out-of-phase.

Figure 7: The phase slope of the deepest cancellation null is a perfectly vertical line, indicating exactly 180 degrees out-of-phase.

Once you’ve found the delay step that produces the deepest null with the HF driver polarity reversed, simply put the HF driver back in proper polarity. Your system is now in proper signal alignment.

Figure 8 is the final result. Compared to the reversed HF response of Figure 5, the phase curve slope is straighter through the crossover region, and there is no longer a slight HF cancellation dip in the woofer’s response around 600 Hz.

Figure 8: Final signal alignment with HF in proper polarity.

If you have a measurement system that measures phase, be sure to confirm that the final resulting phase slope is a straight line. This ensures not being one cycle off in either direction by delaying the wrong driver, or by delaying the right driver 360 degrees too much, or too little at short wavelength crossover frequencies. The frequency response results could look the same. One should be particularly careful about this if using an RTA with no phase measurement capability for confirmation.

And by all means, have fun!

John Murray is a 30-plus-year pro audio industry veteran, working for EV, MediaMatrix and TOA. He has presented two AES papers, chaired three SynAudCon workshops and is a member of the TEF Advisory Committee and ICIA adjunct faculty.

Posted by Keith Clark on 03/12 at 06:02 PM

Wednesday, March 07, 2012

Hear The Soundtrack Of A Super-Quake (Includes Audio & Video)

An interesting “audio related” item we ran across on YouTube today:

“This recording of the 2011 Japanese earthquake was taken near the coastline of Japan between Fukushima Daiichi (the nuclear reactor site) and Tokyo. The initial blast of sound is the 9.0 mainshock.

“As the earth’s plates slipped dozens of meters into new positions, aftershocks occurred. They are indicated by “pop” noises immediately following the mainshock sound. These plate adjustments will likely continue for years.

“Georgia Tech Associate Professor Zhigang Peng has converted the seismic waves from last year’s earthquakes into audio files. The results allow experts and general audiences to “hear” what the quake sounded like as it moved through the earth and around the globe.”

By the way, for more information and audio clips, go to the Georgia Tech website here.

Posted by Keith Clark on 03/07 at 05:39 PM

Tuesday, March 06, 2012

+/-3 dB or -6 dB: What’s the Difference?

The meaning of both specs and a basis for comparing loudspeakers

The terms +/-3 dB and -6 dB are frequently (and erroneously) used interchangeably to characterize the frequency response of a loudspeaker system.

This has led to understandable confusion among consumers who may believe that a +/-3 dB specification is more rigorous than a -6 dB specification.

The purpose of this document is to explain the meaning of both specifications as they are commonly used (or misused) in pro audio today, and to provide a basis for comparing loudspeakers with differing stated specifications.

The term “+/-3 dB” originally expressed flatness – not high and low frequency extension. One could say, for example, that “my speakers are flat – within +/-3 dB between 110 Hz and 18 kHz.” This means that between those two frequencies, a frequency response graph of these speakers would not deviate by more than 3 dB in either direction from a straight line.

Figure 1 illustrates such a window superimposed on the frequency response curve of a small-format loudspeaker. Note that the response never rises above the window and falls below the window at 110 Hz and 18 kHz. This is a correct usage of the term +/- 3 dB. It is not how loudspeaker manufacturers use the term now.

Figure 1

The term “-6 dB” is meaningless without being referenced to something. And that something is the sensitivity of the loudspeaker which is typically expressed as “xx dB SPL, 1 watt @ 1 meter.”

Figure 2 shows the same loudspeaker represented in Figure 1. This loudspeaker has a sensitivity of 85 dB SPL, 1 watt @ 1 meter. The Frequency Response (-6 dB) figures are the points where the response curve crosses below 79 dB. So the Frequency Response (-6 dB) of this loudspeaker is 73 Hz – 20 kHz.

Figure 2
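The -6 dB reading described above can be sketched in a few lines. The function names and the data points are made up for illustration (they are not the measured curve in Figure 2); the sketch simply finds where a sampled response crosses the level 6 dB below the stated sensitivity, interpolating on a log-frequency axis.

```python
import math

def _crossing(f1, l1, f2, l2, threshold):
    """Frequency between two points where the level equals threshold,
    interpolating linearly in level versus log-frequency."""
    t = (threshold - l1) / (l2 - l1)
    return 10 ** (math.log10(f1) + t * (math.log10(f2) - math.log10(f1)))

def response_limits(freqs_hz, levels_db, sensitivity_db, drop_db=6.0):
    """Return (f_low, f_high) where the curve falls drop_db below sensitivity."""
    threshold = sensitivity_db - drop_db
    above = [i for i, lv in enumerate(levels_db) if lv >= threshold]
    if not above:
        return None
    lo, hi = above[0], above[-1]
    f_low, f_high = freqs_hz[lo], freqs_hz[hi]
    if lo > 0:
        f_low = _crossing(freqs_hz[lo - 1], levels_db[lo - 1],
                          freqs_hz[lo], levels_db[lo], threshold)
    if hi < len(freqs_hz) - 1:
        f_high = _crossing(freqs_hz[hi], levels_db[hi],
                           freqs_hz[hi + 1], levels_db[hi + 1], threshold)
    return f_low, f_high

# Hypothetical response points for a loudspeaker rated at 85 dB SPL
freqs = [50, 63, 80, 100, 1000, 10000, 16000, 20000, 25000]
levels = [70, 76, 80, 84, 85, 85, 83, 80, 72]
low, high = response_limits(freqs, levels, sensitivity_db=85.0)
print(f"Frequency Response (-6 dB): {low:.0f} Hz to {high:.0f} Hz")
```

Passing `drop_db=3.0` or `drop_db=10.0` gives the -3 dB and -10 dB figures from the same curve, which makes the reference level explicit in each case.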

If +/-3 dB is an expression of flatness – not frequency response – what does it mean when used to characterize frequency response? To answer that question let’s look at the published specifications of a typical loudspeaker made by a manufacturer who labels Frequency Response as “+/-3 dB.”

The loudspeaker is specified as:

• Sensitivity (1 watt @ 1 meter) 99 dB SPL
• Frequency Response (+/-3 dB) 50 Hz – 20 kHz

Is the manufacturer trying to express the flatness of the loudspeaker? Figure 3 shows a +/-3 dB “flatness” window laid over the frequency curve so that the broadest possible range falls inside the window. In this instance, the curve crosses below the -3 dB line at 59 Hz and at 2.8 kHz. So clearly this loudspeaker is not flat (within +/-3 dB) from 50 Hz – 20 kHz.

Figure 3
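Placing a flatness window "so that the broadest possible range falls inside" can also be done numerically. The sketch below is one simple way to do it, assuming the response is available as sampled points; it searches every contiguous run of points whose spread is no more than 6 dB (that is, +/-3 dB) and keeps the widest in octaves. The function name and the test data are hypothetical.

```python
import math

def widest_flat_band(freqs_hz, levels_db, tol_db=3.0):
    """Return (f_low, f_high, centre_db) for the widest contiguous run of
    sample points that fits inside a window of +/- tol_db."""
    best = None
    n = len(freqs_hz)
    for i in range(n):
        lo = hi = levels_db[i]
        for j in range(i, n):
            lo = min(lo, levels_db[j])
            hi = max(hi, levels_db[j])
            if hi - lo > 2 * tol_db:
                break  # curve has left the window
            octaves = math.log2(freqs_hz[j] / freqs_hz[i])
            if best is None or octaves > best[0]:
                best = (octaves, freqs_hz[i], freqs_hz[j], (hi + lo) / 2)
    return best[1:] if best else None

# Hypothetical response points
freqs = [50, 100, 200, 400, 800]
levels = [90, 97, 99, 100, 92]
print(widest_flat_band(freqs, levels))
```

Because it works only on the sampled points, the band edges are no finer than the measurement resolution. The returned centre level also shows that a true flatness window need not sit at the rated sensitivity.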

Perhaps the manufacturer means that the +/-3 dB window is referenced to the 99 dB SPL sensitivity of the loudspeaker. In Figure 4 the window has been vertically centered at 99 dB. This yields a “frequency response” of 90 Hz – 2.5 kHz although the response does rise back into the window at 5 kHz.

Figure 4

In any event, this doesn’t come close to the claimed “Frequency Response (+/-3 dB) 50 Hz – 20 kHz.” If one ignores the slight dip at 100 Hz, it would be accurate to say that the “Frequency Response (-3 dB) is 90 Hz – 2.5 kHz.” It is unclear how to best characterize the region between 5 kHz and 20 kHz.

The only remaining option is to move the window so that the -3 dB line of the window intersects the frequency curve at 50 Hz (Figure 5). Remember, the decibel always has to be referenced to something. If the response of the loudspeaker is 3 dB lower at 50 Hz, the question “3 dB lower than what?” has to be answered. In this case, the answer appears to be “3 dB lower than 96 dB SPL.”

Figure 5

But where did 96 dB SPL come from? 96 dB is 3 dB below the 99 dB nominal sensitivity of the loudspeaker. So this manufacturer is saying that “at 50 Hz, the frequency response is 3 dB plus another 3 dB below our sensitivity rating”. As you can see, none of this makes any sense.

In reality, this manufacturer is publishing a -6 dB Frequency Response specification and calling it out as a +/-3 dB specification. This is true of every other manufacturer surveyed as well.

The +/-3 dB specification originally was intended to express the flatness of a loudspeaker. Many manufacturers use it incorrectly to characterize the frequencies at which the response curve falls 6 dB below the nominal sensitivity. As commonly used, +/-3 dB frequency response and -6 dB frequency response specifications should be understood to mean the same thing.

By the way, QSC publishes -10 dB Frequency Range specifications and -6 dB Frequency Response specifications. We have also published some -3 dB Frequency Response specifications that are true indications of the point at which the response falls 3 dB below the loudspeaker’s nominal sensitivity.

During the research for this document, at least one other manufacturer was found to be publishing a “-3 dB Frequency Response” specification. Examination of their published frequency curves indicated that this was in fact a -6 dB specification. Hopefully these errata will be corrected.

Finally, no matter how frequency response is expressed, it is impossible to characterize with a pair of numbers. Such a specification can serve as only an extremely coarse indication of loudspeaker performance.

Gerry Tschetter works with QSC Audio.

Posted by Keith Clark on 03/06 at 02:15 PM

Thursday, March 01, 2012

In-Depth Primer On Speech Intelligibility In Sound Reinforcement

What it is, what affects it and how it’s measured

Section 1: Introduction

Most people have had this experience:

You’re driving along in your car, windows down and the radio playing. It’s a new song, one you’ve never heard before by an artist you don’t recognize, and you’ve got to get the name so you can buy the disc. The music ends, the announcer comes on and . . .

. . . you can’t understand him over the road noise.

As this simple example illustrates, there’s an important difference between music and speech. The brain is capable of “filling in” a fair amount of missing information in music, because there’s a high degree of redundancy. (If you didn’t get the bass line in the first four measures, you’ll pick it up when it repeats in the next four.) But speech is rich in constantly-changing information and has less redundancy than music. If even a modest percentage of the information is garbled or missing, the brain can’t decipher the message.

Speech communication systems therefore are subject to more stringent requirements than music systems. These pages discuss speech intelligibility in sound reinforcement - what it is, what affects it and how it’s measured.

The Speech Signal

Human speech is a continuous waveform with a fundamental frequency in the range of 100-400 Hz. (The average is about 100 Hz for men and 200 Hz for women.) At integer multiples of the fundamental are a series of changing harmonics called “formants” which are determined by the resonant characteristics of the vocal tract.

Formants create the various vowel sounds and transitions among them. Consonant sounds, which are impulsive and/or noisy, occur in the range of 2 kHz to about 9 kHz. (Below is a vocal spectrum graph for male and female speakers with an “idealized” human vocal spectrum superimposed.)

image

The sound power in speech is carried by the vowels, which average from 30 to 300 milliseconds in duration. Intelligibility is imparted chiefly by the consonants, which average from 10 to 100 milliseconds in duration and may be as much as 27 dB lower in amplitude than the vowels. The strength of the speech signal varies as a whole, and the strength of individual frequency ranges varies with respect to the others as the formants change.

Speech Comprehension

The listener’s challenge is to parse speech sounds into meaningful units of language - a complicated task. Gaps in the sound don’t necessarily correspond to word or syllable breaks. Speech sounds also are not discrete events: rather, they merge and overlap in time, and the articulation of a given phoneme differs in different contexts and with different speakers.

In fact, the precise ways in which the ear-brain mechanism decodes speech remain something of a mystery. Such factors as loudness, duration and spectral content certainly affect speech perception, but how they may interact is not fully understood.

Diminished intelligibility is associated with a loss of information that is coded in a number of highly interactive elements, and many factors influence it. Background noises can mask the speech. Both the direction of the source, relative to the listener, and the direction of the interfering noise can alter the degree of masking. Intelligibility is also affected by the predictability of the message, the speaker’s enunciation and, not least, the acuity of the listener’s hearing.


Section 2: Factors That Affect Intelligibility in Sound Systems

The goal of a speech reinforcement system is to deliver the speaking voice to listeners with sufficient clarity to be understood.

Given the complexity of the speech signal, the task of providing high-quality speech reinforcement in real-world, less-than-ideal conditions is doubly complicated.

Below is a diagram (Figure 1) of a simplified speech reinforcement system showing the main factors that affect intelligibility.

As the diagram indicates, a number of acoustic, electromechanical and electronic factors need to be considered if intelligibility is to be maintained. In order to deal with all of these factors effectively, one must understand how each affects the speech signal.

Masking

The most common obstacle that speech system designers face is the intrusion of unwanted sounds that inevitably interfere with the speech signal. The effect is called “masking,” a general term that covers a very wide variety of situations.

Figure 1

Masking noise can come from acoustical sources such as ventilation equipment, traffic, crowds and commonly, reverberation and echoes. It can also arise electronically from thermal noise, tape hiss or distortion products. If the sound system has unusually large peaks in its frequency response, the speech signal can even end up masking itself.

The relationship between the strength of the speech signal and that of the masking sound is called the signal-to-noise (S/N) ratio, expressed in decibels. Ideally, the S/N ratio is greater than 0 dB, indicating that the speech is louder than the noise. Just how much louder the speech needs to be in order to be understood varies with, among other things, the type and spectral content of the masking noise.

The most uniformly effective mask is broadband noise. Figure 2 is a chart showing word articulation versus S/N when the masking source is noise spanning 20 Hz to 4 kHz. Notice that the signal must be 12 dB louder than the broadband noise to achieve 80-percent word recognition.

Figure 2
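For readers who want to see the arithmetic behind the S/N ratio, here is a minimal sketch that computes it from the RMS levels of two sample sequences, and that converts the 12 dB figure quoted above into a plain amplitude ratio. The function names and sample values are illustrative assumptions.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB from the RMS levels of two signals."""
    return 20 * math.log10(rms(speech) / rms(noise))

def amplitude_ratio(db):
    """Sound pressure ratio that corresponds to a level difference in dB."""
    return 10 ** (db / 20.0)

speech = [1.0, -1.0, 1.0, -1.0]   # hypothetical sample values
noise = [0.5, -0.5, 0.5, -0.5]
print(f"S/N = {snr_db(speech, noise):.1f} dB")
print(f"12 dB S/N means about {amplitude_ratio(12.0):.1f} times the noise pressure")
```

In other words, for 80-percent word recognition against this kind of broadband noise, the speech must arrive at roughly four times the sound pressure of the noise.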

Although narrow-band noise is less effective at masking speech than broadband noise, the degree of masking varies with frequency. Figure 3 is a chart showing word articulation versus S/N for two noise bands: 135 to 400 Hz (the fundamental frequency range of speech) and 1800 to 2500 Hz (the strongest consonant frequency range).

Figure 3

High-frequency noise masks only the consonants, and its effectiveness as a mask decreases as the noise gets louder. But low-frequency noise is a much more effective mask when the noise is louder than the speech signal, and at high sound pressure levels it masks both vowels and consonants.

This is why the proximity effect of cardioid microphones can be so harmful to speech intelligibility: it causes the speech signal to mask itself. While cardioids are very useful for minimizing noise pickup at the source, they should always be used with a steep (12 dB/octave or greater) high-pass filter tuned to about 100 Hz (or higher, if the speaker’s voice range allows) so that proximity effect problems are minimized.
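To get a feel for what a 12 dB/octave high-pass at 100 Hz does, the sketch below evaluates the magnitude response of a second-order filter. The Butterworth alignment is an assumption made here for illustration; the text above specifies only the slope and the approximate corner frequency.

```python
import math

def butterworth_highpass_db(freq_hz, cutoff_hz=100.0, order=2):
    """Magnitude response in dB of an ideal Butterworth high-pass filter.
    order=2 gives the 12 dB/octave slope well below cutoff."""
    ratio = cutoff_hz / freq_hz
    return -10 * math.log10(1 + ratio ** (2 * order))

for f in (25.0, 50.0, 100.0, 200.0, 400.0):
    print(f"{f:5.0f} Hz: {butterworth_highpass_db(f):6.1f} dB")
```

With these assumptions the filter is about 3 dB down at 100 Hz and about 12 dB down an octave lower, while the region above roughly 200 Hz is left almost untouched.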

A human voice delivering a competing message, sometimes called a “distractor,” is also very good at masking speech — particularly at or below 0 dB S/N. In addition, the masking effect increases with the number of distractor voices. Figure 4 is a diagram comparing masking for one, two and three voices.

Figure 4

Notice that, below 0 dB S/N, three voices become just as effective a source of masking as broadband noise. Above 0 dB S/N, however, intelligibility improves rapidly as the S/N increases. This illustrates the importance of having sufficient power in a paging system to overcome crowd noise.

The direction from which a masking sound arrives, relative to the direction of the speech signal, can affect the degree of masking. If the noise comes from the same place, the masking is greatest; it decreases as the distance between the noise and the speech increases because this makes it easier for the brain to discriminate between them. The masking effect is lowest when the presentation is through headphones, with the speech in one ear and the mask in the other. (Unfortunately, we can’t take advantage of that feature in sound reinforcement).

From this discussion, we can see why reverberation is so destructive of intelligibility, especially beyond critical distance. Being itself caused by the speech, reverb mimics the speech spectrum, but generally with greater low-frequency energy.

Sufficiently long reverb and echoes — such as are encountered in cathedrals and large sports arenas — can actually function like multiple distractor voices. And by its nature, reverberant energy arrives from all angles, so it’s hard to separate from the speech using directional clues.

Frequency Response

One of the most obvious aspects of sound system performance that affect intelligibility is frequency response. Severely band-limited systems deliver speech poorly. For instance, telephones are generally limited to a 2 kHz bandwidth, and this makes it hard to distinguish between “f” and “s” or “d” and “t” sounds.

High-quality speech systems need to cover the frequency range of about 80 Hz (for especially deep male voices) to about 10 kHz (for best reproduction of consonants, which are crucial to intelligibility). Response below 80 Hz must be eliminated to the extent possible: not only do these frequencies fall below the range of the speech signal, but also they will cause particularly destructive masking at high sound levels.

It’s important, also, for the system response to be reasonably flat throughout its range. The gradual high-frequency rolloff that many reinforcement professionals favor for music applications will tend to de-emphasize consonants, which are already as much as 27 dB less loud than vowels. Likewise, prominent peaks or dips in the response can cause either self-masking or loss of consonant articulation.

Finally, the coverage of the system must be consistent throughout the intended listener area, with minimal response cancellations or off-axis dropoff in the critical high frequencies. This requirement very often dictates either a distributed loudspeaker system or carefully aimed and delayed fill speakers. Using high-Q loudspeakers will help to elevate the S/N ratio between the speech and the reverberation levels.

Distortion

Early studies of intelligibility in communication systems suggest that clipping the peaks of the speech signal, and then amplifying it to restore its peak-to-peak amplitude, improves intelligibility.

The trick works in very noisy situations because clipping generates partials that are harmonically related to the fundamental — and thus less likely to mask the speech — and because it both accentuates consonants and increases the sound power of the signal.

As such, it has been helpful for band-limited communication systems that are used in very noisy environments, such as the deck of an aircraft carrier.

The fact is, however, that clipping the signal to improve intelligibility works only in cases where the signal-to-noise ratio is very poor. Figure 5 is a chart showing word articulation versus S/N for an infinitely clipped and an unclipped speech signal. Notice that the intelligibility score for the clipped signal levels out to around 50 percent at 0 dB S/N; above about +3 dB S/N, the unclipped signal scores better.

Figure 5
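The clip-then-restore trick is easy to try on a synthetic signal. The sketch below is a simplified illustration, with hypothetical function names and a plain sine wave standing in for speech. It shows peak clipping followed by make-up gain, and the extreme case of infinite clipping, where only the sign of each sample survives.

```python
import math

def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def clip_and_restore(samples, threshold):
    """Clip peaks at +/- threshold, then scale back to the original peak."""
    peak = max(abs(x) for x in samples)
    clipped = [max(-threshold, min(threshold, x)) for x in samples]
    clipped_peak = max(abs(x) for x in clipped)
    if clipped_peak == 0:
        return list(clipped)  # silent input: nothing to restore
    gain = peak / clipped_peak
    return [x * gain for x in clipped]

def infinite_clip(samples, peak=1.0):
    """Keep only the sign of each sample (zero stays zero)."""
    return [peak * ((x > 0) - (x < 0)) for x in samples]

sine = [math.sin(2 * math.pi * k / 100) for k in range(100)]
print(f"unclipped RMS:      {rms(sine):.3f}")
print(f"clipped + restored: {rms(clip_and_restore(sine, 0.5)):.3f}")
print(f"infinitely clipped: {rms(infinite_clip(sine)):.3f}")
```

For the same peak amplitude, the clipped versions carry more RMS level, which is the "increased sound power" mentioned above. Any noise mixed into the input would be clipped and boosted in exactly the same way.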

In real-life speech reinforcement systems, clipping should be avoided. Obviously, it will sound objectionable through a high-quality sound system. It also will increase the masking from any noise that is picked up by the microphone, since that noise will be clipped along with the speech.

Another type of distortion that is very destructive to intelligibility is intermodulation distortion. While it is easily controlled in the electronics of a sound system, significant IM can be generated when some types of loudspeakers (particularly two-way coaxials) are driven at high levels. IM produces sum and difference products that are not harmonically related to the fundamental frequency. As such, they have a much greater masking effect than the harmonic products of clipping.
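A short sketch makes the difference between harmonic and intermodulation products concrete. The two tone frequencies are arbitrary examples, and the function names are hypothetical; the code lists the harmonics of a single tone, then the sum and difference products of two tones up to third order.

```python
def harmonic_products(f0_hz, max_order=5):
    """Harmonics of a single tone: integer multiples of the fundamental."""
    return [n * f0_hz for n in range(2, max_order + 1)]

def intermod_products(f1_hz, f2_hz, max_order=3):
    """Sum and difference frequencies |m*f1 +/- n*f2| with m, n >= 1
    and m + n <= max_order."""
    products = set()
    for m in range(1, max_order):
        for n in range(1, max_order - m + 1):
            products.add(abs(m * f1_hz + n * f2_hz))
            products.add(abs(m * f1_hz - n * f2_hz))
    return sorted(products)

print("harmonics of 200 Hz:", harmonic_products(200))
print("IM of 200 Hz and 3000 Hz:", intermod_products(200, 3000))
```

The harmonics line up with the overtone series of the source. The IM products cluster around the higher tone as sidebands, in the very range where consonant energy lives.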

Time Response

Perhaps because it remains poorly understood and its effects are more subtle, phase response in communication systems has received scant attention. In fact, most published research about “phase” and intelligibility actually deals with the effects of relative polarity. It’s been shown, for instance, that when speech is presented with noise over headphones, intelligibility increases by about 25% if the speech signal in one ear is inverted relative to the other ear. But this result has no application in sound reinforcement, other than for in-ear stage monitors.


Section 3: Statistical Measures of Speech Intelligibility

Statistical intelligibility measurements use human beings, rather than electronic test instruments, to assess speech communication systems.

First proposed in 1910 and refined with the introduction of the telephone and the advent of electronic communication systems in World War II, such tests are still considered to be the most accurate and reliable measures of intelligibility.

While many variations are in use, this discussion deals most directly with the American National Standards Institute’s approved procedure (ANSI S3.2-1989, “Method for Measuring the Intelligibility of Speech Over Communication Systems”).

Method and Applications

The statistical measurement process uses trained, English-fluent talkers speaking standardized word lists through the communication system to trained, English-fluent listeners. The word lists are crafted to evaluate specific aspects of speech transmission; the ability of the listeners to identify individual words or word pairs indicates the quality of the transmission.

Such tests are used in a wide variety of applications, from examining the acoustics of conference rooms to evaluating intercoms for deep-sea divers. In professional sound reinforcement, statistical tests provide crucial information for architects and consultants, both in designing speech reinforcement systems and refining their performance in the field. They may also be used to evaluate the contributions that specific microphones, loudspeakers and signal processors make to speech intelligibility.

Preparation

In order for the results of any intelligibility test to be valid, those conducting the test must be well versed in experimental design and statistical data analysis. Since human subjects are central to the tests, the experimenters must also understand the psychological factors involved, including the effects of motivation and learning through repetition. Finally, they must, of course, know how to operate the sound system properly so as to avoid introducing errors. For all of these reasons, intelligibility tests invariably are made by trained consultants who specialize in the field.

The tests use a minimum of five talkers and five listeners; larger subject groups reduce the margin of error. Talkers and listeners are selected to assure a representative cross-section of age and gender.

All must speak English as their first language and have normal hearing. Talkers must have good articulation, and are trained both to speak at a consistent level and to synchronize their words with timing signals so that the rate of presentation doesn’t skew the test results in any way. Listeners must have good discrimination, and are familiarized with all the test words that will be used, the sound of each talker’s voice and the method of recording responses.

A number of specialized word lists are in common use for testing various aspects of speech communication. The ANSI standard specifies three:

—The Modified Rhyme Test
—The Diagnostic Rhyme Test
—The set of twenty Phonetically Balanced Word Lists

Other examples of word lists include:

—The Diagnostic Alliteration Test
—The Diagnostic Medial Consonant Test
—The Spelling Alphabet Test

Testing

If at all possible, the sound system should be tested under conditions of actual use: if there are potential sources of masking noise such as outside traffic or an HVAC system, these should be present during the testing and documented for the report.

It’s also important that the system gains be set to a representative sound pressure level. Pre-recorded test material can be used as long as the recording and playback equipment don’t introduce significant noise or distortion.

At a minimum, each talker is given three PB or MRT word lists - or the complete DRT list - to read. Where only one sound system is being tested, the trained subjects are first tested face-to-face or in similarly ideal conditions to establish a “control” or baseline measurement. (Under these circumstances the intelligibility should be nearly perfect.)

This score is then used as a reference to which the system under test can be compared. During testing, supplementary information such as the speed/certainty of the listeners’ responses and their statistical opinions about the sound system should be gathered.

Analyzing the Results

There are many ways of analyzing the test data depending on the characteristics of the particular word list and the variables being tested. At the least, a set of percentage scores is calculated showing the number of times words were identified correctly by each listener. Taking an average of these can produce a single overall score. If either the DRT or MRT is used, the results are adjusted mathematically to account for guessing (no adjustment is required for the PB test). Deeper statistical analyses can yield more detailed information about the sound system if undertaken carefully.
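The basic scoring can be sketched as follows. The chance-correction formula used here (right answers minus wrong answers divided by one less than the number of response alternatives) is the common textbook correction for guessing in closed-set tests. Treating it as the adjustment the standard intends is an assumption, as are the function names and the example counts.

```python
def percent_correct(right, total):
    """Raw percentage of words identified correctly."""
    return 100.0 * right / total

def guess_adjusted_percent(right, wrong, total, alternatives):
    """Percentage score corrected for guessing in a closed-set test.
    alternatives is the number of response choices per item."""
    adjusted_right = right - wrong / (alternatives - 1)
    return 100.0 * adjusted_right / total

def overall_score(listener_scores):
    """Average the per-listener percentage scores into one figure."""
    return sum(listener_scores) / len(listener_scores)

# Hypothetical listener: 50 items, 45 right, 5 wrong
print(percent_correct(45, 50))               # raw score
print(guess_adjusted_percent(45, 5, 50, 6))  # six-choice test (MRT-style)
print(guess_adjusted_percent(45, 5, 50, 2))  # two-choice test (DRT-style)
```

The fewer the response alternatives, the more a lucky guess is worth, so the correction bites harder on a two-choice test than on a six-choice one. An open-response test such as the PB lists offers no fixed alternatives to guess among.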


Section 4: Machine Measures of Speech Intelligibility

Statistical tests using trained talkers and listeners are by far the most accurate and reliable methods for intelligibility testing. Unfortunately, they are complicated to set up, time-consuming to conduct and require extensive statistical analysis to interpret.

Hence, consultants and acousticians have long sought an automated, machine-based test that could quickly and easily yield meaningful intelligibility scores for speech systems. A number of methods have emerged over the past fifty-odd years that fall into two basic categories: analyses of the reverberant field, and measurements based on signal-to-noise ratio.

Reverberation Analysis

From at least the ancient Classical period, architects have recognized that reverberation and echoes hamper intelligibility. Indeed, that realization resulted in the development of the Greek amphitheater, a durable architectural model that survives to this day.

Modern acousticians have at their disposal several different methods to test reverberation in enclosed spaces. The most commonly used of these are:

—%ALcons - a measure that’s familiar to many sound system engineers
—Direct-to-Reverberant Ratio
—Useful-to-Detrimental Sound Ratios
—Early-to-Late Sound Energy Ratio

Each of these tests can tell us something about the reverberant qualities of a space and, therefore, how intelligible speech could be in that space. Since they deal predominantly with reverberation, however, they fail to take into account the majority of the factors that can affect a speech reinforcement system’s performance.

Signal-to-Noise Methods

With the advent of electronic communication systems and their complex potential problems, acousticians and engineers recognized that different machine testing approaches were needed.

Beginning as early as the 1940s with telephony research at Bell Laboratories, several instrument-based tests have evolved, each of which relies on signal-to-noise measurements in one form or another. They are:

—AI - Articulation Index
—STI - Speech Transmission Index
—RASTI (another measure that’s familiar to some sound system engineers)
—SII - Speech Intelligibility Index

AI is now of interest chiefly for having demonstrated the relative importance of different frequency bands in the speech spectrum; because it doesn’t effectively account for reverberation, it has been largely superseded by the newer methods. Of these, only RASTI is available in a simple, reasonably-priced instrument.

SII (which is proposed as ANSI standard S3.5-1997) is the most robust of the machine intelligibility measures, but it requires sophisticated equipment and the calculations that it entails are quite complex. Given the prodigious computing power that’s now available at reasonable cost, however, a practical, affordable SII instrument could soon become a reality.

Limitations of Machine Measures

Their relative convenience notwithstanding, all machine-based intelligibility measures have inherent limitations.

Every machine testing method requires that the operator have significant experience and analytical skill if the results are to be accurate and useful. It can be very difficult to identify inaccurate or misleading scores and determine their causes. Most significantly, adjustments to the system that improve intelligibility may not positively affect the measured score - and adjustments that improve the measurements may not enhance intelligibility.

In addition to these factors, each testing method has its own particular limitations that must be weighed both when carrying out the tests and when interpreting the results.

%ALcons

Percentage Articulation Loss of Consonants. This machine measure of intelligibility is closely associated with the TEF sound analyzer. It is computed from measurements of the Direct-to-Reverberant Ratio and the Early Decay Time using a set of correlations defined by SynAudCon, and is specified in percent.

Since %ALcons expresses loss of consonant definition, lower values are associated with greater intelligibility. It is generally assumed that the maximum allowable value for typical paging applications is 10%, assuming that the environment is relatively free of masking noise. For learning environments and voice warning systems, the desired value is 5% or less.

The %ALcons method is widely used by acoustical consultants (particularly in the United States), but it has significant drawbacks. First, it is based on measurements in a single one-third octave band centered on 2 kHz; all other frequencies are ignored, so the system’s frequency response must be verified in some other way for the %ALcons score to be meaningful.

Moreover, the method does not account for many factors that can dramatically affect intelligibility, including signal-to-noise ratio, the background noise spectrum, distortion, late reflections or echoes, system frequency response, compression, non-linear phase, equalization and acoustic power. %ALcons measurements of sound systems therefore often yield overly optimistic scores. Where reverberation or strong, late-arriving reflections are the primary problem, however, they can sometimes be more useful and accurate than RASTI.

Direct-to-Reverberant Ratio

The ratio between the intensities of the direct sound and reverberation. There are several measures for this quantity. C50, one of the most popular, expresses speech clarity as the energy ratio of the first 50 milliseconds of direct sound to the overall steady-state reverberation, with 0 dB being the minimum acceptable value and +4 dB or above preferred.

A similar measure, C7, is used in Germany; C35 is yet another version. Measurements are made in a single frequency band (usually centered on 1 kHz). Each of these measures can be more reliable and repeatable than %ALcons, which also deals with the direct-to-reverberant ratio.
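As a hedged illustration, the sketch below computes a clarity figure from a room impulse response using the conventional early-to-late form: energy in the first 50 milliseconds divided by all energy arriving later. That convention, the function name, and the synthetic exponential decay are assumptions. The code also assumes the impulse response has already been band-filtered and trimmed so that it starts at the arrival of the direct sound.

```python
import math

def clarity_db(impulse_response, sample_rate_hz, split_ms=50.0):
    """Early-to-late energy ratio in dB (C50 when split_ms is 50)."""
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return 10 * math.log10(early / late)

def synthetic_decay(rt60_s, sample_rate_hz, length_s=2.0):
    """Idealized impulse response whose energy falls 60 dB in rt60_s."""
    n = int(sample_rate_hz * length_s)
    return [math.exp(-6.908 * k / sample_rate_hz / rt60_s) for k in range(n)]

fs = 8000
for rt60 in (0.5, 1.0, 2.0):
    c50 = clarity_db(synthetic_decay(rt60, fs), fs)
    print(f"RT60 {rt60:.1f} s -> C50 {c50:+.1f} dB")
```

Changing `split_ms` to 7 or 35 gives the C7 and C35 variants mentioned above. With this idealized decay, a one-second reverberation time lands almost exactly on the 0 dB minimum.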

Useful-to-Detrimental Sound Ratios

The logarithmic ratio between the energy of sounds that are useful to intelligibility and those that are detrimental to it, expressed in decibels.

“Useful” sounds are the integrated energy of speech sounds arriving within the first 50 or 80 milliseconds after the direct sound, and “detrimental” sounds are the sum of later-arriving speech energy and ambient noise. In practice, both quantities may be found by integrating appropriate portions of the room impulse response.
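A minimal sketch of that integration follows. It assumes the impulse response starts at the direct sound and that the ambient noise energy has been expressed in the same units as the integrated speech energy; that scaling step is left out and is an assumption, as are the function name and the toy data.

```python
import math

def useful_to_detrimental_db(impulse_response, sample_rate_hz,
                             noise_energy, split_ms=50.0):
    """Ratio in dB of early speech energy to late speech energy plus noise."""
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    useful = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    return 10 * math.log10(useful / (late + noise_energy))

# Toy impulse response: strong early part, weaker late part
ir = [1.0] * 4 + [0.5] * 4
print(useful_to_detrimental_db(ir, 1000, noise_energy=0.0, split_ms=4.0))
print(useful_to_detrimental_db(ir, 1000, noise_energy=1.0, split_ms=4.0))
```

With the noise term set to zero the result reduces to a plain early-to-late ratio; adding noise can only lower it.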

Early-to-Late Sound Energy Ratio

Proposed in 1996 by G. Marshall, ELR is similar to C50 but is weighted for speech and incorporates measurements in more than one frequency band. As with other direct-to-reverberant methods, however, factors other than reverberation are not accounted for.

AI

One of the earliest attempts to measure by machine the intelligibility of a speech transmission system, the Articulation Index was developed by Bell Telephone Laboratories in the 1940s.

AI is based on the idea that the response of a speech communication system can be divided into twenty frequency bands, each of which carries an independent contribution to the intelligibility of the system, and that the total contribution of all the bands is the sum of the contributions of the individual bands. (AI may also be measured using one-third octave or octave bands.) Signal-to-noise ratios are computed for each individual band, then weighted and combined to yield an intelligibility score.

The AI varies in value from 0 (completely unintelligible) to 1 (perfect intelligibility). An AI of 0.3 or below is considered unsatisfactory, 0.3 to 0.5 satisfactory, 0.5 to 0.7 good, and greater than 0.7 very good to excellent.
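The band-summation idea can be sketched as follows. The mapping from each band's S/N ratio to its contribution (a 30 dB useful range starting 12 dB below 0 dB S/N) follows the classical form of the method as commonly described and is an assumption here, as are the equal band weights and the function names. The rating boundaries are the ones quoted above.

```python
def articulation_index(band_snr_db):
    """Sum equal-weight contributions from each band's S/N ratio (dB)."""
    weight = 1.0 / len(band_snr_db)
    total = 0.0
    for snr in band_snr_db:
        # Assumed mapping: -12 dB S/N contributes nothing, +18 dB or more
        # contributes fully, linear in between
        audible = min(max((snr + 12.0) / 30.0, 0.0), 1.0)
        total += weight * audible
    return total

def rating(ai):
    """Verbal rating for an AI value, using the ranges given in the text."""
    if ai <= 0.3:
        return "unsatisfactory"
    if ai <= 0.5:
        return "satisfactory"
    if ai <= 0.7:
        return "good"
    return "very good to excellent"

# Hypothetical S/N ratios for twenty bands, poorer at the low end
snrs = [0, 0, 3, 3, 6, 6, 9, 9, 12, 12, 12, 12, 15, 15, 15, 18, 18, 18, 18, 18]
ai = articulation_index(snrs)
print(f"AI = {ai:.2f} ({rating(ai)})")
```

Because each band contributes independently, improving the S/N ratio in a band that is already saturated does nothing for the score. Rescuing a band that is buried in noise does.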

STI

Developed in the early 1970s, the Speech Transmission Index (STI) is a machine measure of intelligibility whose value varies from 0 (completely unintelligible) to 1 (perfect intelligibility).

In STI testing, speech is modeled by a special test signal with speech-like characteristics. Following on the concept that speech can be described as a fundamental waveform that is modulated by low-frequency signals, STI employs a complex amplitude modulation scheme to generate its test signal. At the receiving end of the communication system, the depth of modulation of the received signal is compared with that of the test signal in each of a number of frequency bands. Reductions in the modulation depth are associated with loss of intelligibility.
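To show how a loss of modulation depth turns into an index value, here is a heavily simplified, single-band sketch. It predicts the modulation reduction from an idealized reverberation time and S/N ratio rather than measuring it, converts each reduction to an apparent S/N ratio limited to +/-15 dB, and averages the results. The closed-form expressions and the list of modulation frequencies follow the method as commonly described. The full method also measures several octave bands and weights them, which is omitted here. The function names are hypothetical.

```python
import math

MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5, 3.15,
                4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

def modulation_transfer(mod_freq_hz, rt60_s, snr_db):
    """Predicted modulation reduction factor (0..1) for ideal exponential
    reverberation and steady background noise."""
    reverb = 1.0 / math.sqrt(1.0 + (2 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10 ** (-snr_db / 10.0))
    return reverb * noise

def transmission_index(m):
    """Map a modulation reduction factor to an index between 0 and 1."""
    if m <= 0.0:
        return 0.0
    if m >= 1.0:
        return 1.0
    apparent_snr = 10 * math.log10(m / (1.0 - m))
    apparent_snr = max(-15.0, min(15.0, apparent_snr))
    return (apparent_snr + 15.0) / 30.0

def single_band_sti(rt60_s, snr_db):
    """Average index over the modulation frequencies for one band."""
    tis = [transmission_index(modulation_transfer(f, rt60_s, snr_db))
           for f in MOD_FREQS_HZ]
    return sum(tis) / len(tis)

for rt60, snr in ((0.5, 30.0), (2.0, 30.0), (0.5, 0.0)):
    print(f"RT60 {rt60} s, S/N {snr} dB -> {single_band_sti(rt60, snr):.2f}")
```

Both reverberation and noise fill in the gaps between speech peaks, so both show up the same way, as reduced modulation depth at the listener's position.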

RASTI

Rapid Speech Transmission Index, a machine method of testing for intelligibility in sound systems. It is associated with Brüel and Kjaer, the instrumentation company that manufactures a portable device to implement it.

RASTI was developed as a simpler alternative to the more complex STI (Speech Transmission Index). In contrast to STI, RASTI measures only in two octave bands centered at 500 Hz and 2 kHz, respectively. It uses a speech-like excitation signal and, like STI, correlates reductions in modulation depth to loss of intelligibility.

RASTI has been implemented in a simple, portable instrument that can make very rapid intelligibility measurements, both acoustically and with an installed sound system. For this reason, it has been adopted for a number of European standards and civil system specifications. Being a radically simplified version of STI, however, it suffers compromises that have forced reevaluation of those standards.

For example, RASTI tests in only two frequency bands, with the assumption that the sound system’s response actually extends in a reasonably flat fashion from 100 Hz or lower to 8 kHz or higher. While this might well be the case in a properly-designed auditorium system, many types of paging systems fall short of such performance. In these cases, RASTI almost invariably gives an overly optimistic picture. (In fact, a sound system that reproduced only the two frequency bands in question could receive a perfect rating.)

Moreover, because it affects modulation depth, any compression or limiting in the system can cause an artificially low RASTI value - despite the fact that it may, in actuality, be acting to enhance intelligibility. RASTI also does not take system distortion or non-linear amplitude and phase into account.

SII

Derived from and in essence identical to STI, SII is the method for measuring speech intelligibility by machine that is currently proposed in draft form as ANSI Standard S3.5-1997.

In the Standard, four measurement procedures are allowed, each using a different number and size of frequency bands. In descending order of accuracy, they are:

—Critical band (21 bands)
—One-third octave band (18 bands)
—Equally-contributing critical band (17 bands)
—Octave band (6 bands)

The value of SII varies from 0 (completely unintelligible) to 1 (perfect intelligibility).

SII is a highly capable testing method that, under the right conditions, shows good correlation with statistical tests. It features both wide bandwidth (150 Hz to 8.5 kHz) and, especially in the critical band procedure, far greater resolution than any other method. SII properly includes reverberation, noise and distortion, all of which are accounted for in the modulation transfer function. Experienced test operators can go beyond generating a single intelligibility score to diagnosing the source of a loss in intelligibility.

Under certain conditions, however, SII can yield misleading results. In particular, late-arriving reflections and echoes can distort the measurement significantly. Like RASTI, SII is susceptible to giving artificially low intelligibility scores if compression or limiting is introduced in the system. And because even the critical-band procedure ignores frequencies below 100 Hz, it may very well miss significant low-frequency masking sources.

Finally, SII does not take non-linear phase into account. Nonetheless, when used correctly by a skilled operator, it remains the most reliable and accurate of the machine methods.


Section 5: Future Directions

Despite their inherent limitations, all of the machine testing methods that we’ve discussed can show good agreement if the system under test is reasonably well behaved.

But intelligibility testing is most consequential (and potentially most useful) when the system has problems severe enough to impair speech transmission. Such problems can arise from a variety of sources and conditions, many of which can “fool” any of the machine testing methods.

Contemporary sound systems are sophisticated complexes of diverse, interacting components. As the simplified diagram in Figure 6 illustrates, they invariably include signal processing elements whose effects on intelligibility, and on the instruments designed to measure it, may be difficult to predict.

While the consequences of relatively simple analog processing (such as equalization and limiting) generally are benign, the same may not be true of new, powerful digital signal processing technologies.

Figure 6

For example, much attention is now focused upon using DSPs to “deconvolve” the response of a space in order to suppress echoes and subtract or add reverberation. Because the algorithms that are involved affect the time order of the signal, there may be large consequences if these devices are misadjusted. Furthermore, if speakers are repositioned, or the acoustics of the space changes (when a curtain is closed, for example), then the particular deconvolution likely will no longer be valid and may, in fact, cause very destructive effects.

None of the present machine measures for intelligibility accounts for time distortions. In fact, we could conceive of a hypothetical system that reversed the time aspect of a signal, like playing a tape backward: no machine method would show any decrease in the intelligibility score for such a system, though it would obviously render speech unintelligible.

What’s needed is an analyzer that’s sufficiently “smart” to detect all of the factors which affect intelligibility, and render a conclusive judgement, without relying heavily on the operator’s interpretation. But the unavoidable truth is that, as sophisticated as machine-based measurement systems may be, they cannot yet approach the complexity of the human ear/brain mechanism informed by a lifetime of experience decoding speech.

We can only model those aspects of that exquisitely fine-tuned mechanism that we have come to understand. The many remaining questions regarding how it works and what factors may affect it can only be answered by further research.

These papers were written by Ralph Jones, edited by Rachel Murray, P.E., and provided by Meyer Sound.


Posted by Keith Clark on 03/01 at 09:47 AM

Thursday, February 23, 2012

Mixing Two Worship Services At Different Sound Pressure Levels

What do you do when different generations demand that sound levels vary greatly from service to service?
This article is provided by Jeremy Carter.

 
At one of the churches I’m currently working with, the situation is all too common. It’s a large multigenerational congregation with a lot of history. As such, the attendance trend is for an older crowd to populate the early service, with the later service being dominated by younger families.

This is not surprising, as the same thing is happening all across the country, evidenced by numerous posts by technical directors on their blogs.

Needless to say this can create some problems. One which can’t be avoided is the differing SPL preference of these two unique groups.

I’ll be honest, I’m using the word preference pretty lightly here. Anyone else who has this problem knows it’s much stronger than that.

Often the older generation becomes pretty vocal in their displeasure when the volume gets too much for them. Conversely, for some in the second service the volume never gets loud enough. If they don’t feel the kick in their chest, they’re not happy!

So what is a sound tech to do? Unfortunately, there just isn’t time to completely rebuild the mix between services and a simple master fader change doesn’t work.

That makes everything lose its presence, including vocals. Yet we don’t want to get crazy with changing instrument faders or we will upset the Pyramid mixing technique.

So we experimented a bit this week. Here is the test method we utilized:

1. During the first half of rehearsal, we built a Pyramid Mix as we normally would. We started with the drums and bass, mixing them at the appropriate level for the acoustic volume of the drums in the space. Then we moved up the pyramid until we achieved a great mix.

2. For the last half of rehearsal we then reduced the master fader 6 dB.

3. Then at the subgroups, we pulled up the vocals and piano groups 3 dB.

4. Finally, we (politely) asked the drummer to “ease up” on the first service.

So we have nearly “halved” the apparent volume of the entire mix (halving being recognized as a reduction of 10 dB compared to our reduction of 6 dB), while compensating the vocals and piano a bit.
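To put rough numbers on that, here is a small sketch using the same rule of thumb (a 10 dB change is heard as a halving or doubling of loudness). The function name is made up for illustration, and the rule itself is only an approximation of how listeners judge level.

```python
def loudness_ratio(change_db):
    """Perceived loudness ratio for a level change, using the
    10 dB = half/double rule of thumb."""
    return 2.0 ** (change_db / 10.0)


master_change_db = -6.0  # master fader pulled down 6 dB
group_boost_db = 3.0     # vocal and piano groups pushed up 3 dB

# Drums, bass and everything else follow the master only.
band_db = master_change_db
# Vocals and piano get the master cut plus the group boost.
vocals_db = master_change_db + group_boost_db

print("band:  ", band_db, "dB ->", round(loudness_ratio(band_db), 2))
print("vocals:", vocals_db, "dB ->", round(loudness_ratio(vocals_db), 2))
```

By this estimate the band ends up at roughly two-thirds of its original apparent loudness, while the vocals and piano stay at about four-fifths, which is why they hold their place in the mix.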

Now for the second service, we only had to adjust 3 or 4 faders. The groups we adjusted were moved back down 3 dB and then the master fader went back up to its original position. Voila, we’re back where we needed to be for the second service!

Some would suggest that we do the opposite (pull down drums, bass, guitars, keys, etc.). But for our situation, our method involved the fewest fader changes.

Are you encountering similar situations in your church?

Jeremy Carter is a veteran of the pro audio industry with extensive experience designing and operating church audio, video, and lighting systems. Learn more at Sound Sessions.

Posted by admin on 02/23 at 03:14 PM

Monday, February 20, 2012

A Real World Example of Limiting Distances

A very real issue, Limiting Distance is the point at which it is difficult (if not impossible) to localize a sound source.

If acoustic parameters sometimes seem difficult to grasp, this practical example should help clarify at least one of them, the Limiting Distance, DL.

This is a parameter associated with reverberant spaces, such as gyms and churches with little absorption.

Remember that in such spaces there will exist a reverberant sound field LR, that is uniform throughout the space, as well as specific reflections that are not.

If a loudspeaker or other acoustic source is used to excite the space, there will be a localizable (you can tell where it is coming from) direct sound field LD from that device for those in close proximity to it.

For a steady sound source, as the listener moves farther away from the acoustic source, the direct sound level drops, but the reverberant level stays the same.

The distance at which they are equal is called the Critical Distance, DC. At about 3 times DC the direct field is almost completely masked by the reverberant field, to the point that it is difficult if not impossible to localize the sound source.

A whistle on court C often plays on court D.

This is called the Limiting Distance, DL, and can be found by DL = 3.16 DC (in ft or m), which is the point at which the direct level LD has dropped 10 dB from what it is at DC.
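The factor of 3.16 is simply the square root of 10. With inverse-square spreading, the direct level falls by 20 log10 of the distance ratio, so moving out by a factor of 3.16 costs 10 dB. A quick check, using a made-up critical distance (the 18 ft figure is hypothetical, not a measurement of the gym described here):

```python
import math


def limiting_distance(critical_distance):
    """Limiting distance DL = sqrt(10) * DC, in the same units as DC."""
    return math.sqrt(10.0) * critical_distance


def direct_level_drop_db(d_near, d_far):
    """Drop in direct level between two distances (inverse-square law)."""
    return 20.0 * math.log10(d_far / d_near)


dc_ft = 18.0  # hypothetical critical distance in feet
dl_ft = limiting_distance(dc_ft)

print("DL =", round(dl_ft, 1), "ft")
print("direct level at DL relative to DC:",
      -round(direct_level_drop_db(dc_ft, dl_ft), 1), "dB")
```

Since the reverberant level is the same everywhere, the direct sound at DL sits 10 dB below the reverberant field.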

Our local high school has a large, multi-purpose athletic facility where multiple sporting events are often held simultaneously, usually some combination of basketball, wrestling, cheerleading practice, etc. The room has very little absorption, and a significant reverberant field exists.

Its diffuseness is easily evaluated from the scoreboard “buzzer” that sounds to end periods of play. This is quite a nice test stimulus (covers the voice range and is long enough in duration to build a steady reverberant field). The “tail” of the decay can be evaluated for diffuseness (an “airy” character) or reflectiveness (a “mechanical” character).

The problem is that with a low directivity source (such as a referee’s whistle) the limiting distance is less than 60 feet. This means that if two basketball games are played at the same time, play often stops on both courts when the whistle is blown on either!

The players or the fans can’t localize it, because they are beyond the limiting distance and the level is the same throughout the room. As such, all directional clues are obscured. The solution for this space is to apply absorption to the corrugated metal ceiling.

This could easily have been done during design and construction, but would be very costly now. Anyone who designs a gymnasium without absorption on the ceiling (at least) should have to play in it!

Pat & Brenda Brown lead Syn-Aud-Con, conducting audio seminars and workshops around the world. Synergetic Audio Concepts (Syn-Aud-Con) has been a leader in audio education since 1973. With nearly 15,000 “graduates” worldwide, Syn-Aud-Con is dedicated to teaching the basics of audio and acoustics. For more information, go to http://www.synaudcon.com

Posted by Keith Clark on 02/20 at 02:03 PM

Monday, February 13, 2012

Harnessing The Power Of Digital Signal Processing

Understanding DSP functions to improve measurable and audible results

In the previous segment, we looked at the basic process of using a high-resolution FFT (Fast Fourier Transform) analyzer to view the frequency and phase response of a 12-inch cone driver in a typical 12-inch/2-way loudspeaker.

In that segment, we established that the 30-degree off-axis response of the cone driver is substantially lower in level (12 to 18 dB), as well as highly irregular in phase and frequency above approximately 2 kHz, when compared to the driver’s on-axis response (Figure 1).

This information allows us to make an educated guess at the range where the cone driver should be crossed over.

In this particular case, the 30-degree off-axis response is linear up until about 1.28 kHz, after which the output departs from linear until about 2 kHz. At 2.1 kHz, the output level begins to descend rapidly as the driver enters its breakup mode (see sidebar for discussion of “breakup mode”).

Therefore, the optimal crossover could be as low as 300 to 500 Hz (for loudspeakers that employ a mid-range driver) to as high as perhaps 1.3 kHz, while still maintaining a 60-degree angle of vertical dispersion.

However, if the 12-inch cone is to be mated with a 90-degree (or wider) HF horn and driver, approximately 1 kHz should be the upper limit, as the off-axis response at 45 degrees will be much worse than at 30 degrees. Further, if the cone driver were 15 inches in diameter rather than 12, as is common in many 2-way loudspeakers, its off-axis response would become irregular at even lower frequencies, due to the larger diameter of the cone.

Figure 1: The 12-inch LF driver displays a rapid loss of output and irregular phase response above 1.3 kHz when measured 30 degrees off-axis.

As a general rule, 12-inch cone drivers exhibit approximately a 90-degree conical pattern at 1 kHz, while 15-inch cones exhibit approximately a 60-degree conical pattern at 1 kHz. This is only a general rule because the cone geometry specific to a given model of driver is the determining factor.

The primary point is that measuring the cone driver, both on-axis and off-axis to determine its dispersion versus frequency characteristics, is the first logical step in determining an optimal crossover point.

A Word About Measuring
If your measurement microphone is roughly 1 meter from the loudspeaker, and the loudspeaker is 3 meters (or further) from any barrier surface, the measured results will be reliable in the region of the crossover. Measuring outdoors at greater distances from barrier surfaces (ground, buildings, walls, etc) is ideal for obtaining an accurate response for spec sheets, though not necessary for merely aligning LF, MF and HF drivers that operate in the 500 Hz or higher range.

In this discussion, we’ll factor in the HF driver and HF horn. The HF driver and horn will always have a low-frequency cut-off point that must be respected to avoid rapid driver damage, as well as loss of directivity from the horn. While this frequency is often stated on manufacturer data sheets, it’s far more revealing to look at the response of the HF driver and horn combination on the analyzer.

What you’ll see is some form of response curve that – you hope – stays fairly flat for a reasonable segment of the spectrum, but will suddenly exhibit a rapid roll-off in the low frequencies that does not “come back up” as the spectrum lowers. This is the combined effect of the horn uncoupling, which means it’s no longer seen as a horn by the driver, and the driver being unable to efficiently reproduce frequencies below a certain frequency.

A horn (which is classically stated to be an acoustic transformer) only behaves as a horn within a frequency range that is a function of its dimensions. Make it too long and large – or with a sub-optimal flare rate – and upper HF energy can be cancelled within the throat so that it doesn’t appear at the mouth at all. Make it too short and small, and it will uncouple early, forcing the HF driver to “flap in the breeze” as the driver is no longer acoustically loaded by the impedance of the horn.

Neither condition is desirable, but the latter can lead to early driver damage unless the crossover point is set high enough to avoid damage – but possibly resulting in a gap in response between the LF and the HF. What exactly is the danger point that can cause early driver damage? That’s what we’ll determine by measurement.

Safe For The HF
In Figure 2, the trace shows that the horn rolls off sharply below 800 Hz. Unless we’re actually designing the loudspeaker, we don’t really need to worry about whether the roll-off is due to the horn, the HF driver, or the sum of the acoustical output of the two, as this information is more academic than “must have.”

Figure 2: The HF horn and driver display a fairly uniform frequency response (upper trace) and phase response (lower trace) until the horn uncouples at about 800 Hz (see trace marker). Immediately below 800 Hz, the phase response makes an abrupt shift and the frequency response crashes into the noise floor of the measuring equipment.

What we do need to know is the point below which it’s no longer safe to send energy to the HF driver. This is readily determined by viewing the trace; the crossover should be set so that its response is at least 12 dB down at the “corner” frequency where the HF trace begins its rapid descent.
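For a feel of what “12 dB down at the corner” implies, the sketch below works it out for an idealized Linkwitz-Riley high-pass filter, whose magnitude is 1 / (1 + (fc/f)^n) for an even order n. The ideal textbook response is an assumption here; a real processor’s filters and the driver’s own roll-off will shift the numbers, so the result is a starting point to be confirmed on the analyzer. Function names are illustrative.

```python
import math


def lr_highpass_db(f, fc, slope_db_per_octave=48):
    """Response in dB of an ideal Linkwitz-Riley high-pass at frequency f."""
    n = slope_db_per_octave // 6  # filter order, e.g. 8 for 48 dB/octave
    return -20.0 * math.log10(1.0 + (fc / f) ** n)


def lowest_crossover(corner_hz, required_db=12.0, slope_db_per_octave=48):
    """Lowest crossover frequency that leaves the filter at least
    required_db down at the driver/horn corner frequency."""
    n = slope_db_per_octave // 6
    ratio = (10.0 ** (required_db / 20.0) - 1.0) ** (1.0 / n)
    return corner_hz * ratio


corner = 800.0  # corner frequency read from the HF trace, in Hz
for slope in (24, 48):
    fc = lowest_crossover(corner, slope_db_per_octave=slope)
    check = lr_highpass_db(corner, fc, slope)
    print(slope, "dB/oct: crossover >=", round(fc), "Hz,",
          round(check, 1), "dB at", round(corner), "Hz")
```

Under these assumptions a 48 dB-per-octave filter can sit noticeably closer to the 800 Hz corner than a 24 dB-per-octave filter while giving the HF driver the same protection.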

We also want to look at the relationship of the LF driver’s high-frequency roll-off to that of the HF driver’s low-frequency roll-off. In all normal cases, the LF cone driver will roll off gradually as the frequency increases, while the HF driver/horn combination will roll off very rapidly as the frequency decreases. If there is not sufficient overlap between the upper region of the LF response and the lower region of the HF response, a dip or gap in the overall system at crossover will result.

While this gap might mean that the system is begging to have a mid-range device added, it’s more likely than not that you’ll need to make do with what you’ve got. This is a legitimate reason to consider applying different crossover slopes to the LF and HF, and sometimes even different filter types, such as Bessel on the LF and Linkwitz-Riley on the HF.

That said, attempts to “gently” sum the LF and HF at the crossover point through the use of asymmetrical slopes, asymmetrical filter types, or even asymmetrical crossover points are rarely successful.

You cannot make a 12- or 15-inch cone driver behave like a 1-inch throat (or 1.4- or 2-inch throat) compression driver and horn combination.

This is often a significant source of confusion for many who work hard to optimize their systems.

The best course of action, in nearly all cases, is to ‘get out’ of the crossover region as rapidly as possible (in respect to frequency), by using high-order crossover slopes. At crossover, a lobe – however small – is generated by the differing characteristics (size, shape, mass, and radiation pattern) of the LF and HF drivers.

This lobe is the result of predictable acoustical cancellations and acoustical summations. The sums and differences will constantly change as the frequency and angle of incidence (the angle to the listener, or the measurement mic) are altered.

When you use an FFT analyzer to measure a loudspeaker on-axis where the woofer and tweeter meet, and then slowly sweep the microphone up and down (the tweeter is atop the woofer in most loudspeakers), you will see how the frequency and phase response vary, usually greatly, in relation to the position of the mic.

You’ll also see that the frequency and phase response at a given angle above the loudspeaker will differ from the frequency and phase response at the same angle below the loudspeaker.

Certain Exceptions
There are certain special cases in which asymmetrical crossovers can be used to advantage. These exceptions are typically found in applications such as mastering labs or studios where only one tightly defined location is occupied by the listener.

In these types of situations, it may be possible to improve the response through the crossover region for that one location by using asymmetrical crossovers.

However, the effect of irregular frequency and phase energy from above and below the loudspeaker’s sweet spot, bouncing off of room surfaces and ultimately reaching the listener by reflection, should be carefully taken into consideration.

While some of these measurable variations may be caused by the loudspeaker enclosure’s geometry, and/or the HF horn geometry, the majority of the variations are the result of the dance of interference-and-summation of the HF and LF sources.

They will never be perfectly matched in arrival time – even when incremental delay is adjusted carefully – because their physical differences, location in the enclosure, size and mass all dictate that their acceleration and deceleration periods cannot be the same through the crossover region. Although the two sources may have radiation patterns that are quite close on paper, they will never be identical in the real world.

By using high-rate crossover slopes, you’ll automatically minimize the portion of the spectrum that’s affected.

Unless other mitigating factors prevail, 48 dB-per-octave should be the starting point.

Even if there is poor summation at, let’s say, a 1 kHz crossover point, the loudspeaker will only be affected at a narrow bandwidth surrounding that center frequency.

Plus, there’s plenty that can be done to improve the summation because we don’t want to settle for even a small affected area, do we?

Especially when it falls at about 1 kHz, which is the upper region of the all-important vocal frequency range, and smack-dab in the middle of the first and second vocal harmonics.
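How much spectrum is “affected” at each slope can be estimated with a little arithmetic. The sketch below assumes ideal Linkwitz-Riley sections and counts the band in which both drivers are within 20 dB of full output; the 20 dB threshold and the function name are arbitrary choices for illustration, not figures from the measurements in this article.

```python
import math


def overlap_octaves(slope_db_per_octave, threshold_db=20.0):
    """Width, in octaves, of the band around the crossover frequency in
    which both ideal Linkwitz-Riley sections are less than threshold_db
    down."""
    n = slope_db_per_octave // 6  # filter order
    # Each section is threshold_db down where 1 + r**n == 10**(threshold_db/20),
    # with r the frequency ratio measured away from the crossover point.
    r = (10.0 ** (threshold_db / 20.0) - 1.0) ** (1.0 / n)
    return 2.0 * math.log2(r)  # the band is symmetric about the crossover


for slope in (12, 24, 48):
    print(slope, "dB/oct:", round(overlap_octaves(slope), 2), "octaves shared")
```

On these assumptions the shared band shrinks from roughly three octaves at 12 dB per octave to well under one octave at 48 dB per octave, which is the practical argument for starting steep.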

Harmonics: Love ‘Em Or Leave ‘Em?
Without harmonics, all fundamental tones would sound much alike, distinguishable only by their pitch and their dynamic characteristics (attack and decay). Sound sources without harmonics would be like cooking pasta for a lifetime with only one flavor of sauce.

The fundamental frequencies of a flute, a vocalist sustaining a note, and a string instrument would be hard to tell apart. It’s the harmonic structure that causes a string to sound different from a woodwind, or a simple organ tone to differ from an all-out Hammond B-3 chord attack (Leslies spinning, of course).

The quality that we perceive in music and speech is the sum total of their fundamentals, their related harmonic structure, and their dynamic characteristics – and, of course, their pitch (or frequency). And so is our assessment of the quality of sound of a given loudspeaker.

So rather than getting incredibly fancy (leave this one to the mathematicians), we’ll simply look at the response of the loudspeaker on the analyzer. Remember to be careful about viewing on-axis and off-axis comparisons whenever trying different crossover points.

By adding small increments of delay to the driver that is farthest forward in the enclosure (usually the cone driver), the two sources can be aligned in the time domain to produce a flat frequency and phase response throughout the crossover region (Figures 3 and 4).  A good target is to try to contain phase shift to no greater than ±30 degrees, though it’s entirely possible to reduce the phase shift to almost zero, which is better still.

Figure 3: The upper trace shows acceptably summed LF and HF in amplitude versus frequency, while the lower trace shows a very poor phase response due to driver misalignment and crossover frequency (crossover is at 1.6 kHz).
Figure 4: Upper trace (frequency response) shows acceptable amplitude versus frequency summing while lower trace (phase) shows problems above crossover frequency (crossover is at 1 kHz).
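A starting value for the delay can be estimated from the physical offset between the two acoustic sources before fine-tuning on the analyzer. The sketch below assumes a speed of sound of 343 m/s and a made-up 10 cm offset; both numbers, and the function names, are illustrative only.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value, roughly 20 degrees C


def alignment_delay_ms(offset_m):
    """Delay that compensates a front-to-back source offset, in milliseconds."""
    return offset_m / SPEED_OF_SOUND_M_S * 1000.0


def phase_error_degrees(delay_ms, freq_hz):
    """Phase shift that an uncorrected time offset produces at one frequency."""
    return 360.0 * freq_hz * delay_ms / 1000.0


offset_m = 0.10        # hypothetical: cone sits 10 cm ahead of the HF diaphragm
crossover_hz = 1000.0

delay = alignment_delay_ms(offset_m)
print("starting delay:", round(delay, 3), "ms")
print("phase error if left uncorrected:",
      round(phase_error_degrees(delay, crossover_hz)), "degrees")

# Time tolerance that corresponds to the +/-30 degree target at crossover.
tolerance_ms = 30.0 / 360.0 / crossover_hz * 1000.0
print("+/-30 degrees at crossover is about +/-", round(tolerance_ms, 3), "ms")
```

At a 1 kHz crossover the ±30 degree target corresponds to less than a tenth of a millisecond, which is why the delay is best adjusted in small increments while watching the phase trace.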

If you’re not able to substantially reduce the phase shift at crossover by adding delay, then it’s time to break out the big guns. The inability to reduce phase shift to an acceptable level by the use of incremental delay indicates that the inherent roll-off of the LF or HF drivers (or both) is introducing a phase response alteration of its (or their) own. In cases like this, introducing an all-pass filter or a “phase” filter at the proper frequency and proper Q may be what’s needed.

To determine which driver needs help with its inherent phase shift, look at the frequency and phase response of each driver separately on the analyzer. Make sure that you’re feeding them a broadband test signal (but not at a level that will cause damage), rather than a conditioned signal in which a crossover filter has already been introduced. You’ll quickly be able to determine what is causing the problem and then you’ll have an idea of how to attack it.

It’s no small feat to derive crossover settings that result in a perfectly flat phase response and frequency response through the crossover on-axis (plan on about a full day of trial and error), but it’s a whole other thing to make sure that your settings hold up off-axis. Be prepared to spend as much as two or three days to work through the process.

In the end you may have to compromise how flat the on-axis response is to improve the off-axis response. Whether or not you decide to do so should depend on how the loudspeaker is intended to be deployed. If 70 percent of the listeners are positioned off-axis, then the answer is a no-brainer.

A Word About Arrays
While arrays complicate almost everything, one aspect that can be relied upon is that multiple cone drivers do not sum well – no matter how they are configured – at frequencies that approach, or are above, their break-up mode.

This fact provides additional rationale for getting out of the shared bandwidth as rapidly as possible. Again, I’d start with 48 dB/octave slopes and here I’ll make another recommendation: Linkwitz-Riley crossover shapes are hard to beat.

You’ll often read about how different crossover topologies (Bessel, Butterworth, Linkwitz-Riley) at various orders (i.e., first order = 6 dB/octave, second order = 12 dB/octave, and so on) possess differing characteristics. Certain topologies will sum in-phase at crossover while others will not. This is interesting, but mostly academic information.

What we’re concerned with is the measured system response which includes both the drivers and the crossover function. And with today’s excellent DSP crossovers, such as the XTA DP548 that was loaned to us for this series of articles, it’s a simple process to try different crossover types and rates so that you can view the results on your analyzer.

Though the idea of using an empirical process (trial and error) may not appeal to those who would prefer to predict the results mathematically, the fact is that even the best predictions are usually off somewhat and, even if perfect, still need to be verified by means of accurate analysis.

Quick Thought On FFT Analyzers
Speed. Yeah baby, that’s it. When you have to wait while an analyzer acquires a trace and then examine it in relation to other traces, the process is slow and tiresome.

Conversely, when you can move the measurement mic up, down, left and right to your heart’s content, and see the results in virtually real time, you have a tool that allows you to accurately characterize what the loudspeaker is really doing – both on and off axis.

Speed is everything. It allows you to try numerous changes to your crossovers and related filters, and then view the results immediately. What used to take days with chart recorders or other “static” means of displaying response curves, can now be compressed into hours or even minutes.

What’s The Breakup Mode, Precious?

Breakup occurs when any driver (LF, MF or HF) transitions from pistonic motion to vibratory motion. When a driver’s movement is primarily that of a piston, that is, when its mass is moving forward and rearward as a single physical unit, it tends to behave linearly. (It “tends” to do this because there’s more to the overall picture than just that. Factors such as how the surround is constructed, how the spider is mechanically loaded, and how linear the magnetic circuit is will all affect the end results.)

Conversely, when said driver stops acting like a piston and begins to vibrate like a wheel that’s grossly out of balance, that’s when it enters the breakup mode. It’s still producing energy that we can roughly characterize as sound, but it’s usually sound that we don’t want to hear.

And just like a wheel that seemingly becomes more and more unbalanced as the RPM increases, the driver “breaks up” more and more as the frequency increases.

The harmonic distortion starts to run away in terms of magnitude – even becoming unrelated to the fundamentals – which makes for some truly awful listening. Moreover, if frequencies below the breakup point are present at the same time as frequencies above the breakup point (a common occurrence in music), then a good dose of intermodulation distortion will be present too. Therefore, it’s a good thing to avoid using any driver in the part of the spectrum in which its breakup mode has occurred. 

Figure 5 shows how a 12-inch cone and a 1.4-inch HF driver coupled to a 90-degree x 40-degree horn were optimized by using the techniques described above. The loudspeaker is an iHP1294 provided by Community Professional. As you can see, the results are a far cry from simply setting the crossover to a recommended value and going to lunch.

Figure 5: Upper trace (frequency response) shows decent amplitude vs. frequency summing while lower trace (phase) shows most problems fixed by adding incremental delay to the LF driver. While the phase response still indicates significant shift, it is smooth and gradual through the crossover region (crossover is at 1 kHz).

Although it takes time to optimize a loudspeaker system, it’s time well spent, and even more so if the loudspeaker will eventually become part of an array or cluster. There’s a good reason for this: if you can’t make a single loudspeaker measure and sound good – or hopefully great! – then it will be close to impossible to optimize an array made up of multiple units of that same loudspeaker.

Ken DeLoria is senior technical editor for Live Sound International and has had a diverse career in pro audio over more than 30 years, including being the founder and owner of Apogee Sound.

Posted by Keith Clark on 02/13 at 08:11 PM

The “New” Optimization: Reconciling A Variety Of Opposing Forces

Let’s explore the implications of leveraging this technology, both the advantages and the pitfalls

The term optimization stems from the French word optimisme, meaning “the greatest good” or “the best.”

Historically, the sound reinforcement industry has employed the term in the context of sound system tuning or alignment, but it was not until recently that the use of the term optimization has taken on a more literal and formalized meaning.

It’s important to be familiar with the concept of numerical optimization (most simply, the determination of input values to obtain a function’s maximum or minimum values) and its applications in live sound reinforcement. This technique is not new, and has been employed for years in other industries (aerospace design, for example).

But as computer processing power increased, so too did the accessibility of these tools to users working in the field with laptops instead of mainframes. Just as acoustical measurement systems that once required racks of equipment can now be transported in a backpack, the benefits of advancement in computer technology have allowed numerical optimization to be accessible outside of the laboratory.

Only recently, however, has optimization found its way into the live sound reinforcement market.

Let’s explore the implications of leveraging this technology, both the advantages and the pitfalls. Though it has the potential to provide greatly improved performance (with much less user effort) from both existing and future systems, it will also require some degree of compromise and acceptance by the user.

What Are We Optimizing?
The objective in almost all cases is to balance a number of performance factors (or target variables, in our optimization problem).

Generally, these might include:

1) Consistency (or variation) of sound pressure level through a defined audience region
2) Absolute SPL in a defined audience region
3) Tonal response or consistency through a defined audience region
4) Absolute SPL outside of a defined audience region
5) Tonal response outside of a defined audience region

Examining this list, it is clear that these factors are not complementary. In fact, several of these are potentially in direct opposition, such as:

• SPL consistency versus absolute SPL in audience area
• SPL consistency in audience versus non-audience areas
• Tonal consistency versus absolute SPL in both audience and non-audience areas

How do we manually reconcile these opposing forces through the current variables, such as number of elements, splay angles, amplifier channel level or equalization, with any reasonable degree of success? It is hopeless to expect that a human will find the best answer unassisted. The number of variables is too great and the resources (time, mainly) too few.

Figure 1: An original illustration conveying the typical input and output parameters of an ‘optimized’ loudspeaker system.

To illustrate this, let us look at an example: We have 10 elements in an array (let’s assume this number for the purposes of demonstration), each with 10 possible splay angles. If the top box remains at 0 degrees, that leaves us with only 9 (boxes) x 10 (possible angles per box) = 90 possibilities!

One might argue that we can narrow this down with experience and intuition, and let’s say that this allows you to cut down the number of iterations by even as much as 75 percent. That’s still ~22 iterations for the splay angles alone! This is prior to any sort of experimentation with gain/equalization shading or trim height selection.

At a rate of one iteration every two minutes, the user has spent the better part of an hour already just figuring out the splay angles. Repeat this for different array lengths (if the exact quantity of enclosures has not been fixed for us) and heights, and it quickly becomes impossible to look at every combination and find the best result. In most cases, the answer is to settle for a result that is less-than-optimal in the interest of time.
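The arithmetic in this example can be sketched as follows. The count of 90 treats each box-and-angle setting as one trial; as an added observation not made in the text, counting every joint combination of angles across the nine adjustable boxes gives a far larger number, which only strengthens the argument. All names here are illustrative.

```python
def splay_search_effort(adjustable_boxes: int = 9,
                        angles_per_box: int = 10,
                        pruned_fraction: float = 0.75,
                        minutes_per_trial: float = 2.0) -> dict:
    """Reproduce the trial counts from the example above."""
    single_settings = adjustable_boxes * angles_per_box        # 9 x 10
    remaining_trials = single_settings * (1.0 - pruned_fraction)
    return {
        "single_settings": single_settings,
        "remaining_trials": remaining_trials,
        "minutes": remaining_trials * minutes_per_trial,
        # Every combination of angles across all adjustable boxes:
        "joint_combinations": angles_per_box ** adjustable_boxes,
    }


effort = splay_search_effort()
print(effort)
```

With the default values this gives 90 single settings, 22.5 trials after pruning, and 45 minutes of work, against one billion joint combinations if the boxes cannot be adjusted independently of one another.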

Fortunately, computers present another option that can yield better results in less time. This is because computers are very good at tackling problems that require a large number of different solutions to be rapidly attempted and the results compared. Because the computer is doing the legwork, for perhaps the first time the user has the convenience of defining the desired result or performance, instead of the mechanism required to achieve it.

What The Process Can Do
Looking at the first list in the previous section, one might notice that all of the typical prediction parameters that we are used to seeing are strangely absent.

Up until now, we’ve avoided talking about any of the factors that are usually manipulated by the user directly when setting up a sound system (assuming an array), such as:

1) Enclosure splay angle
2) Quantity of enclosures
3) Type or configuration of each enclosure
4) Trim height or array position
5) Signal processing or equalization (both in terms of filters chosen, and degree of control or zoning within an array)

In the new optimization process, these are the variables that the computer manipulates in pursuit of the specified target criteria. The user needs only to define the desired end result, and the computer determines the combination of variables that best fulfills the target criteria.

No longer do we need to toil endlessly with the physical parameters of the array (except to the extent that they need to be constrained in some way) to determine how these parameters translate into acoustical performance. With the benefit of computers, we can now work backwards and avoid dealing with the mechanics directly.

Figure 2: A comparison of on- and off-axis frequency response for identical loudspeaker systems, without optimized processing (top) and with (bottom). Only electronic processing (DSP) is varied between the un-optimized and optimized data. Since the time this data was collected, advances in computing power allow optimization of even more complex systems in a shorter time. Credit: Eastern Acoustic Works

So far, we have defined the “new optimization” technique and the benefits that it can provide, both in terms of performance and ease of use. The next challenge is to define the way that the user and system interact. How is the system to understand what performance goals are most important, and their order of priority?

The User Interface
Because we are greatly reducing the users’ involvement in the details of the decision-making process, it is critical that the program be able to accurately understand the priorities.

For the manufacturers (operating under the assumption that the product manufacturer is also creating the user interface), the challenge is to direct the user to provide the core input data in the easiest and most accurate way possible.

As an example: a program that provides open input fields with no limit on precision (i.e. the number of digits after the decimal point) for all parameters would allow the user to place arbitrarily high performance markers in every category. This doesn’t help the program because the system is attempting to balance goals and compromises; it cannot completely fulfill all of the needs simultaneously. There must be trade-offs, and it is the job of the user interface to convey these so that, when the optimization is performed, the results produced are as useful as possible.

But how is the user to understand these trade-offs? The emphasis in the interface must be on helping the user interpret their options quickly, easily and accurately. For example, the program must be able to convey that if you want the maximum amount of attenuation outside of the audience area (to minimize environmental impact and avoid a noise violation, for example), this may be at the cost of some consistency within the audience area.

The question is one of degree and not absolutes; will the user accept +/- 5 dB of variation versus frequency in exchange for a 12 dB broadband reduction outside of the audience area?

What about +/- 8 dB and 16 dB, respectively? Note: these numbers do not represent predicted or measured values and are only used for the purposes of discussion.
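One way to picture such a trade-off is as a weighted score in which the user's relative priorities decide which option wins. The sketch below uses the two discussion-only pairs of numbers from above; the linear scoring rule, the weight names, and the function names are assumptions made for illustration and do not describe how any particular product works.

```python
# Discussion-only options from the text: variation inside the audience
# area versus broadband attenuation outside of it.
OPTIONS = {
    "A": {"variation_db": 5.0, "attenuation_db": 12.0},
    "B": {"variation_db": 8.0, "attenuation_db": 16.0},
}


def normalize(weights: dict) -> dict:
    """Scale priority weights so they sum to 1 (unitless, relative)."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


def score(option: dict, weights: dict) -> float:
    """Hypothetical linear score: reward attenuation, penalize variation."""
    w = normalize(weights)
    return (w["attenuation"] * option["attenuation_db"]
            - w["consistency"] * option["variation_db"])


def best_option(weights: dict) -> str:
    return max(OPTIONS, key=lambda name: score(OPTIONS[name], weights))


print(best_option({"consistency": 3, "attenuation": 1}))  # consistency matters most
print(best_option({"consistency": 1, "attenuation": 3}))  # attenuation matters most
```

Under this toy scoring rule, a user who weights consistency three times as heavily as attenuation ends up with option A, while reversing the weights selects option B. The point is only that relative priorities, not absolute targets, determine the outcome.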

Some manufacturers have picked up on these ideas already.

For example, one product utilizes software “sliders” to manipulate the target parameters, and makes them ‘interactive’ whereby the setting of one parameter constrains the values of the others.

Setting Priorities
In some cases, it may be useful to ask the user for absolute values (i.e., consistency of SPL within the audience area) but even these must be restricted in some way (for example, it is not possible to achieve +/- 0.00 dB variation for all frequencies with a finite-size source).

More often than not, what’s most useful is an understanding of relative priority; normalized, unitless markers of what the user considers most important.

Think of it as sending a friend to an unfamiliar market to do your shopping for you.

A typical dialogue might go like this:

Person 1: Hey, can you pick some milk up from the market for this recipe?

Person 2: Sure, how much should I get?

Person 1: As much as you can for Y dollars.

Person 2: As much as I can? You don’t care about the quality?

Person 1: Well, actually I’d prefer organic milk.

Person 2: So as much organic milk as I can get for Y dollars. What percent fat?

Person 1: Oh, right. It should be T% fat. But it’s most important that it doesn’t cost more than $Y, and is organic. Just make sure you get at least Z gallons, since that’s what the recipe I’m making calls for. Well, maybe you can go a dollar or two over but only if you get the better milk for that.

Person 2: OK, is there anything I should avoid?

Person 1: Just don’t get Brand Q – that was terrible last time we had it. If only Brand Q is available for Y dollars, get Brand H, even if it’s more expensive by 1 or 2 dollars. But if it’s more than Y+1 or 2 dollars, it’s OK to get non-organic.

Person 2 (confusedly): OK, I’ll try to remember all of that…

This may seem like an absurd example of an optimization problem, but it is not so unusual. In fact, the number of variables in this example (cost, volume, brand, quality and percentage fat) is generally far smaller than the number a sound system prediction tool has to handle.

Yet with a similar amount of input from the user, the program needs to determine the best result in the least time. This means that the user input must be highly targeted, to steer operators to accurately characterize their goals, but without asking for too much absolute specificity unless actually required (i.e., to satisfy the noise police, who are holding calibrated absolute SPL meters).

Thus there is a particularly significant emphasis on the user interface, as it is the user’s only means to access the optimization system. If the system is unable to accurately characterize the user’s priorities, the result will not be consistent with the user’s goals and will require the input parameters to be slightly adjusted and tried again (these iterations on top of the iterations that the optimization routine performs internally). But the patience that users in the working world will have for this is limited.

On the opposite side of the coin, an interface that asks the user for excessive detail only bogs both user and computer down, offsetting the advantages and convenience that an optimization offers in the first place (and likely not providing any better results) and establishing highly specific expectations that may or may not be achievable.

The interface must walk a fine balance; it is unquestionably the lynchpin of this technology.

What This Means Today
Having looked at the input and output parameters in the optimization model and the particular emphasis that this places on the user interface, perhaps it’s a good time to step back and identify what all of this means from the perspective of the end-user.

On a day-to-day basis, what does this really change when it comes to setting up sound reinforcement systems? The answer is not necessarily simple: in some ways, a lot, and in some ways, very little.

To start with, users wishing to fully leverage the benefits of numerical optimization must dispense with the “point-and-shoot” mentality, and an examination of Figure 3 (which illustrates the magnitude and phase response per array element, by frequency) should sufficiently prove this.

Practically speaking, there is no way to manually interpret or validate the output data, especially when it combines mechanical (i.e. splay angle and trim height) and electronic (i.e., signal processing per zone or array element) components. Rest assured, this makes the operators no less in control or aware of the workings of their system; instead, it frees them to think about the higher-level priorities. But clearly, the nature of the operator’s involvement must change.

Figure 3: An example of the complex processing (DSP) output of an ‘optimized’ system. Both magnitude and phase (indicated by color) per frequency (x-axis) are provided for each element (y-axis) in a sample array. Credit: Martin Audio.

A different approach is also necessary in troubleshooting ‘optimized’ systems, because the typical signs of ‘trouble’ (polarity inversions, level shift, drastic equalization, etc.) may very well be required to produce the desired performance result. What this means is that the manufacturers must implement diagnostic systems to check the basic parameters, allowing the user to understand the system’s status on a much higher level without requiring the operator to examine the system on the granular level as in years past.

In other words, the operator can never be left to wonder if all of the drivers have been correctly wired, if any are blown, or if the optimization result has an error – the system must be able to determine this for itself, and report back to the operator on both the diagnosis and prognosis. And equally important, the user must be able to place some trust in this diagnostic system, and the results of the optimization engine.

Next time we will examine a range of possible applications (both present and future) where this technology may lead.

Adam Shulman is a senior consultant with SIA Acoustics (www.siaacoutics.com), a leading acoustical and system design firm with headquarters in New York City. He has more than 10 years of experience in the audio industry as a technical system and acoustical designer, project manager, educator and writer.

Posted by Keith Clark on 02/13 at 07:46 AM

Sunday, February 12, 2012

Line Array High-Frequency Output Capability

Maximum acoustic output as a function of frequency

This is a discussion of line array high-frequency output capability, and what it might mean to you.

The specific quantity we’re looking at is the frequency response at maximum output power for all drivers. This is the maximum acoustic output of the array as a function of frequency. I will call it the power-bandwidth response.

Here’s a question: what power bandwidth must line array elements have in order to deliver a flat frequency response to audience seats? Because of the way array elements work together (or don’t), we can’t assume that the overall array will have a flat frequency response just because the individual boxes do.

In particular, practical applications require the array to be curved to achieve the desired coverage. Whenever a line array is curved, its output falls off in the high frequencies.

For example, in Figure 1 below, from one of my line array modeling programs, we see the frequency response of a progressively curved line array - eight boxes, interbox angles increasing from 0 to 6 degrees - at a distance of 100 feet.

The boxes are assumed to have flat frequency response.

Figure 1: Line array transfer function.

As you can see, the high-frequency roll-off is about 18 dB. This means that in order to deliver a flat frequency response at 250 feet, the output capability of the individual line array boxes must be 18 dB greater at the high-frequency end. Wow.

This is a typical case. Usually, the high-frequency boost needed is between 12 and 20 dB. More highly curved arrays require more boost.
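To put these decibel figures in perspective, they can be converted to power ratios with the standard relationship between decibels and power. The helper name below is illustrative.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to the equivalent power ratio."""
    return 10.0 ** (db / 10.0)


for boost_db in (12.0, 18.0, 20.0):
    ratio = db_to_power_ratio(boost_db)
    print(f"{boost_db:.0f} dB of boost needs about {ratio:.0f} times the power")
```

In other words, an 18 dB boost asks the high-frequency section for roughly 63 times the power it would need for a flat box, and the 12 to 20 dB range corresponds to roughly 16 to 100 times.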

Why?

Line arrays have lower gain at high frequencies because of the way the sound waves combine at the listening position. The vertical pattern of each individual line array box is very wide at low frequencies, but narrower at high frequencies.

Therefore, if you’re listening to a curved array, at low frequencies you’re hearing all the boxes, but at high frequencies you only hear the boxes where you are within the vertical coverage of their horns at a given frequency. Loud bass, quiet treble.
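A very rough way to see the size of this effect is to compare how many boxes contribute at each end of the spectrum. The sketch below assumes an idealized case in which the contributing boxes are identical and sum perfectly in phase, so that N boxes give a gain of 20 log10(N) dB over a single box. Real arrays do not sum this cleanly, so this is a back-of-envelope model and not the calculation behind Figure 1; the function names are illustrative.

```python
import math


def coherent_gain_db(n_boxes: int) -> float:
    """Level gain of n identical, perfectly in-phase sources over one source."""
    return 20.0 * math.log10(n_boxes)


def hf_shortfall_db(boxes_heard_at_lf: int, boxes_heard_at_hf: int) -> float:
    """Difference between LF and HF summation gain at one listening position."""
    return coherent_gain_db(boxes_heard_at_lf) - coherent_gain_db(boxes_heard_at_hf)


# Idealized example: all eight boxes contribute at low frequencies,
# but only one box covers the listener at high frequencies.
print(f"{hf_shortfall_db(8, 1):.1f} dB")
```

For eight boxes against one, this simple model gives a shortfall of about 18 dB, which is in the same range as the roll-off quoted above, although the agreement should not be read as a validation of the model.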

If you’d like to see this principle set forth mathematically, I suggest reading a 2001 AES paper by Dr. Christian Heil, the physicist who crystallized the modern theory of line arrays, and his colleagues Marcel Urban and Paul Bauman. It’s entitled “Wavefront Sculpture Technology,” and is available as a .zip file for free download from the L-Acoustics website.

With such large high-frequency output requirements, it’s a good thing that compression drivers are so efficient. Or are they?

Power Bandwidth Response Limitations
We think of compression drivers as powerful high-frequency engines. In fact, they’re powerful upper midrange engines. All compression drivers roll off on the high end, above what is known as the “mass breakpoint” frequency.

The mass breakpoint frequency is the frequency above which the inertia of the diaphragm can no longer be ignored. It’s usually around 3,500 Hz.

Figure 2 shows the unequalized frequency response of a typical modern compression driver. This is an actual curve (smoothed, of course) for a modern name-brand unit with a 3-inch diaphragm, 1.5-inch throat, and neodymium magnet. The red line is the actual acoustic output of the driver.

Figure 2: Compression driver frequency response.

The magenta line shows the effect of attaching the driver to an older horn with non-constant directivity. In such horns, coverage patterns narrow in the high frequencies, squeezing the diminishing amount of treble into a smaller space. If you’re standing on-axis, this masks the mass-breakpoint effect.

However, in modern line arrays, constant directivity horns are used, and the results more generally resemble the red line.

So what is the power-bandwidth response of a complete line array box? Here’s a typical small line array loudspeaker:

WOOFERS: Two 8-inch, sensitivity 95 dB @ 1 watt/1 meter, 200 watts max input each
TWEETER: One 3-inch diaphragm/1.5-inch throat, 80 watts max input
CROSSOVER: 1,500 Hz

Figure 3 takes into account the crossover, the efficiencies and power handling abilities of the three transducers, and the mutual loading effects of the two woofers to show the estimated power-bandwidth response of the box. The green line is for the woofers, the red for the compression driver, and the blue is the combined overall result.

Figure 3: Typical box power-bandwidth response.

Surprising, isn’t it? Particularly since we said we needed more high-frequency output, not less. And it gets worse.

Let’s make a line array out of these boxes and estimate the power bandwidth response of the whole array. (Figure 4) We choose a prediction location of 100 feet on axis. The red line is the maximum SPL available from the array, given the maximum input conditions described above. It shows that the array has about 20 dB less “horsepower” at 15 kHz than at low frequencies. Yikes.

Figure 4: Typical line array power-bandwidth response.

Note that the red line doesn’t show the low-level frequency response. If the array is operating well below (well below) maximum output, the frequency response can be equalized as flat as we want. However, this will require a large high-frequency boost.

There is nothing wrong with providing such a boost, until a loud high-frequency signal - a cymbal crash, for instance - comes along. When such a signal appears, it will trigger the drive limiter. The limiter will reduce the gain of the entire crossover band.

For our example, this means that everything above 1,500 Hz will be attenuated. Thus, high-amplitude high-frequency signals will modulate the whole high-mid band, including vocals and other instruments. That’s a problem. But how bad of a problem?
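The mechanism can be sketched with an idealized limiter that acts on the whole band above the crossover. The model below is deliberately crude: it is an instantaneous, brick-wall limiter, it takes the loudest source as the band peak instead of summing the sources, and the source names and levels (in dB relative to the limiter threshold) are hypothetical.

```python
def limiter_gain_db(peak_db: float, threshold_db: float) -> float:
    """Gain change (zero or negative, in dB) applied by an idealized limiter."""
    return -max(0.0, peak_db - threshold_db)


def limit_band(levels_db: dict, threshold_db: float) -> dict:
    """Apply one gain reduction to every source sharing the crossover band."""
    gain_db = limiter_gain_db(max(levels_db.values()), threshold_db)
    return {name: level + gain_db for name, level in levels_db.items()}


# Hypothetical content above 1,500 Hz, in dB relative to the threshold.
hf_band = {"vocal": -12.0, "guitar": -15.0, "cymbal crash": 3.0}
print(limit_band(hf_band, threshold_db=0.0))
```

In this example the cymbal crash exceeds the threshold by 3 dB, so the vocal and guitar are each pulled down by 3 dB for as long as the limiter is active, even though neither was anywhere near the limit.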

It would be great if we had some kind of cool loudspeaker technology with vast amounts of high-frequency output, very high fidelity, and low weight, size, and cost. I’ll get back to you on that one…

Factors That Reduce The HF Requirement
We may not need line arrays with a full, high-frequency, power-bandwidth response.

There are three factors that reduce high-frequency demand:

1) The General Nature of Music. The loudness of musical signals is less at high frequencies. This has been confirmed by several studies done over the last 20 years.

Dennis Bohn, founder and technical architect at Rane Corporation, discusses this issue in an electronics application note (http://www.rane.com/note126.html) about analog long-line drivers. He gives a conservative rule of thumb for high-frequency signal level as flat to 5 kHz, then falling at 6 dB per octave above that.
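That rule of thumb is easy to express as a function of frequency. The sketch below encodes only the rule as stated here (flat to 5 kHz, then 6 dB per octave down); the function and parameter names are illustrative.

```python
import math


def max_program_level_db(freq_hz: float,
                         corner_hz: float = 5000.0,
                         slope_db_per_octave: float = 6.0) -> float:
    """Maximum signal level relative to the flat region, per the rule of thumb."""
    if freq_hz <= corner_hz:
        return 0.0
    octaves_above_corner = math.log2(freq_hz / corner_hz)
    return -slope_db_per_octave * octaves_above_corner


for f in (1000.0, 5000.0, 10000.0, 20000.0):
    print(f"{f:>7.0f} Hz: {max_program_level_db(f):6.1f} dB")
```

By this estimate the maximum program level is 6 dB down at 10 kHz and 12 dB down at 20 kHz, which partly offsets the high-frequency shortfall of the array.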

I will add something to his picture: Peaks. The fact is that a 20-dB headroom for peaks has been the industry standard since the first mixers for broadcast in the 1920s and 30s.

Hence the +4 dBu nominal and +24 dBu maximum level for most pro audio gear. The “10 dB rule” is probably a rule of thumb for rock ‘n’ roll road dogs using limiters to make it as loud as possible.
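For reference, the headroom between those two levels can be checked numerically. The only fact used beyond the text is the standard dBu reference of about 0.775 volts RMS (the voltage that dissipates 1 milliwatt in 600 ohms); the function name is illustrative.

```python
import math

# 0 dBu reference: the RMS voltage that dissipates 1 mW in a 600-ohm load.
DBU_REFERENCE_VOLTS = math.sqrt(0.001 * 600.0)


def dbu_to_volts(level_dbu: float) -> float:
    """Convert a level in dBu to RMS volts."""
    return DBU_REFERENCE_VOLTS * 10.0 ** (level_dbu / 20.0)


nominal_v = dbu_to_volts(4.0)
maximum_v = dbu_to_volts(24.0)
print(f"nominal: {nominal_v:.2f} V, maximum: {maximum_v:.2f} V, "
      f"ratio: {maximum_v / nominal_v:.1f}x")
```

The 20 dB gap between +4 dBu and +24 dBu corresponds to a tenfold difference in voltage, from a little over 1.2 volts to a little over 12 volts RMS.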

Over the years, I’ve seen tons of shows where the signal above, say, 7 kHz, has very big peaks. If the sound system is properly set up, its limiters will kick in when it runs out of steam, and the high-frequency peaks will be removed with no damage to the loudspeakers and only a moderate negative effect on the sound.

If the system does have enough high-frequency headroom for limiting not to occur, it will sound airier and more impactful on the high end.

Figure 5 offers two curves - mine and Dennis’s. My curve has more allowance for peaks.

Figure 5: Maximum spectral amplitude of music.

2) Style Of Program Material. Some research studies have shown that maximum high-frequency level depends on the type of program. For musical programs, the usual results have shown that classical music, jazz, and pop have reduced high-frequency requirements.

Rock, however, has a relatively flat maximum output spectrum, and therefore requires a power bandwidth response that remains strong all the way to the top.

3) Compensating Effects Of Human Hearing. In reverberant environments, where the listener is immersed in a sea of sound coming from all directions, the sound waves diffract around the human head in such a way as to produce a net boost of about 9 dB with a peak around 8 kHz.

The human hearing transfer function is described well in a classic Audio Engineering Society (AES) paper by the late Robert B. Schulein of Shure Brothers. Well-written and easy to read, the paper is called “In Situ Measurement and Equalization of Sound Reproduction Systems.”

Published in April 1975, it describes a series of experiments that Schulein did to determine the reasons for the so-called “house curve” (a.k.a., “X-curve”), a 3 dB per octave roll-off above about 1,500 Hz that is traditionally applied to cinema and auditorium sound systems.

The paper identifies the role of head diffraction in determining the listening experience in reverberant environments. It’s an important piece, and I think that everyone involved in designing and tuning large sound systems should read it. It’s available for download ($5 for AES members, $20 otherwise) through the AES website.

Figure 6 illustrates the curve that Schulein derived. It shows the transfer function of the ear when the head is immersed in a pure reverberant sound field, where the sound is coming equally from all directions. Such fields are found at most of the seats in an arena concert.

Figure 6: Transfer function of ears in reverberant field.

As you can see, the ear helps the tweeters out quite a bit in reverberant environments. On the other hand, in non-reverberant (anechoic) environments - outdoors, for example - this curve does not apply, and the tweeters have to do all their own work.

Effective Power-Bandwidth Response
To understand the array’s performance in an actual gig, we combine the ear’s transfer function with the array’s power bandwidth response. The result is a curve that I will call the Effective Power-Bandwidth Response (EPBR).

The EPBR is the perceived maximum sound pressure that the array can deliver. “Perceived” means that the curve is adjusted for the ear’s transfer function.

As we’ve seen, the ear’s transfer function is different in reverberant and non-reverberant venues, so we have to handle those two cases separately. None of these curves includes any effects of air attenuation. If air attenuation were to be included, it would make things look even worse!

Figure 7 shows the EPBR for reverberant (indoor) and nonreverberant (outdoor) venues.  Did you ever hear a line array in an arena gig and wonder why it sounded pretty good, but only up to about 7-8 kHz? I certainly have. Now we can start to see why.

Figure 7: Effective power-bandwidth response in reverberant venue.

As you can see, for outdoor shows the problem is severe. Who hasn’t been to an outdoor show where everybody was blaming the wind for the lack of high end, or the heat, or the humidity? Clearly, they’re not the only issue.

Effective Headroom
To see how the EPBR affects a particular show, we must compare it with the spectral content of the audio program.

I define the term Effective Headroom (EH) as the difference between the EPBR and the spectral amplitude curve of the program material.
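Since all of these quantities are expressed in decibels, the two definitions reduce to simple addition and subtraction at each frequency. The sketch below shows the bookkeeping only; every number in it is a made-up placeholder (relative to the value at 1 kHz), not data read from the figures, and the function names are illustrative.

```python
def epbr_db(array_power_bandwidth_db: list, ear_transfer_db: list) -> list:
    """Effective Power-Bandwidth Response: array capability plus ear response."""
    return [a + e for a, e in zip(array_power_bandwidth_db, ear_transfer_db)]


def effective_headroom_db(epbr: list, program_spectrum_db: list) -> list:
    """Effective Headroom: EPBR minus the program's spectral amplitude."""
    return [e - p for e, p in zip(epbr, program_spectrum_db)]


# Placeholder values in dB relative to 1 kHz (hypothetical, for illustration).
freqs_hz = [1000, 2000, 4000, 8000, 16000]
array_db = [0.0, -2.0, -6.0, -12.0, -20.0]    # array power-bandwidth response
ear_indoor_db = [0.0, 2.0, 5.0, 9.0, 4.0]     # reverberant-field ear response
ear_outdoor_db = [0.0] * len(freqs_hz)        # no help from the ear outdoors
program_db = [0.0, 0.0, 0.0, -4.0, -10.0]     # program spectral amplitude

indoor = effective_headroom_db(epbr_db(array_db, ear_indoor_db), program_db)
outdoor = effective_headroom_db(epbr_db(array_db, ear_outdoor_db), program_db)
for f, i, o in zip(freqs_hz, indoor, outdoor):
    print(f"{f:>6} Hz  indoor {i:6.1f} dB  outdoor {o:6.1f} dB")
```

Negative values mark the frequencies at which the array runs out of capability before the program does. Swapping the ear response between the reverberant and non-reverberant cases is all it takes to move from the indoor estimate to the outdoor one.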

As we saw, the estimated spectral content curve depends on the type of program material and on who’s doing the estimating.

Indoors
The best-case scenarios use Dennis Bohn’s spectral content estimate, the worst-case ones use mine.

Here they are in Figure 8 for reverberant venues. This isn’t too bad—best-case headroom is only 5 dB down at 20 kHz.

Figure 8: Effective headroom in reverberant (indoor) venue.

If it’s well run, the rig would lose openness and airiness at high levels, but otherwise would probably sound OK. However, an intense high-SPL high frequency experience would not be available.

Outdoors
Figure 9 shows EH for non-reverberant venues. These curves say that things will sound quite dull whenever the rig is working hard.

Figure 9: Effective headroom in non-reverberant (outdoor) venue.

What Does It All Mean?
Most of today’s line arrays are using an awful lot of their compression driver muscle to reproduce high-frequency harmonics.

If the act is loud, this can often mean that high-frequency transients send the sound system into limiting, with less than perfect sonic results.

Mix engineers can (and should) control this phenomenon by adding compression and/or limiting to signals from high-frequency-rich instruments. Sometimes that happens, sometimes not.

If you want a lot of air and high-frequency impact at high levels, you’re going to need a larger line array than you might have otherwise thought. And if you’re outdoors, you’re going to need an even bigger one. The interesting detail here is that the scarce resource isn’t bass; it’s treble.

Today, most manufacturers are aware of these issues, and are fielding line array boxes with greater and greater high-frequency output. Compression drivers are getting better. The best 3-way systems with compression drivers now have very good fidelity in the very high frequency ranges, but the power bandwidth is still considerably below ideal.

Tweeters increase high-frequency output and fidelity, but the problem of incorporating them seamlessly into line array box design is a difficult one because of the small wavelengths involved. All in all, there’s still some distance to go.

What Can You Do About It Right Now?
Here are a few suggestions for fielding line array systems with optimum high-frequency performance:

1) Evaluate vertical power density in the 6,000 to 15,000 Hz band. By “vertical power density,” I mean output per vertical foot of array height. A very rough way to do this is to look at the size and number of compression drivers per vertical foot.

In this approach, you can count a 1.5-inch throat driver as approximately equivalent to 2.25 1-inch throat drivers. Warning: this is a very approximate technique.
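The 2.25 figure matches the ratio of throat areas, since (1.5 / 1)² = 2.25; reading it that way is an interpretation on my part and is not stated in the text. With that assumption, the rough comparison can be sketched as follows. The function names and the example array are hypothetical.

```python
def one_inch_equivalents(throat_diameter_in: float, driver_count: int = 1) -> float:
    """Express compression drivers as an equivalent number of 1-inch throat units.

    Assumes equivalence scales with throat area, i.e. with diameter squared.
    """
    return driver_count * throat_diameter_in ** 2


def vertical_power_density(throat_diameter_in: float,
                           driver_count: int,
                           array_height_ft: float) -> float:
    """Equivalent 1-inch throat drivers per vertical foot of array."""
    return one_inch_equivalents(throat_diameter_in, driver_count) / array_height_ft


# Hypothetical array: eight boxes, each 1 foot tall with one 1.5-inch throat driver.
print(vertical_power_density(1.5, driver_count=8, array_height_ft=8.0))
```

The resulting number is only useful for comparing candidate arrays against one another on the same basis, in keeping with the warning above.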

Also, take a look at the midrange crossover frequency of the compression driver. If it’s on the low side, then there will be a lot of competition for the driver’s output power, and drive limiting and intermodulation distortion will tend to be more prevalent. For typical 1.5-inch throat compression drivers, I would consider any crossover frequency below 1,200 Hz to be low; for 1-inch throat drivers, the frequency is higher - 1,500 to 2,200 Hz - depending on construction.

If the product has separate tweeters, the mid-range drivers will not be struggling to reproduce very high frequencies, and the mid-range crossover frequency can safely be lower.

2) Because of the line array transfer function shown in Figure 1, you will need to have an overall system equalization curve that I call the “ski hill” - a long rising ramp that ends up 12 to 18 dB higher (!) at 10 kHz than at 100 Hz.

Some of this equalization may come from the boxes themselves - many line array boxes are designed with a rising frequency response - but in almost every case, you will still need a pre-crossover drive equalizer capable of creating a long gentle slope with no phase problems.

Three to four sections of parametric EQ will do it. A graphic equalizer might do it, if you use one that’s good at band-to-band summation (there are a few of these). If you had a 1960s hi-fi amp with bass and treble controls, that would work, too.
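To see how gentle the “ski hill” really is, the total boost can be spread over the span from 100 Hz to 10 kHz. The sketch below assumes a straight ramp on a logarithmic frequency axis, which is a simplification of whatever shape a real equalizer would produce; the function names are illustrative.

```python
import math


def ski_hill_slope_db_per_octave(total_boost_db: float,
                                 f_low_hz: float = 100.0,
                                 f_high_hz: float = 10000.0) -> float:
    """Average slope of a ramp that rises total_boost_db between two frequencies."""
    return total_boost_db / math.log2(f_high_hz / f_low_hz)


def ski_hill_gain_db(freq_hz: float, total_boost_db: float,
                     f_low_hz: float = 100.0, f_high_hz: float = 10000.0) -> float:
    """Gain of the assumed straight-line ramp, held flat outside its end points."""
    clamped_hz = min(max(freq_hz, f_low_hz), f_high_hz)
    slope = ski_hill_slope_db_per_octave(total_boost_db, f_low_hz, f_high_hz)
    return slope * math.log2(clamped_hz / f_low_hz)


for boost_db in (12.0, 18.0):
    slope = ski_hill_slope_db_per_octave(boost_db)
    print(f"{boost_db:.0f} dB total: {slope:.2f} dB per octave, "
          f"{ski_hill_gain_db(1000.0, boost_db):.1f} dB at 1 kHz")
```

Spread over roughly 6.6 octaves, a 12 to 18 dB rise works out to about 1.8 to 2.7 dB per octave, which is why a few broad filter sections are enough to approximate it.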

In addition, think carefully about high-frequency limiting. If you’re familiar with multi-band limiters, they can be a great help, but they’re tricky to set up, and good devices are expensive. If you’re an advanced user of one of the reconfigurable multi-channel DSP processor products, you might be able to set it up as a crossover with multiband output limiting.

3) Use the longest line arrays you can. There are several reasons for this:

—Longer line arrays tend to be less curved. The less a line array is curved, the less it attenuates the high frequencies (i.e., the flatter the array frequency response curve is). Also, smaller interbox angles usually lead to smoother high-frequency response, so that the treble you do get sounds better.

—Longer line arrays means more compression drivers, which means there is simply more high-frequency “muscle” in the room.

—Longer line arrays are much better-behaved in the mid-bass. This point has nothing to do with the subject of this article, but it’s such an important principle that I didn’t want it to go overlooked.

—If the mix doesn’t appear to have much control over high-amplitude high-frequency transients, consider having a discussion with the mix engineer (and if you’re the mix engineer, have a discussion with yourself!) to see whether limiting and/or compression could be inserted in the relevant mix channels.

In Closing
The array frequency response problems we’ve described here are not unique to line arrays. Whenever you hang up any group of loudspeaker boxes, they will always tend to have higher summed output level at low frequencies. The line array problem is, however, more straightforward to predict and deal with.

The future will bring improved transducer technology to provide the output we need to provide a full, high-level, very-high-frequency listening experience out to a distance of 200 to 250 feet, with an acceptable, although lower, experience to 300 feet.

Beyond that distance, air propagation losses are very significant, and additional delay clusters will continue to be required for full fidelity.

Jeff Berryman served as the director of Jasonaudio, a touring sound company based in Canada, and is now a senior scientist with Electro-Voice.

Posted by Keith Clark on 02/12 at 11:39 AM

Friday, February 10, 2012

Church Sound Files: The Fallacy Of A “Flat” System

Flat is great for home stereos, headphones and interstate highways, but sometimes not so much for church sound systems

On more than one occasion, I’ve been called to a church to inspect the facility for the purpose of designing a new sound system. Upon arrival, I discover that the church has perfectly adequate audio components that have been tuned and balanced perfectly inadequately.

It never fails - when I inform the owner that the system is fine and simply needs to be properly tuned, I get the response that the church has just paid somebody several hundred dollars to tune the system with pink noise and a computer, and thus, it is as close to “flat” as possible.

But these clients have been sold a fallacy: flat is always good, and flat is what you want.

Not so fast, and here’s why. About 40 years ago, home stereo systems began to improve exponentially. The ability of loudspeakers to exactly reproduce what was happening in the studio recording became extremely good, and providers of high-end stereo system equipment began bragging that the loudspeakers had nearly perfectly flat response. They were indicating to the potential buyer that the loudspeakers were going to exactly reproduce what the studio engineer had worked so diligently to create in the recording studio.

Without question, a stereo system with a perfectly flat response over the entire audio spectrum is indeed a wonderful thing to experience. So why is “flat” not the end-all and be-all?

As previously mentioned, the recording and mastering engineers go to great lengths to EQ every single track on a project to sonic perfection. These guys are true professionals at producing breathtaking audio, and they don’t want your home loudspeakers messing with their art. Therefore, a perfectly flat loudspeaker response should in theory reproduce a sonically perfect example of the original work. 

Because many of us enjoy listening to music in the home (and some pursue sonic perfection with great zeal), it has become widely known that flat is good. Flat is desirable. Flat is what we want.

Let’s get back to our church. The technician we speak to about tuning the room tells us that he will use a computer analysis tool to determine which frequencies in the spectrum are deficient, which ones are too prevalent, and which are just right. He’ll reduce the overly prevalent and increase the deficient, until all frequencies in the audio spectrum are represented at the same decibel level. The system response is now flat. Oh good! This Sunday the sound is going to be awesome!!
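The "flat" procedure the technician describes boils down to simple per-band arithmetic. Here is a minimal sketch of that idea, assuming made-up band readings and assuming the target is the average of all bands (a real analyzer and a real technician may choose the target differently):

```python
# Hypothetical band readings in dB SPL, keyed by center frequency in Hz.
# These values are illustrative only, not measurements from any real room.
measured_db = {
    63: 88.0, 125: 91.0, 250: 93.5, 500: 95.0,
    1000: 96.0, 2000: 94.0, 4000: 90.5, 8000: 87.0,
}


def flat_corrections(levels):
    """Return the boost (+) or cut (-) in dB that brings each band to the average level."""
    target = sum(levels.values()) / len(levels)  # assumed target: mean of all bands
    return {freq: target - level for freq, level in levels.items()}


corrections = flat_corrections(measured_db)
for freq, gain in corrections.items():
    print(f"{freq:>5} Hz: {gain:+.1f} dB")
```

Bands that read above the average get a cut and bands below it get a boost, which is exactly what "flat" means here. It says nothing about whether the result sounds good.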

But Sunday comes along and the sound is very thick, muffled and somewhat dull. Disappointment city. Why?

Because flat usually only sounds good if you’re playing a recording through it. Recall that the studio engineers went to great lengths to sonically shape the sound.

If your church is fortunate enough to have a quality console with lots of sweepable EQ on each channel and an engineer that really knows how to listen and how to mix, each individual channel can be tweaked to sound fabulous (just like they do in the studio).

On the other hand, if your console is closer to entry level and your Sunday mix engineer is a volunteer who is helpful and willing but perhaps lacks adequate experience, then it’s time to rethink the “flat” idea.

I own and use a spectrum analyzer, but the work doesn’t start there. It ends there. 

First I get rid of feedback using the system EQ. Then I get rid of lingering overtones, and then I do some tonal shaping (again, using the system EQ). 

How do I do this? By listening first (see my prior article about EQ). When I’ve got the system sounding as good as possible, I then fire up the Real Time Analyzer (RTA) to see what it looks like. I may find a part of the spectrum that is lower than it should be and will tweak it up a bit. Almost always, with very few exceptions, the midrange has to be reduced in relation to the lows and highs because the human ear hears mids more readily, and we need to compensate for that.

Take a listen to a quality recording of a singer you like. Notice how crisp and breathy - yet rich and full - that singer sounds.  It doesn’t take a lot of listening to realize that the magical voice you have grown to love has been carefully EQ’d. We have to do the same thing with our sound systems in order to achieve that musical quality, both in speaking and in singing (and of course in all the other sound sources one can find in a church).

Technicians who tune systems by listening sometimes get a bad rap in the pro audio industry. Computers and analysis programs are wonderful tools that help us in the field to achieve better results. There is, however, no computer that will tell you that if the sound is nasal you should reduce 1 kHz just a bit, or that if the sound is too boomy you should get rid of some level at 100 Hz.

If a technician tells you they tune by ear, ask for some references. You may be very pleasantly surprised. The system will be tuned nowhere near flat, but hopefully it will be crisp and articulate without being piercing, as well as rich, warm and full without being muddy or boomy.

Flat is great for home stereos, headphones and interstate highways. In our church sound systems, sometimes it’s better with some valleys and hills.

Jon Baumgartner is a veteran system designer for Sound Solutions in Eastern Iowa, a pro audio engineering/contracting division of West Music Company. Feel free to e-mail him with your questions.

More Church Sound Basics articles by Jon Baumgartner on PSW:
Stage Monitoring & Keeping Those Performers Smiling
“1,000 Watts” Isn’t Necessarily 1,000 Watts By Some Standards
Graphic Equalization Can Make A World Of Difference
Using Compression To Benefit Overall Sound Quality
Locating Your Loudspeakers & Related Issues
Proper Console Gain Structure, Maximizing Signal-To-Noise Ratio

Posted by Keith Clark on 02/10 at 02:48 PM

Rational Acoustics Introduces Noise Stick II Pink Noise Generator

Rational Acoustics has introduced a new version of its phantom powered pink noise generator.

Called the Noise Stick II, it offers a 25 dB increase in sensitivity from a device that is the same size as the original, and also has a new blue LED on the endcap that provides visual confirmation that the unit is receiving phantom power and is thus operational.

The Noise Stick II is hand-built in the USA exclusively for Rational Acoustics, outfitted with hand-matched transistors and 1-percent tolerance resistors.

A Linear Feedback Shift Register (LFSR) random number generator provides a pseudorandom signal with excellent statistical properties and without any obvious, audible repeats. The output of the random number generator drives a hand-tuned filter bank that delivers a pink spectrum that is nominally flat within +/- 0.5 dB from 30 Hz to 20 kHz on a 1/3-octave analyzer.
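For readers unfamiliar with the technique, here is a minimal sketch of how an LFSR produces a long, repeatable pseudorandom bit stream. The register length, taps and seed below are assumptions chosen for illustration (a common 16-bit textbook configuration). The Noise Stick II's actual register and its analog pink filter are not described in the announcement and are not modeled here.

```python
def lfsr_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (assumed taps: 16, 14, 13, 11)."""
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (feedback << 15)


def period(seed):
    """Count the steps until the register returns to its starting state."""
    state = lfsr_step(seed)
    steps = 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps


def noise_samples(seed, count):
    """Map the register's low bit to +1.0 / -1.0 samples (white noise, not yet pink)."""
    samples = []
    state = seed
    for _ in range(count):
        samples.append(1.0 if state & 1 else -1.0)
        state = lfsr_step(state)
    return samples


SEED = 0xACE1  # any non-zero 16-bit value works; zero would lock the register up
print(period(SEED))  # 65535 for this tap choice, i.e. every non-zero state is visited
```

A longer register stretches the repeat interval well beyond what a listener could notice. The raw output is spectrally white, so a filter stage such as the one mentioned above is still needed to tilt it to pink.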

The housing for Noise Stick II is made from sturdy machined anodized aluminum to protect the electronics from shock and increase its crush strength. Prototype test units were found to work just fine after being dropped from two stories onto a gravel surface and then run over by a car (the company does not recommend this treatment, however).

Specifications:
• Output Impedance: <30 Ohms, Balanced
• Output Connector: Integrated 3-Pin XLR Male
• Output Level: -33 dBu nominal
• Frequency Range: 20 Hz - 20 kHz
• Typical Flatness: +/- 0.5 dB from 30 Hz to 20 kHz
• Power Requirements: 12-48 volt Phantom Power
• Power Consumption: 4 milliamps
• Size: 5.7-in L x 0.78-in W (144.78 mm x 19.8 mm)
• Weight: 1.75 oz. (50 g)
• Case Material: Machined anodized aluminum
• Warranty: One (1) year from date of purchase for defects in materials or workmanship
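
To put the output level spec in more familiar terms, dBu can be converted to volts using the standard reference of 0 dBu = 0.7746 V RMS. This is a quick sketch, and the function name is just an illustrative choice:

```python
import math

# 0 dBu is defined as the voltage that dissipates 1 mW in 600 ohms: sqrt(0.6) ~ 0.7746 V RMS.
DBU_REF_VOLTS = math.sqrt(0.6)


def dbu_to_volts(dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REF_VOLTS * 10 ** (dbu / 20)


print(f"{dbu_to_volts(-33) * 1000:.1f} mV RMS")  # about 17.3 mV
```

That puts the nominal -33 dBu output at roughly 17 mV RMS, which is much closer to microphone level than to line level.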

Rational Acoustics

Posted by Keith Clark on 02/10 at 02:47 PM

Thursday, February 09, 2012

Rational Acoustics Appoints NMK Electronics As Middle East Distributor

Rational Acoustics has appointed NMK Electronics as the exclusive distributor for all Rational Acoustics and Smaart branded products for the Middle East.

NMK’s territory encompasses the United Arab Emirates, Qatar, Oman, Bahrain, Kuwait, Kingdom of Saudi Arabia, Jordan and Lebanon.

Founded in 1984, NMK is a leading distributor of professional audio, video and communication equipment in the Middle East, with a product portfolio including Midas, Klark Teknik, TC Electronic, Dynaudio and Shure.

‘We are excited to have NMK Electronics representing Rational Acoustics and Smaart in the Middle East,’ says Karen Anderson, Rational Acoustics chief operating officer. ‘NMK has a well-earned reputation for technical excellence and a dedication to quality products and customer support. They also understand and share our passion for education, which is vital for the support of a product like Smaart.’

‘Our partnership with Rational Acoustics serves two purposes,’ adds NMK business development manager Chicco Hiranandani. ‘First, we are a professional audio distribution company and many of our clients either use or need the Smaart software to accurately calibrate their PA systems. Second, we place a very high emphasis on training, and Rational Acoustics has a strong reputation for its Smaart training classes. We are excited with the opportunities that this new partnership will bring to the industry.’

NMK assumed distribution of Rational Acoustics and Smaart products effective February 1, 2012 and will be presenting its first Smaart Training classes in the Middle East within the next few months.

Rational Acoustics

Posted by Keith Clark on 02/09 at 09:56 AM

Monday, February 06, 2012

PreSonus Adds New Control Options To StudioLive Mixers

PreSonus has announced new updates to its StudioLive Series digital mixers, including a number of features not found on any other digital mixer from any manufacturer. 

New features include:

QMix. Up to 10 musicians can simultaneously control their PreSonus StudioLive monitor (aux) mixes using an iPhone or iPod touch and PreSonus’ QMix app, a free download from the Apple App Store. QMix/VSL is the only solution that allows multiple users to each control their own aux from separate iPhones.

Smaart Engine Technology. PreSonus has begun incorporating Rational Acoustics Smaart Measurement Technology for sound-system analysis and optimization directly into PreSonus Virtual StudioLive remote-control/editor/librarian software.

With Smaart technology and VSL, you’ll be able to precisely identify nasty feedback frequencies and get your loudspeakers to play nicer with the room, all without having a degree in acoustical engineering.

The first version of VSL to incorporate Smaart technology will be part of PreSonus Universal Control 1.6, which is expected to be available later this spring.

Universal Control 1.5.3 and StudioLive Remote 1.2. Universal Control 1.5.3 features an improved version of Virtual StudioLive that supports the new QMix iPhone app, including QMix permissions (so that each user controls only one specified aux mix) and the ability to name aux buses.

Universal Control 1.5.3 also adds VSL features that work with PreSonus StudioLive Remote 1.2 for iPad to enable SL Remote permissions so that iPad users can only control front-of-house mixer features or a specified aux. Tap tempo has been added to both VSL and StudioLive Remote.

VSL adds the ability to copy and load channels, copy main mix to aux mix (and aux to aux), link channel faders so that they can move together, and make your StudioLive mixer default to Fader Locate Mode once a fader has been adjusted in VSL or in StudioLive Remote for iPad.

PreSonus

Posted by Keith Clark on 02/06 at 08:04 AM