Application Note: A Deep Dive Into Smart Loudspeaker Acoustic Measurements

Multitone

As the name implies, a multitone signal consists of a set of sine waves (tones) at discrete frequencies, summed together. The phase of each tone is often adjusted to reduce the signal’s crest factor (the ratio of its peak level to its rms level). To measure frequency response, the tones are usually spaced logarithmically in frequency.
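As a rough illustration of the description above, the following Python sketch builds a logarithmically spaced multitone and tries several random phase sets, keeping the one with the lowest crest factor. The function names, the 31-tone count, the 48 kHz sample rate and the random-search approach are all assumptions made for this example; they are not taken from any analyzer's actual generator.

```python
import numpy as np

def log_spaced_tones(f_lo, f_hi, n_tones, fs, n_samples):
    """Log-spaced tone frequencies, snapped to FFT bin centers (hypothetical helper)."""
    bin_width = fs / n_samples
    targets = np.geomspace(f_lo, f_hi, n_tones)
    # Snapping to bins gives every tone a whole number of cycles in the block.
    bins = np.unique(np.round(targets / bin_width).astype(int))
    return bins * bin_width

def synthesize(freqs, phases, fs, n_samples):
    """Sum of unit-amplitude sine waves with the given phases (radians)."""
    t = np.arange(n_samples) / fs
    return np.sin(2 * np.pi * np.outer(freqs, t) + phases[:, None]).sum(axis=0)

def crest_factor_db(x):
    """Ratio of peak level to rms level, in dB."""
    return 20 * np.log10(np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))

def low_crest_multitone(freqs, fs, n_samples, trials=10, seed=0):
    """Assumed strategy: random phase search, keep the lowest crest factor."""
    rng = np.random.default_rng(seed)
    best_signal, best_cf = None, np.inf
    for _ in range(trials):
        phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
        x = synthesize(freqs, phases, fs, n_samples)
        cf = crest_factor_db(x)
        if cf < best_cf:
            best_signal, best_cf = x, cf
    return best_signal, best_cf

fs, n_samples = 48000, 48000
freqs = log_spaced_tones(20.0, 20000.0, 31, fs, n_samples)

# Worst case for comparison: every tone peaks at the same instant (t = 0).
aligned = synthesize(freqs, np.full(len(freqs), np.pi / 2), fs, n_samples)
signal, cf_db = low_crest_multitone(freqs, fs, n_samples)

print(f"aligned phases: {crest_factor_db(aligned):.1f} dB")
print(f"searched phases: {cf_db:.1f} dB")
```

With all tones peaking together, the crest factor is the square root of twice the number of tones. Randomizing the phases spreads the peaks out in time, which is why phase adjustment is worthwhile.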

In AP audio analyzers, the multitone analysis measurement uses a DSP technique which enables it to measure distortion* and noise simultaneously with frequency response. (*The distortion metric derived from multitone analysis is referred to as “Total Distortion” rather than THD, because there is no way to distinguish between harmonic distortion and intermodulation distortion.)
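The DSP technique used by the analyzer is not described here, but the general idea can be sketched under simple assumptions: if the captured block is synchronous with the stimulus (every tone falls exactly on an FFT bin), then the bins at the tone frequencies give the frequency response, and everything that appears in the remaining bins is distortion or noise. The function name and the single combined "distortion plus noise" ratio below are assumptions for illustration; this sketch does not separate distortion from noise.

```python
import numpy as np

def analyze_multitone(y, tone_bins):
    """Split a synchronous multitone capture into tone and non-tone energy.

    y         : one block of samples in which each tone sits on an FFT bin
    tone_bins : integer FFT bin indices of the stimulus tones
    Returns (tone_amplitudes, residual_ratio), where residual_ratio is the
    rms of all non-tone bins (excluding DC) relative to the rms of the tones.
    """
    n = len(y)
    spectrum = np.fft.rfft(y)
    power = np.abs(spectrum) ** 2

    is_tone = np.zeros(power.size, dtype=bool)
    is_tone[tone_bins] = True

    residual = ~is_tone
    residual[0] = False  # ignore any DC offset

    tone_amplitudes = 2 * np.abs(spectrum[tone_bins]) / n
    residual_ratio = np.sqrt(power[residual].sum() / power[is_tone].sum())
    return tone_amplitudes, residual_ratio

# Synthetic check: two unit tones plus a small product at a third bin.
n = 4800
k = np.arange(n)
y = (np.sin(2 * np.pi * 10 * k / n)
     + np.sin(2 * np.pi * 20 * k / n)
     + 0.01 * np.sin(2 * np.pi * 30 * k / n))
amplitudes, ratio = analyze_multitone(y, [10, 20])
print(amplitudes, 20 * np.log10(ratio))
```

Because a product at a non-tone bin could be a harmonic of one tone or an intermodulation product of several, a bin-based analysis like this one cannot tell the two apart.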

Figure: The APx500 audio analyzer, consisting of measurement software and a Flex Key.

One of the unique features of multitone analysis is that it enables measuring noise in the presence of signal. Advantages of multitone analysis include:

  1. Multitone testing is fast and delivers a large number of audio quality metrics with a single measurement.
  2. Each multitone has a unique “signature,” enabling the analyzer to trigger accurately on the signal when noise or other signals are present.
  3. When a reasonable number of tones are included, the signal becomes different enough from a pure tone that it is unlikely to be blocked by a noise suppression algorithm.

Some disadvantages to multitone analysis are:

  1. Because all the frequencies are stimulated simultaneously, the signal energy level at each frequency is lower than it would be for a pure sine or chirp signal. Stated another way, the crest factor of the signal is much higher than that of a sinusoid.
  2. Similar to a stepped sine, a multitone typically has limited frequency resolution, and an anechoic chamber is required for acoustic tests.
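The level penalty described in the first item can be estimated with a short calculation. As a sketch, assume N equal-amplitude tones of amplitude a at distinct frequencies, a multitone crest factor CF, and a peak limit A_pk (the amplitude a single sine could use within the same limit). These symbols and the example values are assumptions for illustration.

```latex
x_{\mathrm{rms}} = a\sqrt{N/2}, \qquad
\mathrm{CF} = \frac{A_{\mathrm{pk}}}{x_{\mathrm{rms}}}
\;\Longrightarrow\;
a = \frac{A_{\mathrm{pk}}}{\mathrm{CF}}\sqrt{\frac{2}{N}}

20\log_{10}\frac{a}{A_{\mathrm{pk}}}
  = 3.01\ \mathrm{dB} - 10\log_{10}N - \mathrm{CF}_{\mathrm{dB}}

\text{Example: } N = 31,\ \mathrm{CF}_{\mathrm{dB}} = 12\ \mathrm{dB}:
\quad 3.01 - 14.91 - 12.00 \approx -23.9\ \mathrm{dB}
```

As a sanity check, a single sine (N = 1, crest factor 3.01 dB) gives 0 dB, that is, no penalty.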

In a free-field environment, such as an anechoic chamber, multitone analysis works well for testing both the main input path and output path of a smart speaker. One caveat is that on the input side, enough tones should be used that a noise cancellation algorithm will not block the signal. On the output side, from the perspective of the IVA, a multitone signal is just another music track for the device to play.

Transfer Function

Transfer function analysis, sometimes referred to as “dual-channel FFT analysis” or “dynamic signal analysis,” involves stimulating a DUT with a broadband** signal (such as noise, music or speech) and acquiring the output from the DUT. (**We use the term broadband to refer to a signal that contains energy at all frequencies within a certain frequency range.)

The transfer function (i.e., frequency response) is then derived from the input and output signals using a mathematical technique known as the complex Discrete Fourier Transform (DFT). The term “complex” in this context refers to the fact that the frequency response includes both magnitude and phase. A byproduct of the analysis is a result known as the Coherence function – a function with a value between 0 and 1 at each frequency that indicates the degree to which the output signal is coherent with (i.e., related to, or caused by) the input signal.
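A minimal sketch of this kind of analysis is shown below. It averages windowed DFTs of the input and output over overlapping segments, then forms the transfer function as the cross-spectrum divided by the input auto-spectrum (often called the H1 estimator), along with the coherence. The segment length, Hann window, 50% overlap and function name are assumptions for this example, not a description of any particular analyzer's implementation.

```python
import numpy as np

def transfer_function(x, y, fs, nperseg=1024):
    """Estimate the complex frequency response and coherence from x (input) to y (output)."""
    window = np.hanning(nperseg)
    step = nperseg // 2  # 50% overlap (assumed)
    n_bins = nperseg // 2 + 1
    sxx = np.zeros(n_bins)
    syy = np.zeros(n_bins)
    sxy = np.zeros(n_bins, dtype=complex)

    for start in range(0, len(x) - nperseg + 1, step):
        xf = np.fft.rfft(window * x[start:start + nperseg])
        yf = np.fft.rfft(window * y[start:start + nperseg])
        sxx += np.abs(xf) ** 2        # input auto-spectrum
        syy += np.abs(yf) ** 2        # output auto-spectrum
        sxy += np.conj(xf) * yf       # cross-spectrum

    freqs = np.fft.rfftfreq(nperseg, d=1 / fs)
    h = sxy / sxx                                  # magnitude and phase
    coherence = np.abs(sxy) ** 2 / (sxx * syy)     # between 0 and 1
    return freqs, h, coherence

# Synthetic DUT: y[n] = x[n] + 0.5 * x[n-1], driven by white noise.
rng = np.random.default_rng(0)
fs = 48000
x = rng.standard_normal(fs)
y = x + 0.5 * np.concatenate(([0.0], x[:-1]))

freqs, h, coh = transfer_function(x, y, fs)
# Adding unrelated noise to the output lowers the coherence.
_, _, coh_noisy = transfer_function(x, y + rng.standard_normal(fs), fs)

print(f"|H| at {freqs[256]:.0f} Hz: {abs(h[256]):.3f}")
print(f"mean coherence: clean {coh.mean():.3f}, noisy {coh_noisy.mean():.3f}")
```

In the noisy case, part of the output is not caused by the input, so the coherence falls below 1 at the affected frequencies.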

Most analyzers require that you use one of the analyzer input channels to measure the input to the DUT as well as its output. In the APx500, the stimulus signal can be taken from the generator signal or from a file on disk, freeing up an input channel for additional measurements. The measurement has another feature that is especially useful for open loop measurements: the ability to trigger on the signal of interest using a signal matching algorithm based on cross correlation.
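The signal matching algorithm itself is not detailed here, but the basic idea of triggering by cross correlation can be sketched as follows: slide the known stimulus along the recording and pick the offset where the two match best. The function name, the FFT-based implementation and the normalized match score are assumptions for this example.

```python
import numpy as np

def find_stimulus(reference, recording):
    """Locate `reference` inside a longer `recording` by cross correlation.

    Returns (lag, score): the sample offset of the best match and a
    normalized correlation score (1.0 would be a perfect scaled copy).
    """
    n = len(reference) + len(recording) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, avoids wrap-around
    xcorr = np.fft.irfft(
        np.fft.rfft(recording, nfft) * np.conj(np.fft.rfft(reference, nfft)),
        nfft,
    )
    # Only consider offsets where the whole reference fits in the recording.
    max_lag = len(recording) - len(reference)
    lag = int(np.argmax(xcorr[:max_lag + 1]))

    segment = recording[lag:lag + len(reference)]
    score = xcorr[lag] / (np.linalg.norm(reference) * np.linalg.norm(segment))
    return lag, float(score)

# Synthetic open-loop capture: the stimulus arrives 1234 samples into the
# recording, attenuated and buried in a little background noise.
rng = np.random.default_rng(1)
reference = rng.standard_normal(1000)
recording = 0.05 * rng.standard_normal(5000)
recording[1234:2234] += 0.3 * reference

lag, score = find_stimulus(reference, recording)
print(lag, round(score, 3))
```

Once the offset is known, the recording can be aligned with the stimulus file before the transfer function is computed, which is what makes open loop measurements practical.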

The main advantage of transfer function analysis over other frequency response measurement techniques is that it can use any broadband signal. This makes it an excellent choice for testing the input path of a smart speaker, because a speech or speech-like signal can be used as the stimulus, virtually ensuring that the signal will be acquired and processed unaltered by any noise cancellation algorithms.

Transfer function analysis can also be used for the output path of a smart speaker, using music or noise as a stimulus. To avoid reflections, an anechoic chamber should be used when testing either the input path or the output path of a smart speaker.

