Native language and speech intelligibility problems


Several examples of this phoneme disconnect are outlined in Fry’s “Homo Loquens” book:

1) “English uses the difference between /s/ and /sh/ as a way of distinguishing words, so that we can find pairs like save and shave, sin and shin, mass and mash. The phonemic system of Dutch or of Spanish or of a number of Indian languages does not include this distinction and as a consequence native speakers of these languages are quite unable to perceive the difference in English unless they have made a special effort to learn to do so.” (page 15)

2) “Another example is to be found in the final sounds of win and wing which are indistinguishable to a native Italian speaker…. his language [also] contains no pair of words differentiated solely by the presence of either /n/ or /ng/.” (page 15)

3) “The phonemic system in quite a number of Indian languages includes as many as six different t sounds which are all but indistinguishable to the English ear. Among them is a pair which differ from each other in the same way as the t sounds in the two English words tar and star, but this is not a difference that has any function in the English system and we are therefore unaware of its existence.” (pages 15-16)

4) “…the fact that Japanese speakers cannot detect the difference between /r/ and /l/ sounds and cannot make the distinction when talking English. A rather endearing example is that of the Japanese who when making an after-dinner speech in English confessed that he was rather nervous and ‘had butterfries in his stomach’.” (page 72)

From the speaker’s perspective, there is apparently no difference between the two distinct phonemes. As a consequence, he or she feels that the sounds can be used interchangeably.
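The effect described in Fry’s examples can be sketched in a few lines of code. This is only an illustration: the phoneme transcriptions are simplified, and the names `WORDS`, `perceive` and `confusable_pairs` are hypothetical, not taken from “Homo Loquens”. A listener is modeled as a table of merges, where each entry maps a phoneme missing from the listener’s native system onto the category that listener hears instead.

```python
# Simplified, hypothetical phoneme transcriptions (illustration only).
WORDS = {
    "save": ("s", "ey", "v"),
    "shave": ("sh", "ey", "v"),
    "sin": ("s", "ih", "n"),
    "shin": ("sh", "ih", "n"),
    "win": ("w", "ih", "n"),
    "wing": ("w", "ih", "ng"),
}


def perceive(phonemes, merges):
    """Map each phoneme to the category this listener actually hears."""
    return tuple(merges.get(p, p) for p in phonemes)


def confusable_pairs(words, merges):
    """Return sorted pairs of words that sound identical to this listener."""
    heard = {}
    for word, phonemes in words.items():
        heard.setdefault(perceive(phonemes, merges), []).append(word)

    pairs = []
    for group in heard.values():
        group = sorted(group)
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                pairs.append((group[i], group[j]))
    return sorted(pairs)


# Assumed listener whose native system has no /s/ versus /sh/ contrast.
print(confusable_pairs(WORDS, {"sh": "s"}))
# Assumed listener whose native system has no /n/ versus /ng/ contrast.
print(confusable_pairs(WORDS, {"ng": "n"}))
```

With the /s/ and /sh/ merge, save/shave and sin/shin collapse into single perceived words; with the /n/ and /ng/ merge, win/wing does. A listener with no merges hears every word in the list as distinct.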

This same phenomenon can be seen in the early language developmental stages of children speaking their native language. During this developmental stage children cultivate the ability to distinguish and enunciate various phonemes.

Often a child uses dissimilar phonemes interchangeably without distinction. In these situations the child exclusively employs the more easily pronounced phoneme. When the child then hears the phoneme pronounced correctly, they typically insist that this is exactly how they had said it.

At this point they have not developed the ability to distinguish between two dissimilar phonemes. This process is much the same as that of learning a foreign language.

Once, while I was walking through a wooded area with a three-year-old child, he informed me that there was a really big wock (rock) off to the right. Jokingly I responded, “Yes, that is a really big wock.” At that point young William informed me that I obviously had difficulty pronouncing the word, for I had said it incorrectly.

I have also heard the story of a child who requested that an individual “keep quiet because the baby is sweeping (sleeping).” He replied, “Oh, the baby is sweeping?” She looked at him puzzled and stated emphatically, “Not sweeping: sweeping!”

This pattern of language development reinforces the fact that the phoneme system is complex. It is learned gradually and mastered through everyday usage. Because most people have only one primary language, and because phoneme distinctions vary so widely from one language to another, the phonemes of a second language are difficult to interpret.

“People who have a common language have learned to adopt a particular system and moving to another language means acquiring a new and additional system of phonemic organization.” (“Homo Loquens”, page 16)

This can be difficult. It is also unclear whether or not mastering phonemic pronunciation in a language guarantees phonemic comprehension. Perhaps differentiating spoken phonemes is more difficult than actually speaking them. This hypothesis is somewhat supported by Professor Campbell’s experience with fluent non-native English speakers.

Additional factors contribute to word comprehension. Intonation and rhythm can dramatically affect the meaning conveyed by the speaker.

“The various intonations that can be given to a sentence are themselves part of the grammar of the spoken language and the information about the intonation system is another component in the linguistic knowledge stored by the brain.” (Homo Loquens, page 16)

But these variations are typically less language dependent and are fewer in number than the differing phonemes.

“So much emphasis has been placed on the phoneme level of operation because this is where the main ear-work of speech takes place. [intonation and rhythm, while important to comprehension, involve a significantly smaller number of categories]…the English system, for example, functions with six tones and only two rhythmic categories, formed by the strong syllables and the weaker ones.” (Homo Loquens, page 72)

All of this is to say nothing of the tremendous differences in sentence construction between languages, which can add to or detract from one’s ability to achieve comprehension from context. Simple things like adjectives preceding or following nouns can severely obstruct one’s ability to gather meaning.

In essence, there are several logical explanations for the perceived inability of non-native speakers to comprehend a familiar language, particularly when it is spoken in a noisy environment.

Speech Intelligibility Derivations
The goal in developing a good speech transmission system is to determine what conditions are necessary for maximum intelligibility. This intelligibility “… is used to signify the accuracy and ease with which the articulated sounds of speech are recognized.” (Olson, page 495)

The criteria used to determine the effectiveness of this speech transmission system are intelligibility indices that are based upon signal and noise levels over specified bandwidths. The fundamental methods used to determine intelligibility involve “… pronouncing speech sounds into one end of a transmission system and having the observer write the sounds that are heard at the receiving end.” (Olson, page 495)
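The listening test Olson describes reduces to a simple percentage. The sketch below assumes a word-by-word comparison that ignores case and surrounding whitespace; the function name `articulation_score` and the word lists are hypothetical and are not drawn from Olson’s text.

```python
def articulation_score(spoken, written):
    """Percentage of test items the observer wrote down correctly.

    `spoken` is the list of items pronounced into the system and
    `written` is what the observer recorded, in the same order.
    """
    if not spoken:
        raise ValueError("the test list is empty")
    if len(spoken) != len(written):
        raise ValueError("each spoken item needs exactly one response")

    correct = sum(
        s.strip().lower() == w.strip().lower()
        for s, w in zip(spoken, written)
    )
    return 100.0 * correct / len(spoken)


# Made-up responses from a listener who confuses two of five words.
spoken = ["save", "shin", "mass", "wing", "tar"]
written = ["shave", "shin", "mass", "win", "tar"]
print(articulation_score(spoken, written))
```

Here the observer records three of the five words correctly, which gives a score of 60 percent.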

“According to the work of French and Steinberg and of Beranek, if the spectrum levels of speech at a listener’s ear are such that the shaded region [of the accompanying figure, not reproduced here] lies above the threshold of hearing of the listener and above the ambient noise, but below the overload line, all syllables of the speech will be audible to the listener and the speech intelligibility will be nearly perfect. This corresponds to an articulation index of 100 percent….” The percentage articulation index is defined as the ratio (times 100) of the speech area not covered over by [noise, the threshold of hearing, or overload] to the total speech area… (Beranek pp. 408-409)
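The area ratio in this definition can be approximated numerically. The sketch below is not Beranek’s procedure: it assumes the speech area is divided into frequency bands that each contribute equally, that each band is described by the lowest and highest speech levels in decibels, and that the band values shown are invented for illustration. The names `Band` and `articulation_index_percent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Band:
    speech_min_db: float  # lowest speech level in the band
    speech_max_db: float  # highest (peak) speech level in the band
    noise_db: float       # ambient noise level
    threshold_db: float   # listener's threshold of hearing
    overload_db: float    # level at which the ear overloads


def articulation_index_percent(bands):
    """Uncovered speech area divided by total speech area, times 100."""
    total = 0.0
    uncovered = 0.0
    for b in bands:
        # Whichever is higher, noise or hearing threshold, masks the
        # speech region from below; overload clips it from above.
        floor = max(b.speech_min_db, b.noise_db, b.threshold_db)
        ceiling = min(b.speech_max_db, b.overload_db)
        total += b.speech_max_db - b.speech_min_db
        uncovered += max(0.0, ceiling - floor)
    if total <= 0:
        raise ValueError("speech area must be positive")
    return 100.0 * uncovered / total


# Three assumed bands: one clear, one half masked, one fully masked.
bands = [
    Band(40, 70, 30, 10, 100),
    Band(40, 70, 55, 10, 100),
    Band(40, 70, 80, 10, 100),
]
print(articulation_index_percent(bands))
```

In this made-up case 45 dB of the 90 dB total speech area remains uncovered, so the articulation index is 50 percent.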

To calculate these quantities for a theoretical system, the gain of the system, together with the directivity index of the amplification system and the reverberant characteristics of the space, can be used to determine the average, peak and minimum levels of speech and noise in a given space.

The problem here is that the tests used to arrive at these conclusions involved native speakers of English listening to native speakers of English. Beranek and others do recognize the significance of “psychological and linguistic” factors: different native speakers, different word lengths, and trained or untrained listeners all yield dramatically different articulation results and make “absolute predictions of articulation scores … not possible”. The contention nevertheless remains that “one can say that if the calculated articulation index exceeds 60 per cent, a speech-communication system is probably satisfactory.” (Beranek, p. 415)

Much of the basis for additional indices such as STI (a general-purpose speech intelligibility index based upon SNR and reverberation), RASTI (similar to STI but requiring less data) and %ALCons (the percentage of consonants that are lost, that is, not detected clearly; consonant recognition is paramount to comprehension) has evolved from these early studies of speech intelligibility.

More recent social and technological changes require that additional steps be taken to ensure public safety. The original conclusions all rest on the fallacy that the vast majority of speech takes place between native speakers of a common language, and therefore that the resulting indices will suffice for all communication.

This was perhaps true at some point in the past. As technology expands and the world in essence shrinks, people with diverse language histories will frequently come into contact with one another. The simple experiments conducted by Professor Campbell indicate that non-native speaking students in one controlled environment correctly identified fewer than 40 percent of the words. Clearly additional work must be done in this area.


© copyright 2004 ProSoundWeb.com