Speech Intelligibility Benefits of FaceTime

Tech Topic | February 2015 Hearing Review

Speech recognition testing in this study showed that streaming speech to the hearing instruments, instead of delivering it acoustically from a phone, significantly improves speech understanding, even when the speech is streamed to only one hearing instrument and background noise is amplified through both hearing instrument microphones. Streaming the speech to both hearing instruments provided a significant additional speech understanding benefit, as did adding visual information through FaceTime together with its extended frequency bandwidth.

Everyone has experienced how difficult it can be to carry out a phone conversation in a noisy situation. For hearing instrument wearers, using the phone can be very challenging even in quiet situations and impossible in the presence of any background noise. While the ability to use the phone may seem of secondary importance, phone use is an important communication medium to which everyone should have access. In fact, phone use has been linked to self-reported quality of life.1

Numerous factors contribute to hearing instrument wearers’ experienced difficulties communicating on the phone. These include presence of background noise, inappropriate or inadequate coupling to the phone, handset positioning difficulties or constraints, and absence of visual cues.

Many hearing instruments can operate in one of three modes for phone conversations: acoustic coupling, inductive coupling (telecoil), and digital wireless coupling.

Acoustic mode. Hearing instruments operating in acoustic mode (holding the phone receiver to the hearing instrument microphone) receive and amplify all sounds surrounding the hearing instrument wearer. Sounds amplified in this mode include the phone’s audio signal—typically conversational speech—as well as ambient sounds.

Telecoil. Hearing instruments operating via inductive coupling (telecoil) receive signals from magnetic fields generated by telecoil-compatible phones. In a telecoil-only program, the hearing instrument microphones are turned off, so ambient sounds are not amplified. In a combined microphone-and-telecoil program, the phone signal is delivered through the inductive coupling while ambient sounds are amplified through the hearing instrument microphones.

Wireless streaming. The third option for phone use is streaming, either via a hearing instrument accessory or directly from an iPhone. When streaming through an accessory, the accessory is paired to a Bluetooth-compatible phone. During a call, the sound streams from the phone to the accessory via Bluetooth, and from the accessory to the hearing instruments via another digital wireless technology.

It is also possible to eliminate the accessory with direct streaming from the iPhone to the hearing instruments. This requires Made for iPhone (MFi) hearing instruments that incorporate specific wireless technology based on radio frequency in the 2.4 GHz band. In the case of ReSound devices, these hearing instruments communicate directly with the iPhone via Bluetooth Smart technology, without the need for any intermediary device or hearing instrument accessory.

As with inductive coupling, streaming from the phone can be done with or without the hearing instrument microphones activated, allowing ambient sounds to be either amplified or removed. Both inductive coupling and streaming improve the signal-to-noise ratio (SNR) by delivering the phone signal (typically speech) directly to the hearing instrument. Some wearers may need the hearing instrument microphones deactivated in order to carry out a phone conversation, whereas others might prefer them activated to remain aware of the environmental sounds around them.

Streaming, whether via an accessory or directly to the MFi hearing instruments, removes the reliance on correct placement of the phone handset. Streaming also allows the listener access to the phone conversation in both ears, which has been shown to provide significant benefit even in the presence of several different noise configurations. This benefit has been attributed to binaural summation (or binaural redundancy), and binaural squelch.2

It is estimated that visual information makes up approximately two-thirds of all communication.3 Most of us are probably unaware of our usage of visual cues in many situations, but become aware of them when in challenging communication situations like in background noise. The absence of visual cues might be the reason why people with normal hearing periodically experience difficulties hearing on the phone when in background noise. The addition of visual cues is also helpful for people with hearing impairment. In fact, those with very severe losses rely as much or more on visual information as on auditory information.4,5

One of the advances that have come with smartphones and tablets is the possibility to carry out video calls. This means that the camera on the smart device can be used to pick up and transmit an animated image of the face of the caller at the same time he or she is talking. In this way, the call recipient can both see and hear the caller.

Many different apps exist that can be downloaded to smart devices for video calling. They require that both the caller and call recipient are using the same app. Assuming that the same benefit of visual cues is available on a video call, this technology could provide an important benefit for hearing instrument users. In particular, for those with severe-to-profound hearing losses, this technology could make the difference between successful phone use and no phone conversation at all.

So, with all these advances within technology, how much benefit can the hearing-impaired individual expect? This article outlines a study conducted in controlled laboratory conditions that aimed at quantifying the levels of end-user benefit that can be expected when using the phone in the presence of background noise.

Study Methods

Figure 1. Average audiogram for right and left ear with range for the 15 test participants (30 ears).

Test participants. A total of 15 individuals (10 male, 5 female) with severe-to-profound hearing impairment participated in the test (Figure 1). Their median age was 77 years (1st quartile: 69; 3rd quartile: 79), and they had a median of 32 years of experience with amplification (1st quartile: 25; 3rd quartile: 39). The test participants’ unaided discrimination scores in noise varied widely. One test participant had an unaided discrimination score of 40% on the right ear and 20% on the left ear, while another had unaided discrimination scores of 100% and 88% on the right and left ears, respectively. A third test participant had an unaided discrimination score of 84% on the right ear and 0% on the left ear. It was not possible to measure the unaided discrimination score in two test participants due to their degree of hearing loss and limitations on the output level of the test.

Test setup. Testing was performed in a sound-treated room with the test participant placed in the middle of a 4-loudspeaker setup (speakers at 0°, 90°, 180°, and 270° azimuth). Speech was presented through 1 of 9 different phone handling options while background noise was presented simultaneously from the four loudspeakers simulating a difficult phone listening situation. When applicable, an iPhone 5s was used for presenting visual information using the Apple FaceTime application (or “app”).

Figure 2. Illustration of the 9 different test conditions.

FaceTime is a videocalling app specifically for Apple products.

The following listening conditions were included in the study (Figure 2):

  • Acoustic Phone (acoustic coupling)
    1) Unilateral
  • Phone Clip+ (Bluetooth coupling)
    2) Unilateral without FaceTime
    3) Unilateral with FaceTime
    4) Bilateral without FaceTime
    5) Bilateral with FaceTime
  • Made for iPhone (MFi) direct streaming from iPhone to hearing aids (Bluetooth coupling)
    6) Unilateral without FaceTime
    7) Unilateral with FaceTime
    8) Bilateral without FaceTime
    9) Bilateral with FaceTime

For the acoustic phone condition, speech was delivered through a landline phone receiver (Geemarc Amplipower 40) coupled to the iPhone 5s headphone jack via a phone patch (Rolls), which allowed the iPhone to play speech through the acoustic phone. Prior to the test, each test participant was helped to find the ideal phone receiver placement by playing a constant signal through the phone receiver. For the streaming conditions, speech was streamed from the iPhone 5s either via the ReSound Unite Phone Clip+ wireless phone accessory or directly to the MFi hearing instruments. The Phone Clip+ was hung around the test participant’s neck using the standard lanyard. The iPhone 5s was positioned at a comfortable viewing distance using a microphone boom.

All participants were fitted with ReSound ENZO super-power behind-the-ear (BTE) hearing instruments bilaterally using their own earmolds. The hearing instrument microphones on both ears were activated in all test conditions. The presentation level for each test condition was calibrated in a 2cc coupler in order to avoid gain differences affecting the speech intelligibility results. The test hearing instruments were fitted according to the ReSound proprietary gain prescription rule Audiogram+, and gain settings were identical for all test conditions.

Materials. Testing was done using the Dantale I Danish audiovisual speech materials. Eight 25-word adult word lists and four 20-word pediatric lists were used. The speech was presented in a constant amplitude-modulated, speech-shaped background noise.6 The test participants’ unaided discrimination score in noise was tested using the four pediatric lists in order to avoid them becoming familiar with the adult word lists. The adult word lists were used for the aided testing.

The speech material presented in the audio-only conditions was band-pass filtered to match the frequency response of a telephone transmission (ie, 300-3400 Hz). The audiovisual material was not band-pass filtered, as the FaceTime audio bandwidth using an Internet network does not have the same limitation as a traditional phone network.
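The band-limiting step can be illustrated with a short Python sketch. This is not the processing used in the study, which the article does not describe; it assumes a single second-order (biquad) band-pass section with the standard constant-peak-gain coefficients, a 44.1 kHz sample rate, and hypothetical function names. A single section has gentle slopes (roughly 3 dB down at the band edges), so it only approximates a telephone channel.

```python
import math


def bandpass_coefficients(low_hz, high_hz, sample_rate):
    """Return normalized biquad (b, a) coefficients for a band-pass filter.

    Assumption: one second-order section with unity gain at the centre
    frequency; a real telephone-band filter would normally be steeper.
    """
    center = math.sqrt(low_hz * high_hz)      # geometric centre of the band
    q = center / (high_hz - low_hz)           # Q derived from the bandwidth
    w0 = 2.0 * math.pi * center / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def bandpass(samples, low_hz=300.0, high_hz=3400.0, sample_rate=44100):
    """Filter a sequence of samples with the biquad difference equation."""
    b, a = bandpass_coefficients(low_hz, high_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def tone_gain(freq_hz, sample_rate=44100, seconds=0.5):
    """Measure the steady-state gain of the filter for a pure tone."""
    n = int(sample_rate * seconds)
    tone = [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    filtered = bandpass(tone, sample_rate=sample_rate)
    half = n // 2                             # skip the start-up transient
    rms_in = math.sqrt(sum(v * v for v in tone[half:]) / (n - half))
    rms_out = math.sqrt(sum(v * v for v in filtered[half:]) / (n - half))
    return rms_out / rms_in
```

In this sketch, a 1000 Hz tone passes almost unchanged, while tones well outside the 300-3400 Hz band (for example 50 Hz or 10 kHz) are clearly attenuated.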

Procedure. Each test participant completed testing in the nine different phone conditions. The order of test conditions for each participant was randomized using a Latin square, and testing was completed in two sessions separated by at least 2 weeks. Each session began with two training rounds, one with the audio-only signal and one with the audiovisual signal. This was done in order to accustom the test participants to both the audio and visual parts of the material.
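A Latin-square ordering of the nine conditions can be sketched in Python as follows. The article does not say how the square was constructed or how its rows were assigned to the 15 participants, so this sketch assumes a simple cyclic square built from a shuffled condition list, with rows reused once all nine have been assigned. The function names and the seed are hypothetical.

```python
import random

# The nine test conditions listed in the article (labels shortened here).
CONDITIONS = [
    "Acoustic phone, unilateral",
    "Phone Clip+, unilateral, audio-only",
    "Phone Clip+, unilateral, FaceTime",
    "Phone Clip+, bilateral, audio-only",
    "Phone Clip+, bilateral, FaceTime",
    "MFi, unilateral, audio-only",
    "MFi, unilateral, FaceTime",
    "MFi, bilateral, audio-only",
    "MFi, bilateral, FaceTime",
]


def cyclic_latin_square(items):
    """Build an n x n square in which each item appears once per row and column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(n_participants, items, seed=None):
    """Give each participant one row of the square as a presentation order.

    Assumption: with more participants than rows, rows are reused in turn.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)                     # randomize which condition leads
    square = cyclic_latin_square(shuffled)
    return [square[p % len(square)] for p in range(n_participants)]


orders = assign_orders(15, CONDITIONS, seed=2015)
```

A cyclic square balances the position of each condition across rows but not which condition precedes which; a design that also balances carry-over effects would need a different construction.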

The aim of the test was to obtain a percentage correct score for each test condition.

Speech and noise were presented at a constant SNR, which was the same across all test conditions for a given participant. Testing was conducted at an SNR of 10, 13, 16, or 20 dB. The SNR at which the participant scored approximately 60% correct with the audio-only signal was chosen. This left room for the participant’s scores in the other phone conditions to both improve and degrade, and thereby made it possible to demonstrate differences among the test conditions (ie, we were interested in differences among the nine test conditions rather than absolute performance).
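The SNR selection rule can be sketched in Python. The article gives only the four candidate SNRs and the 60% target, so the rest is assumed for illustration: the function names, the tie-breaking rule, the 25-word list length applied to the scoring, and the example scores, which are hypothetical and not study data.

```python
CANDIDATE_SNRS_DB = (10, 13, 16, 20)   # SNRs named in the article
TARGET_PERCENT = 60.0                  # approximate audio-only target score


def percent_correct(n_correct, n_words=25):
    """Score one word list as a percentage (adult lists contain 25 words)."""
    return 100.0 * n_correct / n_words


def choose_test_snr(audio_only_scores, target=TARGET_PERCENT):
    """Pick the SNR whose audio-only score lies closest to the target.

    audio_only_scores maps SNR in dB to percent correct.
    Assumption: ties are resolved in favour of the lower (harder) SNR.
    """
    return min(sorted(audio_only_scores),
               key=lambda snr: abs(audio_only_scores[snr] - target))


def noise_level_db(speech_level_db, snr_db):
    """Noise level that yields the requested SNR for a given speech level."""
    return speech_level_db - snr_db


# Hypothetical pilot scores for one participant (illustrative only).
example_scores = {10: 32.0, 13: 48.0, 16: 64.0, 20: 80.0}
chosen = choose_test_snr(example_scores)
```

With these illustrative scores, the 16 dB condition (64% correct) lies closest to the 60% target, so it would be used for all nine conditions for that participant.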

Speech in noise testing, including setting the SNR level, was controlled through a specially designed MATLAB graphical user interface.

Results

The results show the percentage of the speech material that the test participants heard correctly in the different phone conditions.

Figure 3. Average percent correct speech understanding with Acoustic Phone, Phone Clip+, and MFi phone listening conditions, all unilateral conditions.

Unilateral condition. The average percent correct speech understanding with the acoustic phone, Phone Clip+, and MFi phone conditions in a unilateral condition is shown in Figure 3. Tukey’s honest significant difference (HSD) test reveals that the Phone Clip+ and MFi phone listening conditions provide significantly better speech understanding than the acoustic phone. The Phone Clip+ and MFi phone handling conditions on average provide 39% additional speech understanding benefit compared to the acoustic phone condition.

In other words, these severe-to-profound hearing-impaired individuals hear significantly better when the speech material is streamed as opposed to when it is delivered acoustically to the hearing instrument, even when only streamed to one hearing instrument and with the background noise being amplified through both hearing instrument microphones.

Bilateral streaming. Speech understanding over the phone was also tested with speech materials streamed to both hearing instruments using the wireless connectivity offerings included in the study. Streaming the speech to both hearing instruments provided an average additional 10% speech understanding benefit. Application of the HSD reveals that this benefit is significant. This significant bilateral benefit applies to both the Phone Clip+ and MFi audio condition, and in both the audio-only and audiovisual condition.

The participants with severe-to-profound hearing impairment, on average, obtain 10% more speech understanding with the audio signal streamed to both hearing instruments, as opposed to only one instrument, while both hearing instrument microphones amplify the background noise. This bilateral benefit is observed irrespective of whether they receive an audio-only or an audiovisual signal streamed from the phone. The test participant with the highest unaided speech discrimination for both ears (100% and 88% on the right and left ear, respectively) obtained the greatest bilateral benefit in speech understanding. The majority of the test participants achieved a bilateral benefit. Only a few test participants did not demonstrate a bilateral benefit in terms of improved speech understanding, one of them being the test participant with a discrimination score of 84% on the right ear and 0% on the left ear.

Figure 4. Average percent correct speech understanding for all audio and all audiovisual phone listening conditions.

Audio-only vs audiovisual. Figure 4 shows the average percent correct speech understanding for all audio (both Phone Clip+ and MFi and in both the unilateral and bilateral condition) and all audiovisual (both Phone Clip+ and MFi audiovisual and in both the unilateral and bilateral condition) phone listening conditions. Adding visual information through FaceTime and extended frequency bandwidth on average provided an additional benefit of 23%. This significant visual and extended bandwidth benefit applies to all audio conditions, both Phone Clip+ and MFi and in both the unilateral and bilateral condition.

The participants with severe-to-profound hearing impairment, on average, understand 23% more on the phone when the audio signal is supported by visual information and extended frequency bandwidth, which is the case with FaceTime and also applies for all streaming conditions including the Phone Clip+ and MFi and in both the unilateral and bilateral condition.

When looking at individual benefit, it becomes apparent that the test participant with the most profound hearing loss obtained the largest benefit of added visual cues and extended frequency bandwidth, while the test participant with the least severe hearing loss obtained less added benefit from the visual cues and extended frequency bandwidth. More importantly, however, all test participants achieved additional benefit from the visual cues and the extended frequency bandwidth provided through FaceTime.

Figure 5. Average percent correct speech understanding for the unilateral acoustic phone, bilateral audiovisual Phone Clip+ streaming, and bilateral audiovisual direct MFi streaming listening conditions.

Further comparisons. Figure 5 summarizes the average percent correct speech understanding for the acoustic phone, Phone Clip+ audiovisual bilateral, and MFi audiovisual bilateral phone handling conditions. The Phone Clip+ with audio streamed to both hearing instruments and visual information through FaceTime, and MFi with audio streamed to both hearing instruments and visuals through FaceTime, provide an average benefit of 72% in speech understanding on the phone for these severe-to-profound hearing-impaired test participants.

The severe-to-profound hearing-impaired test participants on average obtain a speech understanding benefit of 72% over that of using a regular acoustic phone when they make use of having the audio streamed to both hearing instruments and make use of visual cues and extended frequency bandwidth.

Discussion

Speech recognition testing in this study showed that streaming the speech instead of delivering speech acoustically from a phone improves speech understanding significantly, even when only streamed to one hearing instrument and with background noise being amplified through both hearing instrument microphones.

The reason for this improvement is likely twofold. First, placing the phone to optimize pickup of the phone signal during acoustic phone usage is challenging, and maintaining that placement throughout the testing in this study, or during a phone conversation in real life, is even more challenging. Second, streaming the speech signal to the hearing instrument provides a speech signal with an improved SNR compared to the speech signal delivered through an acoustic phone. This holds even though testing was conducted in a realistic environment with background noise being amplified through both hearing instrument microphones.

Speech understanding over the phone was also tested with speech materials streamed to both hearing instruments using the wireless connectivity offerings included in the study. Streaming the speech to both hearing instruments as opposed to one hearing instrument alone provided a significant additional speech understanding benefit. This finding is in agreement with other research showing a significant benefit when phone speech was transmitted bilaterally during phone usage in the presence of several different noise configurations. This benefit was attributed to binaural summation (or binaural redundancy) and binaural squelch.2

Adding visual information through FaceTime and extended frequency bandwidth provided an additional speech recognition benefit. The availability of visual cues is known to aid in speech understanding.4,7 The extended frequency bandwidth is expected to add benefit, as well.

Summary

The alignment of advances in hearing instrument and cell phone technology available today provides hearing instrument users with exciting new options for improving speech understanding on the phone. The most significant of these is to stream an improved speech signal instead of delivering it acoustically.

Compared to telecoils, streaming has the added advantage of not being dependent on an optimum positioning of the phone handset in relation to the hearing aid. Streaming also allows bilateral reception of the phone signal.

As demonstrated in this study, streaming speech to two hearing instruments instead of one provides improved audibility, probably due to loudness summation. Finally, video calling adds visual cues and extended frequency bandwidth.

The addition of visual cues via FaceTime and extended bandwidth provided improved speech understanding in this study. These options could be beneficial to anyone, regardless of hearing status (and depending on the situation). In particular, they can make a substantial difference for people with severe-to-profound hearing impairment, for whom they may determine whether communicating on the phone is possible at all.

Conclusions

  • Phone Clip+ and MFi phone handling strategies provide a significant speech understanding benefit compared to acoustic phone, even in the unilateral condition.
  • A bilateral phone handling strategy provides a significant speech understanding benefit with both wireless connectivity solutions (Phone Clip+ and MFi) and in both the audio-only and audiovisual condition.
  • A visual phone handling strategy provides an average significant benefit of 23% in speech understanding for both wireless connectivity solutions (Phone Clip+ and MFi) and in both the unilateral and bilateral condition.

References

1. Dalton DS, et al. The impact of hearing loss on quality of life in older adults. Gerontologist. 2003;43(5):661-668.
2. Picou EM, Ricketts TA. Comparison of wireless and acoustic hearing aid-based telephone listening strategies. Ear Hear. 2011;32(2):209-220.
3. Gamble TK, Gamble MW. Interpersonal Communication. Thousand Oaks, Calif: SAGE Publications; 2014.
4. Tilberg I, et al. Audio-visual Speechreading in a group of hearing aid users—The effect of onset age, handicap age, and degree of hearing loss. Scand Audiol. 1996;25:268-272.
5. Erber NP. Auditory-visual perception of speech. J Speech Hear Disord. 1975;40(4):481-492.
6. Elberling C, Ludvigsen C, Lyregaard PE. Dantale: A new Danish speech material. Scand Audiol. 1989;18(3):169-175.
7. Tye-Murray N, Sommers MS, Spehar B. Audiovisual integration and lipreading abilities of older adults with normal and impaired hearing. Ear Hear. 2007;28(5):656-668.

Correspondence can be addressed to cjespersen@gnresound.dk

Citation for this article: Jespersen, CT, Kirkwood, B. Speech Intelligibility Benefits of FaceTime. Hearing Review. 2015;21(2):28.
