Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition

Johnson, Eric Martin

2022, Doctor of Philosophy, Ohio State University, Speech and Hearing Science.
The three manuscripts presented here examine concepts related to speech perception in noise and ways to overcome poor speech intelligibility without depriving listeners of environmental sound recognition. Because of hearing-impaired (HI) listeners’ auditory deficits, there is a substantial need for speech-enhancement (noise reduction) technology. Recent advancements in deep learning have resulted in algorithms that significantly improve the intelligibility of speech in noise, but to be suitable for real-world applications such as hearing aids and cochlear implants, these algorithms must be causal, talker independent, corpus independent, and noise independent. Manuscript 1 involves human-subjects testing of a novel, time-domain-based algorithm that fulfills these fundamental requirements. Algorithm processing resulted in significant intelligibility improvements for both HI and normal-hearing (NH) listener groups at each signal-to-noise ratio (SNR) and noise type tested. In Manuscript 2, the range of speech-to-background ratios (SBRs) over which NH and HI listeners can accurately perform both speech recognition and environmental sound recognition was determined. Separate groups of NH listeners were tested in conditions of selective and divided attention. A single group of HI listeners was tested in the divided attention experiment. Psychometric functions were generated for each listener group and task type. It was found that both NH and HI listeners are capable of high speech intelligibility and high environmental sound recognition over a range of SBRs. The range and location of optimal SBRs differed across NH and HI listeners. The optimal SBR also depended on the type of environmental sound present.
Conventional deep-learning algorithms for speech enhancement target maximum intelligibility by removing as much noise as possible while maintaining the essential characteristics of the target speech signal. Manuscript 3 tests a new form of time-frequency masking that is designed to leave a small amount of background noise intact. The purpose of the unremoved background noise is to allow for environmental sound awareness while still providing significantly increased intelligibility. It was found that this type of processing resulted in significantly improved intelligibility and high environmental sound recognition performance for both types of listeners. It was also found that the same level of maximum attenuation provided the optimal balance of intelligibility and environmental sound recognition for both listener types.
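The idea of capping a mask's maximum attenuation can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract does not give the mask definition or the attenuation value, so the oracle ratio mask, the function name `floored_ratio_mask`, and the `max_attenuation_db` parameter below are all assumptions chosen for illustration. A deployed system would estimate the mask with a trained network rather than compute it from the separate speech and noise signals.

```python
import numpy as np

def floored_ratio_mask(speech_mag, noise_mag, max_attenuation_db=10.0):
    """Ratio-style time-frequency mask with a cap on attenuation.

    speech_mag, noise_mag: magnitude spectrograms of equal shape
    (hypothetical oracle inputs; not available in a real device).
    max_attenuation_db: illustrative value, not taken from the source.
    """
    eps = 1e-12  # avoids division by zero in silent time-frequency units
    speech_pow = speech_mag ** 2
    noise_pow = noise_mag ** 2
    # Gain near 1 where speech dominates, near 0 where noise dominates.
    mask = np.sqrt(speech_pow / (speech_pow + noise_pow + eps))
    # Convert the attenuation cap from dB to a linear gain floor.
    floor = 10.0 ** (-max_attenuation_db / 20.0)
    # No unit is attenuated by more than max_attenuation_db, so some
    # background sound always remains in the output.
    return np.maximum(mask, floor)
```

Multiplying the noisy magnitude spectrogram by this mask keeps speech-dominated units nearly unchanged and turns noise-dominated units down by at most the chosen cap. A larger cap removes more background sound, and a smaller one preserves more of it.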
Eric Healy (Advisor)
Rachael Holt (Committee Member)
DeLiang Wang (Committee Member)
168 p.

Recommended Citations

  • Johnson, E. M. (2022). Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1658358399286129

    APA Style (7th edition)

  • Johnson, Eric. Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition. 2022. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1658358399286129.

    MLA Style (8th edition)

  • Johnson, Eric. "Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition." Doctoral dissertation, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu1658358399286129

    Chicago Manual of Style (17th edition)