Search Results

(Total results 19)

  • 1. Leopold, Sarah Factors Influencing the Prediction of Speech Intelligibility

    Doctor of Philosophy, The Ohio State University, 2016, Speech and Hearing Science

    The three manuscripts presented here examine the relative importance of various "critical bands" of speech, as well as their susceptibility to the corrupting influence of background noise. In the first manuscript, band-importance functions derived using a novel technique are compared to the standard functions given by the Speech Intelligibility Index (ANSI, 1997). The functions derived with the novel technique show a complex "microstructure" not present in previous functions, possibly indicating an increased accuracy of the new method. In the second manuscript, this same technique is used to examine the effects of individual talkers and types of speech material on the shape of the band-importance functions. Results indicate a strong effect of speech material, but a smaller effect of talker. In addition, the use of ten talkers of different genders appears to greatly diminish any effect of individual talker. In the third manuscript, the susceptibility to noise of individual critical bands of speech was determined by systematically varying the signal-to-noise ratio in each band. The signal-to-noise ratio that resulted in a criterion decrement in intelligibility for each band was determined. Results from this study indicate that noise susceptibility is not equal across bands, as has been assumed. Further, noise susceptibility appears to be independent of the relative importance of each band. Implications for future applications of these data are discussed.

    Committee: Eric Healy Ph.D. (Advisor); Rachael Frush Holt Ph.D. (Committee Member); DeLiang Wang Ph.D. (Committee Member) Subjects: Audiology
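
For reference, a minimal sketch of the Speech Intelligibility Index structure that band-importance functions feed into: the index is an importance-weighted sum of per-band audibility (the structure of ANSI S3.5-1997). The weights, audibility values, and the `sii` helper below are invented placeholders for illustration, not the functions derived in this dissertation.

```python
# SII-style prediction: importance-weighted sum of per-band audibility.
# Placeholder numbers only; real band-importance functions come from ANSI S3.5
# or from the derivation techniques examined in the dissertation.
import numpy as np

def sii(band_importance, band_audibility):
    """Intelligibility index as the sum of importance-weighted audibility."""
    I = np.asarray(band_importance, dtype=float)
    A = np.clip(np.asarray(band_audibility, dtype=float), 0.0, 1.0)
    assert np.isclose(I.sum(), 1.0), "importance weights should sum to 1"
    return float(np.sum(I * A))

# Five hypothetical bands, mid-frequency bands weighted most heavily.
importance = [0.10, 0.25, 0.30, 0.25, 0.10]
audibility = [1.0, 0.8, 0.5, 0.3, 0.1]  # fraction of speech dynamic range above the noise
print(sii(importance, audibility))       # value between 0 (inaudible) and 1 (fully audible)
```
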
  • 2. Johnson, Eric Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition

    Doctor of Philosophy, The Ohio State University, 2022, Speech and Hearing Science

    The three manuscripts presented here examine concepts related to speech perception in noise and ways to overcome poor speech intelligibility without depriving listeners of environmental sound recognition. Because of hearing-impaired (HI) listeners' auditory deficits, there is a substantial need for speech-enhancement (noise reduction) technology. Recent advancements in deep learning have resulted in algorithms that significantly improve the intelligibility of speech in noise, but in order to be suitable for real-world applications such as hearing aids and cochlear implants, these algorithms must be causal, talker independent, corpus independent, and noise independent. Manuscript 1 involves human-subjects testing of a novel, time-domain-based algorithm that fulfills these fundamental requirements. Algorithm processing resulted in significant intelligibility improvements for both HI and normal-hearing (NH) listener groups in each signal-to-noise ratio (SNR) and noise type tested. In Manuscript 2, the range of speech-to-background ratios (SBRs) over which NH and HI listeners can accurately perform both speech and environmental recognition was determined. Separate groups of NH listeners were tested in conditions of selective and divided attention. A single group of HI listeners was tested in the divided attention experiment. Psychometric functions were generated for each listener group and task type. It was found that both NH and HI listeners are capable of high speech intelligibility and high environmental sound recognition over a range of speech-to-background ratios. The range and location of optimal speech-to-background ratios differed across NH and HI listeners. The optimal speech-to-background ratio also depended on the type of environmental sound present. Conventional deep-learning algorithms for speech enhancement target maximum intelligibility by removing as much noise as possible while maintaining the essential characteristics of the target speech signal (open full item for complete abstract)

    Committee: Eric Healy (Advisor); Rachael Holt (Committee Member); DeLiang Wang (Committee Member) Subjects: Acoustics; Artificial Intelligence; Audiology; Behavioral Sciences; Communication; Computer Engineering; Health Sciences
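
For reference, a minimal sketch (with invented data points) of fitting a logistic psychometric function of recognition score against speech-to-background ratio, the kind of function generated per listener group and task in Manuscript 2. The function form, data, and parameter names are illustrative assumptions, not the dissertation's fitted values.

```python
# Fitting a logistic psychometric function of proportion correct vs. SBR (dB).
# Data points are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

def logistic(sbr, midpoint, slope):
    """Proportion correct as a logistic function of speech-to-background ratio."""
    return 1.0 / (1.0 + np.exp(-slope * (sbr - midpoint)))

sbr_db = np.array([-12, -8, -4, 0, 4, 8], dtype=float)      # hypothetical conditions
p_correct = np.array([0.10, 0.25, 0.55, 0.80, 0.93, 0.98])  # hypothetical scores

(midpoint, slope), _ = curve_fit(logistic, sbr_db, p_correct, p0=[0.0, 0.5])
print(f"50%-correct point ~ {midpoint:.1f} dB SBR, slope ~ {slope:.2f} per dB")
```
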
  • 3. Zhao, Yan Deep learning methods for reverberant and noisy speech enhancement

    Doctor of Philosophy, The Ohio State University, 2020, Computer Science and Engineering

    In daily listening environments, the speech reaching our ears is commonly corrupted by both room reverberation and background noise. These distortions can be detrimental to speech intelligibility and quality, and also pose a serious problem for many speech-related applications, including automatic speech and speaker recognition. The objective of this dissertation is to enhance speech signals distorted by reverberation and noise, to benefit both human communications and human-machine interaction. Different from traditional signal processing approaches, we employ deep learning approaches to perform reverberant-noisy speech enhancement. Our study starts with speech dereverberation without background noise. Reverberation consists of sound wave reflections from various surfaces in an enclosed space. This means the reverberant signal at any time step includes the damped and delayed past signals. To explore such relationships at different time steps, we utilize a self-attention mechanism as a pre-processing module to produce dynamic representations. With these enhanced representations, we propose a temporal convolutional network (TCN) based speech dereverberation algorithm. Systematic evaluations demonstrate the effectiveness of the proposed algorithm in a wide range of reverberant conditions. Then we propose a deep learning based time-frequency (T-F) masking algorithm to address both reverberation and noise. Specifically, a deep neural network (DNN) is trained to estimate the ideal ratio mask (IRM), in which the anechoic-clean speech is considered as the desired signal. The enhanced speech is obtained by applying the estimated mask to the reverberant-noisy speech. Listening tests show that the proposed algorithm can improve speech intelligibility for hearing-impaired (HI) listeners substantially, and also benefit normal-hearing (NH) listeners. Considering the different natures of reverberation and noise, we propose to perform speech enhancement using a two-stage (open full item for complete abstract)

    Committee: DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Eric Healy (Committee Member) Subjects: Computer Science; Engineering
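
For reference, a minimal sketch of the ideal ratio mask (IRM) training target described above: per time-frequency unit, the ratio of clean-speech energy to clean-plus-interference energy, applied to the mixture spectrogram. The signals below are synthetic stand-ins; in the dissertation the mask is estimated by a trained DNN rather than computed from the clean signal.

```python
# IRM as a T-F training target: sqrt(S^2 / (S^2 + D^2)) per unit, then applied
# to the noisy/reverberant mixture. Synthetic signals only.
import numpy as np
from scipy.signal import stft, istft

fs = 16000
clean = np.random.randn(fs)         # stand-in for anechoic-clean speech
interference = 0.5 * np.random.randn(fs)  # stand-in for reverberation + noise
mix = clean + interference

_, _, S = stft(clean, fs=fs, nperseg=512)
_, _, D = stft(interference, fs=fs, nperseg=512)
_, _, Y = stft(mix, fs=fs, nperseg=512)

irm = np.sqrt(np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(D) ** 2 + 1e-12))
enhanced = irm * Y                   # apply the mask to the mixture spectrogram
_, x_hat = istft(enhanced, fs=fs, nperseg=512)
```
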
  • 4. Vasko, Jordan Speech Intelligibility and Quality Resulting from an Ideal Quantized Mask

    Master of Arts, The Ohio State University, 2017, Speech and Hearing Science

    Speech recognition in noise presents a significant challenge for individuals with hearing loss, and current technologies to remedy this problem are limited. Recently developed machine learning algorithms have, however, proved to be promising solutions to this problem, as they have been able to segregate speech from noise to significantly improve its intelligibility for both normal-hearing and hearing-impaired listeners. The following paper introduces a novel segregation method to be employed by such machine-learning algorithms. The intelligibility and quality of speech-noise mixtures processed via this method were evaluated for normal-hearing listeners. The proposed approach was shown to produce speech intelligibility and quality that were comparable to those produced by the best current technique. Because this approach also has characteristics that may make implementation easier, it potentially represents a better approach than other existing methods.

    Committee: Eric Healy Ph.D. (Advisor); DeLiang Wang Ph.D. (Committee Member); Rachael Frush Holt Ph.D. (Committee Member) Subjects: Acoustics; Audiology
  • 5. Chen, Jitong On Generalization of Supervised Speech Separation

    Doctor of Philosophy, The Ohio State University, 2017, Computer Science and Engineering

    Speech is essential for human communication as it not only delivers messages but also expresses emotions. In reality, speech is often corrupted by background noise and room reverberation. Perceiving speech in low signal-to-noise ratio (SNR) conditions is challenging, especially for hearing-impaired listeners. Therefore, we are motivated to develop speech separation algorithms to improve intelligibility of noisy speech. Given its many applications, such as hearing aids and robust automatic speech recognition (ASR), speech separation has been an important problem in speech processing for decades. Speech separation can be achieved by estimating the ideal binary mask (IBM) or ideal ratio mask (IRM). In a time-frequency (T-F) representation of noisy speech, the IBM preserves speech-dominant T-F units and discards noise-dominant ones. Similarly, the IRM adjusts the gain of each T-F unit to suppress noise. As such, speech separation can be treated as a supervised learning problem where one estimates the ideal mask from noisy speech. Three key components of supervised speech separation are learning machines, acoustic features and training targets. This supervised framework has enabled the treatment of speech separation with powerful learning machines such as deep neural networks (DNNs). For any supervised learning problem, generalization to unseen conditions is critical. This dissertation addresses generalization of supervised speech separation. We first explore acoustic features for supervised speech separation in low SNR conditions. An extensive list of acoustic features is evaluated for IBM estimation. The list includes ASR features, speaker recognition features and speech separation features. In addition, we propose the Multi-Resolution Cochleagram (MRCG) feature to incorporate both local information and broader spectrotemporal contexts. We find that gammatone-domain features, especially the proposed MRCG features, perform well for supervised speech separation at (open full item for complete abstract)

    Committee: DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Eric Healy (Committee Member) Subjects: Computer Science; Engineering
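
For reference, a minimal sketch of the ideal binary mask (IBM) described above: a time-frequency unit is kept when speech energy exceeds noise energy by a local criterion, and discarded otherwise. The spectrograms and the local criterion value are placeholders; in the supervised framework this mask serves as the training label for the learning machine.

```python
# IBM: 1 where local SNR exceeds a local criterion (LC), else 0.
# Toy power spectrograms; real systems compute these from premixed signals.
import numpy as np

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """1 where local SNR (dB) exceeds the local criterion, else 0."""
    local_snr_db = 10.0 * np.log10(speech_power / (noise_power + 1e-12) + 1e-12)
    return (local_snr_db > lc_db).astype(float)

speech_pow = np.random.rand(64, 100)   # 64 frequency channels x 100 frames
noise_pow = np.random.rand(64, 100)
ibm = ideal_binary_mask(speech_pow, noise_pow, lc_db=-5.0)
print(f"{ibm.mean():.2%} of T-F units labeled speech-dominant")
```
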
  • 6. Ishikawa, Keiko Towards Development of Intelligibility Assessment for Dysphonic Speech

    PhD, University of Cincinnati, 2017, Allied Health Sciences: Communication Sciences and Disorders

    Dysphonia affects one's ability to produce voice. Because voice is the most fundamental component of speech, it is not surprising that many speakers with dysphonia report decreased intelligibility in everyday communication environments. Despite this report, intelligibility is not routinely measured in clinical assessment of dysphonia. Because the intelligibility deficit is most noticeable in noisy environments, measuring the deficit requires a thorough understanding of the interaction between dysphonia, background noise and intelligibility. Understanding this interaction would also help the development of treatment approaches for maximizing intelligibility for chronically dysphonic patients. The objective of this project was to establish experimental models for investigating this interaction. In particular, the potential of perceptual and acoustic approaches was examined. The overarching goal of this research is to develop a clinical tool for assessment of intelligibility in dysphonic speech and treatment approaches for improving the intelligibility deficit. Two speech perception experiments were conducted to determine the effect of background noise on intelligibility. The first study measured intelligibility with a transcription-based method in quiet and noise. The second study measured intelligibility with a rating-based method in three levels of noise. Correlation between these measurements was examined to evaluate their agreement. The results indicated that these measurements strongly correlate; however, their relationship with ratings of voice quality significantly differed. These findings indicate that both of these measures could be used to evaluate intelligibility of dysphonic speech; however, they are not equivalent measures. The acoustic studies were conducted for two purposes: 1) to examine the utility of an existing clinical acoustic measure, cepstral peak prominence (CPP), for prediction of the intelligibility deficit in dysphonic speech, and 2) to (open full item for complete abstract)

    Committee: Suzanne Boyce Ph.D. (Committee Chair); Alessandro de Alarcon M.D. M.P.H. (Committee Member); Lisa Kelchner Ph.D. (Committee Member); Siddarth Khosla M.D. (Committee Member); Marepalli Rao Ph.D. (Committee Member) Subjects: Speech Therapy
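
For reference, a minimal sketch of cepstral peak prominence (CPP), the clinical acoustic measure examined here: the height of the cepstral peak within the expected pitch range above a regression line fitted to the cepstrum over the searched quefrency range. Windowing, frame averaging, and the exact regression range vary across implementations, and the input below is a synthetic tone rather than dysphonic speech.

```python
# CPP sketch: real cepstrum of one frame, peak in the plausible-F0 quefrency
# range, measured relative to a regression line over that range.
import numpy as np

def cepstral_peak_prominence(frame, fs, f0_range=(60.0, 400.0)):
    spectrum = np.fft.rfft(frame * np.hamming(len(frame)))
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    quefrency = np.arange(len(cepstrum)) / fs             # seconds
    lo, hi = 1.0 / f0_range[1], 1.0 / f0_range[0]         # quefrency search window
    region = (quefrency >= lo) & (quefrency <= hi)
    peak_idx = np.argmax(cepstrum[region]) + np.argmax(region)
    slope, intercept = np.polyfit(quefrency[region], cepstrum[region], 1)
    return cepstrum[peak_idx] - (slope * quefrency[peak_idx] + intercept)

fs = 16000
t = np.arange(0, 0.04, 1.0 / fs)
frame = np.sin(2 * np.pi * 150 * t)    # 150 Hz tone as a stand-in for voiced speech
print(cepstral_peak_prominence(frame, fs))
```
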
  • 7. Wang, Yuxuan Supervised Speech Separation Using Deep Neural Networks

    Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering

    Speech is crucial for human communication. However, speech communication for both humans and automatic devices can be negatively impacted by background noise, which is common in real environments. Due to numerous applications, such as hearing prostheses and automatic speech recognition, separation of target speech from sound mixtures is of great importance. Among many techniques, speech separation using a single microphone is most desirable from an application standpoint. The resulting monaural speech separation problem has been a central problem in speech processing for several decades. However, its success has been limited thus far. Time-frequency (T-F) masking is a proven way to suppress background noise. With T-F masking as the computational goal, speech separation reduces to a mask estimation problem, which can be cast as a supervised learning problem. This opens speech separation to a plethora of machine learning techniques. Deep neural networks (DNN) are particularly suitable to this problem due to their strong representational capacity. This dissertation presents a systematic effort to develop monaural speech separation systems using DNNs. We start by presenting a comparative study on acoustic features for supervised separation. In this relatively early work, we use a support vector machine as the classifier to predict the ideal binary mask (IBM), which is a primary goal in computational auditory scene analysis. We found that traditional speech and speaker recognition features can actually outperform previously used separation features. Furthermore, we present a feature selection method to systematically select complementary features. The resulting feature set is used throughout the dissertation. DNN has shown success across a range of tasks. We then study IBM estimation using DNN, and show that it is significantly better than previous systems. Once properly trained, the system generalizes reasonably well to unseen conditions. We demonstrate that our sy (open full item for complete abstract)

    Committee: DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Mikhail Belkin (Committee Member); Eric Healy (Committee Member) Subjects: Computer Science; Engineering
  • 8. Woodruff, John Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments

    Doctor of Philosophy, The Ohio State University, 2012, Computer Science and Engineering

    The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. Binaural processing, where input signals resemble those that enter the two ears, is of particular interest in the CASA field. The dominant approach to binaural segregation has been to derive spatially selective filters in order to enhance the signal in a direction of interest. As such, the problems of sound localization and sound segregation are closely tied. While spatial filtering has been widely utilized, substantial performance degradation is incurred in reverberant environments and more fundamentally, segregation cannot be performed without sufficient spatial separation between sources. This dissertation addresses the problems of binaural localization and segregation in reverberant environments by integrating monaural and binaural cues. Motivated by research in psychoacoustics and by developments in monaural CASA processing, we first develop a probabilistic framework for joint localization and segregation of voiced speech. Pitch cues are used to group sound components across frequency over continuous time intervals. Time-frequency regions resulting from this partial organization are then localized by integrating binaural cues, which enhances robustness to reverberation, and grouped across time based on the estimated locations. We demonstrate that this approach outperforms voiced segregation based on either monaural or binaural analysis alone. We also demonstrate substantial performance gains in terms of multisource localization, particularly for distant sources in reverberant environments and low signal-to-noise ratios. We then develop a binaural system for joint localiza (open full item for complete abstract)

    Committee: DeLiang Wang PhD (Advisor); Mikhail Belkin PhD (Committee Member); Eric Fosler-Lussier PhD (Committee Member); Nicoleta Roman PhD (Committee Member) Subjects: Acoustics; Artificial Intelligence; Computer Science; Electrical Engineering
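
For reference, a minimal sketch of one binaural cue used for localization: the interaural time difference (ITD), estimated by cross-correlating left- and right-ear signals. The signals are synthetic, and the sketch omits the monaural pitch cues and probabilistic integration that the dissertation combines with such binaural analysis.

```python
# ITD estimation via cross-correlation of left/right ear signals.
# Synthetic source; the right-ear signal is a delayed copy of the left.
import numpy as np

fs = 16000
delay_samples = 8                                   # simulated interaural delay
src = np.random.randn(fs)
left = src
right = np.concatenate([np.zeros(delay_samples), src[:-delay_samples]])

xcorr = np.correlate(left, right, mode="full")
lag = np.argmax(xcorr) - (len(right) - 1)           # correlation-peak lag (samples)
print(f"true delay: {delay_samples} samples, estimated: {-lag} samples "
      f"({abs(lag) / fs * 1e3:.2f} ms ITD)")
```
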
  • 9. Hardman, Jocelyn The intelligibility of Chinese-accented English to international and American students at a U.S. university

    Doctor of Philosophy, The Ohio State University, 2010, EDU Teaching and Learning

    This study investigated the intelligibility of Chinese graduate students to their Indian, Chinese, Korean, and American peers. Specifically, the researcher sought to determine the teaching priorities for English for Academic Purposes in the US, where listeners have a wide variety of native languages. Research on Second Language Acquisition (SLA), International Teaching Assistants, and English as a Lingua Franca (ELF) has not provided sufficient empirical data on the factors that affect intelligible English communication among academic professionals with many native languages (L1). SLA has focused on the processes and factors affecting the acquisition of second language (L2) phonology; and ITA research has focused on the communication needs of international graduate students teaching American undergraduates. Both perspectives examine the intelligibility of foreign-accented speech to native English-speaking listeners. World Englishes (WE) and ELF argue for more research from the perspective of L2 listeners, which thus far has largely been limited to linguistic descriptions and case studies. A psycholinguistic word-recognition-in-noise study was designed to examine to what extent a talker's L1 and segmental pronunciation accuracy affected intelligibility, and how this varied by a listener's L1 and word familiarity. Participants included 6 male graduate students (Chinese & American) as talkers and 72 graduate students (Indian, Chinese, Korean, & American) as listeners. The oral English proficiency level of the international participants was held constant at “graduate TA certification” and all American listeners were natives of Ohio. Talkers were recorded reading 60 sentences from the Bamford-Kowal-Bench Standard Sentence Lists, revised for American English. The stimuli were mixed with white noise at a +5 dB signal-to-noise ratio and presented in a counterbalanced design to listeners, who transcribed the sentences they heard. Intelligibility was calculated using a dich (open full item for complete abstract)

    Committee: Keiko Samimy PhD (Advisor); Mary Beckman PhD (Advisor); Shari Speer PhD (Committee Member) Subjects: Education; Language; Linguistics; Mathematics Education; Science Education
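
For reference, a minimal sketch of the stimulus-mixing step described above: scaling white noise so that the speech-to-noise power ratio equals a target SNR (here +5 dB) before adding it to the recording. The `mix_at_snr` helper and the placeholder "sentence" are illustrative assumptions, not the revised Bamford-Kowal-Bench recordings used in the study.

```python
# Mix a signal with white noise at a fixed SNR by scaling the noise RMS.
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals snr_db, then add."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise

fs = 22050
speech = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)  # placeholder for a recorded sentence
noise = np.random.randn(fs)                             # white-noise masker
mixture = mix_at_snr(speech, noise, snr_db=5.0)
```
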
  • 10. Kroes, Patricia Relative intelligibility of English and Spanish as functions of the native language of the speaker and the listener /

    Master of Arts, The Ohio State University, 1971, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 11. Arora, Kamlesh The ability of American and foreign listeners to identify short samples of recorded speech of foreign and American speakers /

    Master of Arts, The Ohio State University, 1968, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 12. Spotts, AnnaKate Exploring Perceptual and Hemodynamic Responses to Speech Intelligibility in Healthy Listeners using Functional Near Infrared Spectroscopy

    PhD, University of Cincinnati, 2024, Allied Health Sciences: Communication Sciences and Disorders

    Speech perception is a critical part of communication between speakers and listeners. One important aspect of speech perception is the listeners' neurological processes during perception. Functional Near Infrared Spectroscopy (fNIRS) may be a viable tool to study the neurological underpinnings of speech perception in increasingly naturalistic environments due to its portability and low-cost. The study of speech perception using fNIRS has identified similar findings to fMRI in healthy listeners when using artificially manipulated speech samples that are easier (high intelligibility) or more difficult (low intelligibility) to understand. What is missing is the examination of these neural correlates during perception of naturally disordered speech. Study one addressed this by evaluating brain activity changes via measurement of hemodynamic (blood flow) response, using fNIRS, in healthy listeners using naturally disordered speech samples at high and low intelligibility. Another important aspect of speech perception is the listeners' perceptual responses to speech. Perceptual responses provide insight into listener impression and are used clinically to assess speech severity and track changes in intelligibility over time. Correlations between hemodynamic response and perceptual response are not well understood. Study two addressed this by examining the correlation between hemodynamic responses collected via fNIRS and perceptual ratings collected via visual analog scales (VAS). Healthy adults (N=24; 16 female, 8 male; 19.46 – 49.92 years old) participated in an event-related fNIRS study involving 24 randomized listening trials. High intelligibility sentence stimuli (12) were all rated 100% intelligible, and low intelligibility sentence stimuli (12) were rated as >30% but <80% intelligible. After each listening trial, participants placed ratings for intelligibility, comprehensibility, naturalness, and listening effort on a VAS. Hemodynamic responses (HbO2) w (open full item for complete abstract)

    Committee: Carrie Rountrey Ph.D. (Committee Chair); Pierce Boyne D.P.T. (Committee Member); Peter Scheifele Ph.D. (Committee Member); Jennifer Vannest Ph.D. (Committee Member) Subjects: Speech Therapy
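
For reference, a minimal sketch (with invented numbers) of the kind of relationship examined in Study 2: correlating a per-trial summary of the hemodynamic response with the corresponding visual analog scale rating. The channels, summary statistic, and any corrections for multiple comparisons are defined in the dissertation, not here.

```python
# Correlating a hemodynamic summary measure with perceptual VAS ratings.
# All values are hypothetical placeholders.
import numpy as np
from scipy.stats import pearsonr

hbo2_change = np.array([0.12, 0.30, 0.05, 0.22, 0.41, 0.18])  # per-trial HbO2 change (a.u.)
vas_rating = np.array([35, 80, 20, 60, 95, 50])                # VAS intelligibility (0-100)

r, p = pearsonr(hbo2_change, vas_rating)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```
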
  • 13. Delfarah, Masood Deep learning methods for speaker separation in reverberant conditions

    Doctor of Philosophy, The Ohio State University, 2019, Computer Science and Engineering

    Speech separation refers to the problem of separating target speech from acoustic interference such as background noise, room reverberation and other speakers. An effective solution to this problem can improve speech intelligibility of human listeners and speech processing systems. Speaker separation is one kind of speech separation in which the interfering source is also human speech. This dissertation addresses the speaker separation problem in reverberant environments. The goal is to increase the speech intelligibility of hearing-impaired and normal-hearing listeners in those conditions. Speaker separation is traditionally approached using model-based methods such as Gaussian mixture models (GMMs) or hidden Markov models (HMMs). These methods are unable to generalize to challenging cases with unseen speakers or nonstationary noise. We employ supervised learning for the speaker separation problem. The idea is inspired by studies that introduced deep neural networks (DNNs) to speech-nonspeech separation. In this approach, training data is used to learn a mapping function from noisy speech features to an ideal time-frequency (T-F) mask. We start this study by investigating an extensive set of acoustic features extracted in adverse conditions. DNNs are used as the learning machine, and separation performance is evaluated using standard objective speech intelligibility metrics. Separation performance is systematically evaluated in both nonspeech and speech interference, a variety of SNRs, reverberation times, and direct-to-reverberant energy ratios. We construct feature combination sets using a sequential floating forward selection algorithm, and combined features outperform individual ones. Next, we address the problem of separating two-talker mixtures in reverberant conditions. We employ recurrent neural networks (RNNs) with bidirectional long short-term memory (BLSTM) to separate and dereverberate the target speech signal. We propose two-stage networks t (open full item for complete abstract)

    Committee: DeLiang Wang (Advisor); Eric Fosler-Lussier (Committee Member); Eric Healy (Committee Member) Subjects: Artificial Intelligence; Computer Engineering; Computer Science
  • 14. Beam, Gaylene Dyslexia and the Perception of Indexical Information in Speech

    Doctor of Philosophy, The Ohio State University, 2019, Speech and Hearing Science

    Dyslexia is a specific learning disability that is neurobiological in origin. It is characterized by difficulties with accurate and/or fluent written word recognition and by poor spelling and decoding abilities. These difficulties are typically the result of a deficit in the phonological component of language. A variety of studies have pointed to an association of this impaired phonological processing ability with the perception of speech. This dissertation consists of three separate yet interrelated experiments designed to examine the roles that dyslexia and indexical features play in speech perception. The purpose of Experiment 1 was to determine whether the underlying phonological impairment seen in adults and children with dyslexia is associated with a deficit in the ability to categorize regional dialects. Our results confirmed our hypothesis that individuals with dyslexia would perform more poorly than average reading controls in regional dialect categorization tasks. In addition, we found that listeners' phonological processing ability (in specific, phonological short-term memory) was associated with listeners' sensitivity to dialect. Children performed more poorly than did adults. Children with dyslexia performed more poorly than did the child control group. Building on Experiment 1, Experiment 2 further inquired into sensitivity to indexical information (talker dialect and talker sex) in adults and children with dyslexia using stimuli that varied the nature and the redundancy of acoustic cues (namely, low-pass filtered speech and noise-vocoded speech). Our results supported our previous findings. Overall, listeners with dyslexia performed more poorly on categorization tasks than did controls. Children performed more poorly than adults in all conditions. We also found that for talker dialect identification, all listeners were most sensitive to dialect cues in clear speech, followed by vocoded speech. Listeners were least sensitive to dialect in low-pa (open full item for complete abstract)

    Committee: Robert Fox PhD (Advisor); Rebecca McCauley PhD (Committee Member); Ewa Jacewicz PhD (Committee Member) Subjects: Linguistics; Sociolinguistics; Special Education
  • 15. Shukla, Saurabh Development of a Human-AI Teaming Based Mobile Language Learning Solution for Dual Language Learners in Early and Special Educations

    Master of Science (MS), Wright State University, 2018, Computer Science

    Learning English as a secondary language is often an overwhelming challenge for dual language learners (DLLs), whose first language (L1) is not English, especially for children in early education (PreK-3 age group). These early DLLs need to devote a considerable amount of time learning to speak and read the language, in order to gain the language proficiency to function and compete in the classroom. Fear of embarrassment when mispronouncing words in front of others may drive them to remain silent; effectively hampering their participation in the class and overall curricular growth. The process of learning a new language can benefit greatly from the latest computing technologies, such as mobile computing, augmented reality and artificial intelligence. This research focuses on developing a human-AI teaming based mobile learning system for early DLLs. The objective is to provide a supportive and interactive platform for them to develop English reading and pronunciation skills through individual attention and interactive coaching. In this thesis, we present an AR and AI-based mobile learning tool that provides: 1) automatic and accurate intelligibility analysis at various levels: letter, word, phrase and sentences, 2) immediate feedback and multimodal coaching on how to correct pronunciation, and 3) evidence-based dynamic training curriculum tailored for personalized learning patterns and needs, e.g., retention of corrected pronunciation and typical pronunciation errors. The use of visible and interactive virtual expert technology capable of intuitive AR-based interactions will greatly increase a student's acceptance and retention of a virtual coach. In school or at home, it will readily resemble an expert reading specialist to effectively guide and assist a student in practicing reading and speaking by him-/herself independently, which is particularly important for DLL as many of their parents don't speak English fluently and cannot offer the necessary help. Ulti (open full item for complete abstract)

    Committee: Yong Pei Ph.D. (Advisor); Anna Lyon Ed.D. (Committee Member); Mateen Rizki Ph.D. (Committee Member) Subjects: Computer Science
  • 16. Kim, Sasha Perception of Regional Dialects in 2-Talker Masking Speech by Korean-English Bilinguals

    Master of Arts, The Ohio State University, 2018, Speech Language Pathology

    Speech comprehension in multitalker backgrounds is particularly challenging for non-native listeners. Previous studies have shown that although native listeners consistently outperform non-native listeners in listening comprehension tasks, both native and non-native listeners are sensitive to dialectal cues in target and masking speech when targets are presented in background babble of varying intensities. The present study examined the listening comprehension skills of 24 Korean-English bilingual listeners who were presented with sentence stimuli in two-talker babble. Stimuli and babble were comprised of two dialects, General American English (GAE) and Southern American English (SAE), which were systematically varied throughout testing at three signal-to-noise ratios (SNRs). In a previous study by Fox, Jacewicz, and Hardjono (2014), the nonnative (Indonesian-English bilingual) pattern of responses was highly similar to that of native GAE listeners in the +3 and -3 dB SNR conditions, but differed at the 0 dB condition. The present study aimed to determine whether or not these results are replicable in a comparable group of non-native bilinguals with a different L1 (Korean). Results revealed that the pattern of responses matched that of Fox et al. (2014) in all conditions; like both Indonesian listeners and monolingual GAE listeners, Korean listeners performed best when target sentences were in SAE at the +3 dB and -3 dB SNR conditions. However, like Indonesian listeners, the Korean listeners demonstrated no benefit from the acoustic features of SAE targets at the 0 dB SNR condition. These findings suggest that bilinguals are consistent in their comprehension of target sentences in masking speech irrespective of their L1 background; unlike at +3 dB and -3 dB conditions, at 0 dB SNR, nonnative listeners exhibit a decreased ability to attend to phonetic details of regional dialects.

    Committee: Robert Fox (Advisor); Ewa Jacewicz (Advisor) Subjects: Acoustics; English As A Second Language; Speech Therapy
  • 17. Soni, Jasminkumar Determining The Effect Of Speaker's Gender And Speech Synthesis On Callsign Acquisition Test (CAT) Results

    Master of Science in Engineering (MSEgr), Wright State University, 2009, Industrial and Human Factors Engineering

    Effective and efficient speech communication is one of the leading factors for success of battlefield operation. With the increases in the levels of gender diversity in military services, it is important to assess the effectiveness of voice for both genders in communication systems. The purpose of this research study was to determine the effect of the speaker's voice (male and female) on the speech intelligibility (SI) performance of the Callsign Acquisition Test (CAT). In addition, the effects of synthesized speech were evaluated. The CAT test is a new SI test that has been developed for military use. A group of 21 listeners with normal hearing participated in the study. Each participant listened to four different lists of CAT (male and female natural recorded speech, and male and female synthetic speech) at two signal-to-noise ratios. White noise was used as a masking noise and various speech files were mixed at signal-to-noise ratios -12 dB and -15 dB. Each wordlist was played at 50 dB and 53 dB mixed with white noise at 65 dB. Each listener participated in a total of 8 tests presented in a random fashion. Testing was performed in a sound-treated booth with loudspeakers. Test results demonstrated that male speech and natural voice have higher SI results than female speech and synthetic voice, respectively. Statistical analysis also concluded that female speech, -15 dB SNR, synthetic voice, and the combination effect of female speech and synthetic voice all have a significant effect on CAT test results in the presence of white noise. All tests used a significance level of alpha = 0.5.

    Committee: Misty Blue Ph.D. (Advisor); Yan Liu Ph.D. (Committee Member); Blair Rowley Ph.D. (Committee Member) Subjects: Acoustics; Biomedical Research; Education; Engineering; Industrial Engineering
  • 18. Verbsky, Babette EFFECTS OF CONVENTIONAL PASSIVE EARMUFFS, UNIFORMLY ATTENUATING PASSIVE EARMUFFS, AND HEARING AIDS ON SPEECH INTELLIGIBILITY IN NOISE

    Doctor of Philosophy, The Ohio State University, 2002, Speech and Hearing Science

    Occupational hearing conservation regulations neither address issues related to speech intelligibility in noise for normal-hearing or hearing-impaired workers, nor do the regulations comment on the safety of hearing aid use by hearing-impaired workers. Do certain types of hearing protection devices (HPDs) allow for better speech intelligibility than others? Would use of hearing aids with earmuffs provide better speech intelligibility for hearing-impaired workers? Is this method of accommodation safe? To answer these questions, a method for evaluating speech intelligibility with HPDs was developed through a series of pilot tests. The test method allows for evaluation of both normal-hearing and hearing-impaired listeners. Speech intelligibility for normal-hearing listeners who wore uniformly attenuating earmuffs was found to be significantly better than for the same listeners who wore conventional earmuffs. Hearing-impaired listeners were tested with each type of earmuff and while wearing their own hearing aids in combination with each earmuff. Unlike the normal hearing listener group, the hearing-impaired listener group did not exhibit better speech intelligibility with the uniformly attenuating earmuffs than with the conventional earmuffs. However, earmuffs worn in combination with hearing aids allowed for significantly better speech intelligibility than with either earmuff alone. To determine the safety of hearing aid use under earmuffs, a model was developed to predict occupational noise exposure for the aided-protected worker. Data from real ear measures with an acoustic mannequin was found to be in agreement with model predictions.

    Committee: Lawrence Feth (Advisor) Subjects:
  • 19. Banzina, Elina The Role of Secondary-stressed and Unstressed-unreduced Syllables in Word Recognition: Acoustic and Perceptual Studies with Russian Learners of English

    Doctor of Philosophy (Ph.D.), Bowling Green State University, 2012, Communication Disorders

    Identifying those phonological factors that native listeners rely on most when perceiving non-native speech is critical for setting priorities in pronunciation instruction. The importance of accurate lexical stress production, particularly primary stress, has been explored. However, little is known about the role of Secondary-stressed (SS) syllables and Unstressed-unreduced (UU) syllables, and the importance of their accuracy for speech perception. These questions are of relevance for Russian learners of English, who often reduce English SS and UU vowels—a phenomenon which is arguably due to the fact that only one stressed syllable per word is allowed in Russian phonology. Moreover, second language research has not addressed the issue of vowel over-reduction, which is a pattern typical of Russian learners. Low-accuracy productions of SS and UU syllables are generally not expected to lead to unintelligibility; however, they might interfere with the ease and accuracy with which speech is perceived. An acoustic study first compared realization of SS and UU syllables in words produced in isolation by six Russian learners of English and six native English speakers. Words were selected to contain low vowels and specific UU and SS syllable positions to optimally reflect vowel reduction by Russian speakers. Acoustic analyses revealed significant vowel quality and duration reductions in Russian-spoken SS and UU vowels, which were only half the duration of native English productions and significantly centralized. A subsequent psycholinguistic perceptual study investigated the degree of interference that inaccurate productions of SS and UU syllables have on native listeners' speech processing. A cross-modal phonological priming technique combined with a lexical decision task assessed speech processing of 28 native English speakers as they listened to (1) native English speech, (2) unmodified Russian speech, and (3) modified Russian speech with SS and UU syllables altered to ma (open full item for complete abstract)

    Committee: Laura Dilley PhD (Advisor); Lynne Hewitt PhD (Committee Chair); Sheri Wells-Jensen PhD (Committee Member); Alexander Goberman PhD (Committee Member); John Folkins PhD (Committee Member) Subjects: English As A Second Language; Linguistics