Search Results

(Total results 78)

Search Report

  • 1. Neuhaus, TJ Gender Perception Dependent on Fundamental Frequency, Source Spectral Tilt, and Formant Frequencies

    Master of Science (MS), Bowling Green State University, 2020, Communication Disorders

    Objective. To explore how listeners use three aspects of the acoustic signal in the novel context of formant space configurations to determine speaker gender. Methods. The software Madde, Praat, and Audacity were used to synthesize 210 sound files that each contain the vowels /i, æ, ɑ, u/ separated by brief silences (i.e., the formant space configuration context). The 210 files were created by combining 10 values for fundamental frequency, seven sets of formant frequencies (vocal tract length), and three values for source spectral tilt. The lowest values for formant frequencies (longest vocal tract length) and fundamental frequency each correspond to the values for the average male. The highest values for formant frequencies (shortest vocal tract length) and fundamental frequency each correspond to the values for the average female. The values for source spectral tilt approximate the voice qualities of breathy, normal, and pressed. Twenty-three listeners judged the gender of the “speaker” of the synthesized sounds as female or male. Results. Increases in fundamental frequency and formant frequencies (decreases in vocal tract length) correlated with increased likelihood of judgement of female. An interaction between source spectral tilt and formant frequencies (vocal tract length) revealed that an increase in the steepness of source spectral tilt increased likelihood of judgement of female only when formant frequencies were high (vocal tract length was short). An interaction between formant frequencies (vocal tract length) and fundamental frequency revealed listeners were more sensitive to changes in fundamental frequency when formant frequencies were high (vocal tract length was short). Conclusions. Both fundamental frequency and formant frequencies are strong cues to speaker gender. The contribution of other cues, such as source spectral tilt, was subtle. The observed interactions point to gender aspects of speech perception as a complex phenomenon (open full item for complete abstract)

    Committee: Ronald Scherer PhD (Advisor); Brent Archer PhD, CCC-SLP (Committee Member); Jason Whitfield PhD, CCC-SLP (Committee Member) Subjects: Acoustics; Speech Therapy
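The stimulus set in entry 1 is a full factorial crossing of three parameters (10 × 7 × 3 = 210 files). A minimal sketch of that crossing, using placeholder parameter levels rather than the study's actual acoustic values:

```python
from itertools import product

# Placeholder levels only; the study's actual F0, formant, and tilt
# values are not reproduced here.
f0_values = list(range(10))       # 10 fundamental-frequency levels
formant_sets = list(range(7))     # 7 formant-frequency sets (vocal tract lengths)
tilt_values = ["breathy", "normal", "pressed"]  # 3 source spectral tilts

# Every combination of the three parameters yields one stimulus file.
stimuli = list(product(f0_values, formant_sets, tilt_values))
print(len(stimuli))  # 210
```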
  • 2. Smith, Virginia Self-perceptive and projective-sound discrimination of children with defective articulation

    Master of Arts, The Ohio State University, 1965, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 3. Ross, Brownell A study of speech sound discrimination ability of culturally disadvantaged children

    Master of Arts, The Ohio State University, 1967, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 4. Needham, Ellen The relative ability of aphasic persons to judge the duration and the intensity of pure tones

    Master of Arts, The Ohio State University, 1969, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 5. Schalk, Mary A prognostic and normative study of the articulatory proficiency of kindergarten children

    Master of Arts, The Ohio State University, 1967, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 6. Zilch, Marsha A study of gross and fine sound identification abilities in children with diagnosed auditory perceptual problems

    Master of Arts, The Ohio State University, 1967, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 7. Pressman, Dorothy Intelligibility and some perceptual confusions associated with three modes of speaking and filtering

    Master of Arts, The Ohio State University, 1968, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 8. Spotts, AnnaKate Exploring Perceptual and Hemodynamic Responses to Speech Intelligibility in Healthy Listeners using Functional Near Infrared Spectroscopy

    PhD, University of Cincinnati, 2024, Allied Health Sciences: Communication Sciences and Disorders

    Speech perception is a critical part of communication between speakers and listeners. One important aspect of speech perception is the listeners' neurological processes during perception. Functional Near Infrared Spectroscopy (fNIRS) may be a viable tool to study the neurological underpinnings of speech perception in increasingly naturalistic environments due to its portability and low cost. The study of speech perception using fNIRS has identified findings similar to those from fMRI in healthy listeners when using artificially manipulated speech samples that are easier (high intelligibility) or more difficult (low intelligibility) to understand. What is missing is the examination of these neural correlates during perception of naturally disordered speech. Study one addressed this by evaluating brain activity changes via measurement of hemodynamic (blood flow) response, using fNIRS, in healthy listeners using naturally disordered speech samples at high and low intelligibility. Another important aspect of speech perception is the listeners' perceptual responses to speech. Perceptual responses provide insight into listener impression and are used clinically to assess speech severity and track changes in intelligibility over time. Correlations between hemodynamic response and perceptual response are not well understood. Study two addressed this by examining the correlation between hemodynamic responses collected via fNIRS and perceptual ratings collected via visual analog scales (VAS). Healthy adults (N=24; 16 female, 8 male; 19.46 – 49.92 years old) participated in an event-related fNIRS study involving 24 randomized listening trials. High intelligibility sentence stimuli (12) were all rated 100% intelligible, and low intelligibility sentence stimuli (12) were rated as >30% but <80% intelligible. After each listening trial, participants placed ratings for intelligibility, comprehensibility, naturalness, and listening effort on a VAS. 
Hemodynamic responses (HbO2) w (open full item for complete abstract)

    Committee: Carrie Rountrey Ph.D. (Committee Chair); Pierce Boyne D.P.T. (Committee Member); Peter Scheifele Ph.D. (Committee Member); Jennifer Vannest Ph.D. (Committee Member) Subjects: Speech Therapy
  • 9. Alzoubi, Hamada USING EYE TRACKING AND PUPILLOMETRY TO UNDERSTAND THE IMPACT OF AUDITORY AND VISUAL NOISE ON SPEECH PERCEPTION

    PHD, Kent State University, 2023, College of Education, Health and Human Services / School of Health Sciences

    Although speech recognition is often experienced as relatively effortless, there are a number of common challenges that can make speech perception more difficult and may greatly impact speech intelligibility (e.g., environmental noise). However, there is some indication that visual cues can also be used to improve speech recognition (Baratchu et al., 2008) — especially when the visual information is congruent with the speech signal (e.g., talking faces; Massaro, 2002). However, it is less clear how noisy visual environments may impact speech perception when the visual signal is not congruous with the speech signal. In fact, adding incongruous visual information will likely divert precious cognitive resources away from the auditory process, making speech perception in noise a more cognitively difficult task. Therefore, the purpose of this dissertation was to examine cognitive processing effort by measuring changes in pupillary response during the processing of speech in noise paired with incongruous visual noise. The primary hypothesis was that noisy visual information would negatively impact the processing of speech in noisy environments and that would result in a greater pupil diameter. To test this, I used a common eye-tracking measure (i.e., pupillometry) to assess the cognitive processing effort needed to process speech in the presence of congruent and incongruous visual noise. The results indicated that visual noise recruits cognitive processing effort away from the auditory signal. Results also indicated that different combinations of auditory and visual noise have a significant impact on cognitive processing effort, which led to an increase in pupil dilation response during speech perception.

    Committee: JENNIFER ROCHE (Advisor); BRADLEY MORRIS (Committee Member); BRUNA MUSSOI (Committee Member); JOCELYN FOLK (Other) Subjects: Audiology; Cognitive Psychology; Neurosciences
  • 10. Bland, Justin Perception of unstressed vowel reduction in Central Mexican Spanish

    Doctor of Philosophy, The Ohio State University, 2023, Spanish and Portuguese

    The purpose of this dissertation is to examine the perception of unstressed vowel reduction (UVR)—also known as vowel devoicing—in Central Mexican Spanish. UVR is a variable, gradient process in which vowels undergo a constellation of phonetic weakening processes including shortening, devoicing, and apparent deletion (Gordon 1998). While it is fairly common and has been well-studied in languages such as Japanese (Beckman and Shoji 1984) and Portuguese (Cunha 2015), its use is more limited in Spanish, and it is primarily associated with two regions: the highland plateau of Central Mexico and the Andean highlands. Although previous literature has examined the production of UVR in Spanish (Dabkowski 2018, Delforge 2008b) and the perception of UVR in other languages (Beckman and Shoji 1984, Meneses and Albano 2015), studies on the perception of UVR in Spanish are limited to Delforge's (2012) work on language attitudes in Cusco and a small-scale perception task in Perissinotto (1975). This leaves open multiple questions about how Spanish-speaking listeners perceive UVR, what factors influence their perception, and how UVR relates to issues of dialect perception and sociolinguistic awareness. This dissertation therefore seeks to provide an initial but wide-ranging view of the perception of UVR in Central Mexican Spanish by examining multiple aspects of listeners' perception. This was done by preparing a set of two perception experiments designed to test two overarching goals: first, how linguistic factors like phonetic variation, phonological context, and morphological context affect listeners' ability to perceive vowels; and second, what role UVR plays in listeners' dialect classification and language attitudes toward speakers. Additionally, questions of whether non-linguists notice UVR and whether listeners from different dialect areas differ in their perception were tested. 
The two perception experiments were administered online via Qualtrics, and a total of 84 part (open full item for complete abstract)

    Committee: Rebeka Campos-Astorkiza (Advisor); Scott A. Schwenter (Committee Member); Fernando Martínez-Gil (Committee Member) Subjects: Linguistics
  • 11. Arzbecker, Lian Evaluating Levenshtein distance: Assessing perception of accented speech through low- and high-pass filtering

    Doctor of Philosophy, The Ohio State University, 2023, Speech and Hearing Science

    This dissertation explores the relationship between quantitative phonetic measurements and listener identification of accents of English, focusing on phonetic distance and its perceptual correlates across various English accent varieties. The Levenshtein distance (LD) measure, which quantifies string similarity by calculating the minimum cost of transforming one string into another, is used to compare phonetic differences across accents. This study begins by investigating the diverse applications of LD across disciplines, emphasizing its significance in dialectology. Evaluation of different LD approaches and algorithms reveals that simpler methods often yield analogous or superior results compared to more complex ones. Insights from analyzing LD trends inform the selection of the algorithm chosen for the current experiment. Subsequently, carefully selected low- and high-pass filter cutoffs enable investigation of target phonetic features. Four English accent varieties are included in this research: Midland American (control), British/Australian, Hindi-influenced, and Mandarin-influenced. Hypotheses and predictions are formulated based on the documented correlations between LD and listeners' perception ratings of native-likeness and intelligibility. For monolingual American English-speaking listeners, frequent confusion is predicted between Midland American and British/Australian accents due to their similarly low LDs, amplified by filtering conditions altering vowel and consonant cues. Conversely, higher LDs are hypothesized for Hindi- and Mandarin-influenced English due to the influence of various first languages (L1s). It is expected that these two varieties will be more frequently confused with each other in unmodified identification tasks due to their relatively high LDs. 
The impact of filtering conditions on confusion is predicted to differ for each variety, with high-pass filtering affecting Hindi-influenced English due to consonant substitutions and low (open full item for complete abstract)

    Committee: Ewa Jacewicz (Advisor) Subjects: Acoustics; Cognitive Psychology; Linguistics
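As entry 11 describes, Levenshtein distance (LD) quantifies string similarity as the minimum cost of transforming one string into another. A minimal unit-cost dynamic-programming sketch of the basic algorithm (the dissertation evaluates several weighted and normalized LD variants over phonetic transcriptions, which this does not reproduce):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    (all unit cost) needed to transform string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution/match
        prev = curr
    return prev[-1]

# Transcriptions as plain symbol strings (illustrative words only):
print(levenshtein("kitten", "sitting"))  # 3
```

In dialectometry applications like this one, the inputs are phonetic transcriptions and the substitution cost is often graded by segment similarity rather than fixed at 1.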
  • 12. Egbert, Elizabeth An Investigation of the Relationship Between Pure Tone Thresholds and Speech Reception Thresholds in Children, As a Function of Age and Sex

    Master of Arts (MA), Bowling Green State University, 1966, Communication Studies

    Committee: James J. Egan (Advisor) Subjects: Speech Therapy
  • 13. Havens, Ethel An Experimental Investigation of Speech Perception Among Hard-of-Hearing Children

    Master of Science (MS), Bowling Green State University, 1956, Communication Studies

    Committee: Melvin Hyman (Advisor) Subjects: Audiology
  • 14. Johnson, Eric Improving Speech Intelligibility Without Sacrificing Environmental Sound Recognition

    Doctor of Philosophy, The Ohio State University, 2022, Speech and Hearing Science

    The three manuscripts presented here examine concepts related to speech perception in noise and ways to overcome poor speech intelligibility without depriving listeners of environmental sound recognition. Because of hearing-impaired (HI) listeners' auditory deficits, there is a substantial need for speech-enhancement (noise reduction) technology. Recent advancements in deep learning have resulted in algorithms that significantly improve the intelligibility of speech in noise, but in order to be suitable for real-world applications such as hearing aids and cochlear implants, these algorithms must be causal, talker independent, corpus independent, and noise independent. Manuscript 1 involves human-subjects testing of a novel, time-domain-based algorithm that fulfills these fundamental requirements. Algorithm processing resulted in significant intelligibility improvements for both HI and normal-hearing (NH) listener groups in each signal-to-noise ratio (SNR) and noise type tested. In Manuscript 2, the range of speech-to-background ratios (SBRs) over which NH and HI listeners can accurately perform both speech and environmental recognition was determined. Separate groups of NH listeners were tested in conditions of selective and divided attention. A single group of HI listeners was tested in the divided attention experiment. Psychometric functions were generated for each listener group and task type. It was found that both NH and HI listeners are capable of high speech intelligibility and high environmental sound recognition over a range of speech-to-background ratios. The range and location of optimal speech-to-background ratios differed across NH and HI listeners. The optimal speech-to-background ratio also depended on the type of environmental sound present. 
Conventional deep-learning algorithms for speech enhancement target maximum intelligibility by removing as much noise as possible while maintaining the essential characteristics of the target speech signal (open full item for complete abstract)

    Committee: Eric Healy (Advisor); Rachael Holt (Committee Member); DeLiang Wang (Committee Member) Subjects: Acoustics; Artificial Intelligence; Audiology; Behavioral Sciences; Communication; Computer Engineering; Health Sciences
  • 15. Wasiuk, Peter The Importance of Glimpsed Audibility for Speech-In-Speech Recognition

    Doctor of Philosophy, Case Western Reserve University, 2022, Communication Sciences

    Purpose: Speech recognition in the presence of competing speech can be challenging, and individuals vary considerably in their ability to accomplish this complex auditory-cognitive task. Speech-in-speech recognition can vary due to factors that are intrinsic to the listener, such as hearing status and cognitive abilities, or due to differences in the short-term audibility of the target speech. The primary goal of the current experiments was to characterize the effects of glimpsed target audibility and intrinsic listener variables on speech-in-speech recognition. Methods: Three experiments were conducted to evaluate the effects of glimpsed target audibility, intrinsic listener variables, and acoustic-perceptual difference cues on speech-in-speech and speech-in-noise recognition. Listeners were young adults (18 to 28 years) with normal hearing. Speech recognition was measured in two stages in each experiment. In Stage 1, speech reception thresholds were measured adaptively to estimate the signal-to-noise ratio (SNR) associated with 50% correct keyword recognition for each listener in each stimulus condition. In Stage 2, keyword recognition was measured at a fixed-SNR in each stimulus condition. All participants completed a battery of cognitive measures that assessed central abilities related to masked-speech recognition. The proportion of audible target glimpses for each target+masker keyword stimulus presented in the fixed-SNR testing was measured using a computational glimpsing model of speech recognition. Results: Results demonstrated that variability in both speech-in-speech and speech-in-noise recognition depends critically on the proportion of audible target glimpses available in the target+masker mixture, even across stimuli presented at the same global SNR. Glimpsed target audibility requirements for successful speech recognition varied systematically as a function of informational masking. 
Young adult listeners required a greater proportion of audibl (open full item for complete abstract)

    Committee: Lauren Calandruccio (Committee Chair); Christopher Burant (Committee Member); Barbara Lewis (Committee Member); Robert Greene (Committee Member) Subjects: Audiology; Behavioral Sciences; Experimental Psychology
  • 16. Barrett, Jenna Perception of Spectrally-Degraded, Foreign-Accented Speech

    Bachelor of Science (BS), Ohio University, 2021, Communication Sciences and Disorders

    This study investigated the perception of spectrally-degraded, Mandarin-accented speech in native English listeners. Twenty-four native Mandarin speakers and two native English speakers recorded a reading of the Rainbow passage in English. A sample of this recording was played for 105 Ohio University students who rated the accents on a scale of 1 to 9. Based on this rating task, two speakers with a slight accent and two speakers with a strong accent were chosen to record both HINT and R-SPIN sentence lists. The R-SPIN sentence lists consisted of high-predictability (HP) and low-predictability (LP) sentences, which were used to assess the effect of contextual information on speech recognition of native English listeners. Sentences were presented to twenty native English listeners in both the original condition and a 6-channel vocoder-processed condition. Results revealed that the combination of accented speech and vocoder processing was detrimental to speech recognition in native English listeners. Additionally, data from R-SPIN sentence lists suggest that the presence of accent limited listeners' ability to utilize contextual information. These adverse effects increased as the degree of accent increased. Results also suggested a possible link between talker sex and recognition performance; recognition accuracies were higher overall for the noise-vocoded, foreign-accented speech produced by female talkers.

    Committee: Li Xu (Advisor) Subjects: Communication
  • 17. Shatzer, Hannah Neural Correlates of Unimodal and Multimodal Speech Perception in Cochlear Implant Users and Normal-Hearing Listeners

    Doctor of Philosophy, The Ohio State University, 2020, Psychology

    Spoken word recognition often involves the integration of both auditory and visual speech cues. The addition of visual cues is particularly useful for individuals with hearing loss and cochlear implants (CIs), as the auditory signal they perceive is degraded compared to individuals with normal hearing (NH). CI users generally benefit more from visual cues than NH perceivers; however, the underlying neural mechanisms affording them this benefit are not well-understood. The current study sought to identify the neural mechanisms active during auditory-only and audiovisual speech processing in CI users and determine how they differ from NH perceivers. Postlingually deaf experienced CI users and age-matched NH adults completed syllable and word recognition tasks during EEG recording, and the neural data was analyzed for differences in event-related potentials and neural oscillations. The results showed that during phonemic processing in the syllable task, CI users have stronger AV integration, shifting processing away from primary auditory cortex and weighting the visual signal more strongly. During whole-word processing in the word task, early acoustic processing is preserved and similar to NH perceivers, but again displaying robust AV integration. Lipreading ability also predicted suppression of early auditory processing across both CI and NH participants, suggesting that while some neural reorganization may have occurred in CI recipients to improve multisensory integrative processing, visual speech ability leads to reduced sensory processing in primary auditory cortex regardless of hearing status. Findings further support behavioral evidence for strong AV integration in CI users and the critical role of vision in improving speech perception.

    Committee: Mark Pitt PhD (Advisor); Antoine Shahin PhD (Committee Member); Aaron Moberly MD (Committee Member); Zeynep Saygin PhD (Committee Member) Subjects: Cognitive Psychology; Psychology
  • 18. Austen, Martha The Role of Listener Experience in Perception of Conditioned Dialect Variation

    Doctor of Philosophy, The Ohio State University, 2020, Linguistics

    Listeners use indexical links—associations between social characteristics and linguistic variants—both to process speech and make social judgments. For example, an American might use a link between southern British speakers and the production of BATH words with [ɑ:] (e.g. bath → [bɑ:θ], glass → [glɑ:s]) both to interpret a production [glɑ:s] as glass rather than gloss when listening to a British speaker, and to judge a production of glass as [glɑ:s] as sounding British. Using the TRAP/BATH split in RP (Britain's prestige dialect) and southern British Englishes, this dissertation examines three facets of these links: first, what linguistic categories (e.g. words or phonemes) are involved; second, how these categories might change as a listener gains more experience with southern British English; and third, whether the same indexical links—involving the same linguistic categories—are used both for linguistic perception (processing speech) and sociolinguistic perception (making social judgments). RP speakers produce BATH words with [ɑ:]—which British listeners perceive as high-status—but TRAP words with [a] or [æ]. This split is phonologically conditioned ([ɑ:] occurs before /f, θ, s/) but has lexical exceptions (e.g. [gas] rather than the expected [gɑ:s]). Northern British and general American English speakers, who lack the split, are hypothesized to link [ɑ:], RP (and related attributes like “high-status”), and a linguistic category from their own dialect: either their phoneme /æ/ (/æ/↔RP [ɑ:]) [or /a/ for northern British speakers, who produce TRAP with [a] rather than the general American [æ]]; /æ/ in a conditioning environment (e.g. /æ/ before voiceless fricatives↔RP [ɑ:]); or individual words (e.g. `bath'↔RP [bɑ:θ]). 
Listeners with phoneme-level links would incorrectly expect RP speakers to say tr[ɑ:]p; listeners with conditioned-phoneme links would incorrectly expect g[ɑ:]s but correctly expect tr[a]p; and listeners with word-level links would have accurate (open full item for complete abstract)

    Committee: Kathryn Campbell-Kibler (Advisor); Cynthia Clopper (Committee Member); Becca Morley (Committee Member); Shari Speer (Committee Member) Subjects: Linguistics
  • 19. Blankenship, Chelsea Temporal Processing and Speech Perception in Cochlear Implant Recipients and Normal Hearing Listeners

    PhD, University of Cincinnati, 2020, Allied Health Sciences: Communication Sciences and Disorders

    Introduction: Speech recognition performance among cochlear implant (CI) recipients is highly variable and is influenced by their ability to perceive rapid changes within the acoustic signal (i.e., temporal resolution). The most common measure of temporal resolution is a behavioral gap detection test which requires active participation, and therefore may be infeasible for young children and individuals with disabilities. Cortical auditory evoked potentials (CAEPs) have been used as an objective measure of temporal resolution. While within-frequency gap detection (identical pre- and post-gap frequency) is more common, across-frequency (different pre- and post-gap frequency) might be more important for speech understanding. However, limited studies have examined across-frequency gap detection, and none have examined its correlation with speech perception in CI recipients. The purpose of the study is to evaluate behavioral and electrophysiological measures of temporal processing and speech recognition in normal hearing (NH) and CI recipients. Methods: Post-lingually deafened adult CI recipients (n = 11, Mean = 50.4 yrs.) and age- and gender-matched NH individuals (n = 11) were recruited. Speech perception was assessed with the CNC word test, AzBio sentence test, and BKB Speech-in-Noise test. Behavioral gap detection thresholds (GDT) were measured for within-frequency (2 kHz pre- and post-gap tone) and across-frequency conditions (2 kHz to 1 kHz post-gap tone) using an adaptive, two-alternative, forced-choice paradigm. Within- and across-frequency CAEPs were measured using four gap duration conditions ranging from sub-threshold to supra-threshold. Mixed effects models examined group differences in speech perception, behavioral GDTs, and CAEP amplitude and latency. Correlation analyses examined the relationship between the CAEP response, behavioral measures of speech perception and temporal processing, and demographic factors. 
Results: CI recipients demonstrated s (open full item for complete abstract)

    Committee: Fawen Zhang Ph.D. (Committee Chair); Jeffrey DiGiovanni Ph.D. (Committee Member); Brian Earl Ph.D. (Committee Member); Jareen Meinzen-Derr Ph.D. (Committee Member) Subjects: Audiology
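The behavioral gap-detection thresholds (GDTs) in entry 19 were measured with an adaptive, two-alternative forced-choice paradigm, which is typically implemented as a staircase that converges on the gap duration yielding a target percent correct. A sketch assuming a 2-down/1-up rule with a fixed step size (the study's actual rule, step sizes, and stopping criteria are not stated in the abstract):

```python
def staircase_gdt(respond, start_ms=20.0, step_ms=2.0, floor_ms=0.5,
                  max_trials=60, down=2):
    """2-down/1-up adaptive staircase: the gap shrinks after `down`
    consecutive correct responses and grows after each error.
    `respond(gap_ms)` returns True when the listener chose the
    gap-containing interval. Returns the mean of the reversal points
    as the threshold estimate."""
    gap, correct_run, direction, reversals = start_ms, 0, -1, []
    for _ in range(max_trials):
        if respond(gap):
            correct_run += 1
            if correct_run == down:          # time to step down
                correct_run = 0
                if direction == +1:          # was moving up: a reversal
                    reversals.append(gap)
                direction = -1
                gap = max(floor_ms, gap - step_ms)
        else:                                # any error steps up
            correct_run = 0
            if direction == -1:              # was moving down: a reversal
                reversals.append(gap)
            direction = +1
            gap += step_ms
    return sum(reversals) / len(reversals) if reversals else gap

# Simulated listener who reliably detects gaps longer than 6 ms:
est = staircase_gdt(lambda g: g > 6.0)
```

With a deterministic simulated listener like this, the track oscillates around the 6 ms detection boundary and the reversal mean lands near it; a 2-down/1-up rule targets roughly 70.7% correct on the psychometric function.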
  • 20. Taleb, Nardine Assessing the intelligibility and acoustic changes of time-processed speech

    Master of Arts, Case Western Reserve University, 2020, Psychology

    Three experiments evaluated the impact of time-processed stimuli on speech recognition in noise and in quiet. Results from Experiment 1 indicated decreased masked-speech recognition when stimuli were compressed to 80% of the original duration, though expansion (up to 120% of the original rate) had no impact on performance. Experiment 2 evaluated masked-speech recognition for naturally spoken productions and time-processed speech that varied in rate. Time-processed speech caused a decrease in performance when compared to naturally spoken sentences, even though matched in rate (for both faster and slower rates). In both experiments, listeners perceived significant distortion when listening in quiet to all rate-altered stimuli. However, the level of noticeable distortion did not correlate with masked-speech recognition ability. Experiment 3 confirmed that, despite audible distortion, time-processed speech presented in quiet was highly intelligible. Results support that masked-speech recognition is negatively impacted by time-processed fast speech when it is compressed to 80% of the original duration or less.

    Committee: Lauren Calandruccio Ph.D. (Advisor); Angela Ciccia Ph.D. (Committee Member); Lee Thompson Ph.D. (Committee Member) Subjects: Acoustics