
Search Results

(Total results 1)

  • 1. Hu, Guoning. Monaural speech organization and segregation

    Doctor of Philosophy, The Ohio State University, 2006, Biophysics

    In a natural environment, speech often occurs simultaneously with acoustic interference. Many applications, such as automatic speech recognition and telecommunication, require an effective system that segregates speech from interference in the monaural (one-microphone) situation. While this task of monaural speech segregation has proven very challenging, human listeners show a remarkable ability to segregate an acoustic mixture and attend to a target sound, even with one ear. This perceptual process is called auditory scene analysis (ASA). Research in ASA has inspired considerable effort in constructing computational ASA (CASA) systems based on ASA principles. Current CASA systems, however, face a number of challenges in monaural speech segregation. This dissertation presents a systematic and extensive effort in developing a CASA system for monaural speech segregation that addresses several major challenges. The proposed system consists of four stages: peripheral analysis, feature extraction, segmentation, and grouping. In the first stage, the system decomposes the auditory scene into a time-frequency representation via bandpass filtering and time windowing. The second stage extracts auditory features corresponding to ASA cues, such as periodicity, amplitude modulation, onset, and offset. In the third stage, the system segments the auditory scene based on a multiscale analysis of onset and offset. The last stage includes an iterative algorithm that simultaneously estimates the pitch of a target utterance and segregates the voiced target based on the pitch estimate. Finally, for non-speech interference, the system sequentially groups voiced and unvoiced portions of the target speech, performing this grouping task with feature-based classification. Systematic evaluation shows that the proposed system extracts a majority of target speech without including much interference. Extensive comparisons demonstrate that the system has substantially advanced the state-of-the-art (open full item for complete abstract)
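    The first stage described above, decomposing a mixture into a time-frequency representation via bandpass filtering and time windowing, can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: it substitutes a small Butterworth bandpass filterbank for the auditory filterbank typically used in CASA peripheral analysis, and the band edges, filter order, and window/hop sizes are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, lfilter

def bandpass_filterbank(signal, fs, bands):
    """Decompose a signal with a bank of bandpass filters (one channel per band).

    NOTE: a 4th-order Butterworth filter stands in for the auditory
    filterbank commonly used in CASA peripheral analysis.
    """
    channels = []
    for low, high in bands:
        b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
        channels.append(lfilter(b, a, signal))
    return np.array(channels)

def time_frequency_energy(channels, fs, win_ms=20, hop_ms=10):
    """Window each channel and compute per-frame energy, yielding a
    channels x frames time-frequency representation (T-F units)."""
    win = int(fs * win_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n_frames = 1 + (channels.shape[1] - win) // hop
    tf = np.empty((channels.shape[0], n_frames))
    for c, ch in enumerate(channels):
        for t in range(n_frames):
            frame = ch[t * hop : t * hop + win]
            tf[c, t] = np.sum(frame ** 2)
    return tf
```

    For example, a 440 Hz tone passed through bands (300, 600) Hz and (1000, 2000) Hz concentrates its frame energy in the first channel, which is the property later stages rely on when labeling T-F units as target or interference.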

    Committee: DeLiang Wang (Advisor); William Masters (Other); Eric Fosler-Lussier (Other)