Search Results

  • 1. Yoon, Kyuchul. Building a prosodically sensitive diphone database for a Korean text-to-speech synthesis system

    Doctor of Philosophy, The Ohio State University, 2005, Linguistics

    This dissertation describes the design and evaluation of a prosodically sensitive concatenative text-to-speech (TTS) synthesis system for Korean within the Festival TTS framework (Taylor et al., 1998). The primary task that this dissertation undertakes is to build a synthesis system that can test the idea that a speech segment is affected by its prosodic context and is subject to continuous allophonic and categorical allomorphic variation. There are three subtasks to the primary task. The first subtask is to model the allomorphic variation of Korean and to investigate the validity of using hand-written linguistically motivated morphophonological rules in the form of grapheme-to-phoneme (GTP) conversion rules. The evaluation of the implemented GTP module showed that taking advantage of linguistic knowledge could greatly reduce the amount of training material required by any machine-learning approach and that the error analysis is more informative and straightforward. The second subtask is to model positionally-conditioned allophonic variation and to motivate segmental correlates of prosodic categories with a view to designing a prosodically sensitive diphone database. From a corpus of prosodically labeled read speech, we created a prosodically sensitive diphone database, selecting four different prosodic versions of the same diphone. The last subtask is to build a model of Korean prosody, i.e., a model of phrasing, fundamental frequency contour, and duration, using a corpus that has been morpho-syntactically parsed and prosodically labeled following the K-ToBI labeling conventions (Jun, 2000, 1998 & 1993). Only the model of phrasing was implemented, trained from a set of morphosyntactic and textual distance features, and it can predict the location of accentual and intonational phrase breaks. The results of these subtasks were incorporated into the TTS system and the naturalness of the output from the system was evaluated. 
    A listening experiment performed on eighty n…

    Committee: Mary Beckman (Advisor)

    Subjects: Language, Linguistics