Modeling Prosody Automatically in Concept-to-Speech Generation

Shimei Pan, Columbia University

My thesis investigates and establishes systematic methodologies for automatic prosody modeling through corpus analysis. Most previous concept-to-speech (CTS) systems model prosody with handcrafted rules, with little evaluation of the rules' overall performance. By systematically applying different machine learning techniques to a speech corpus, I am able to model prosody automatically for a given domain. Another focus of my thesis is system architecture. Two concerns guide the design of a CTS system: modularity and extensibility. The goal is a flexible CTS system in which new prosody generators, natural language generators, and speech realization systems can be incorporated without major changes to the existing system. Designing a CTS system to facilitate multimedia synchronization is another focus of this research.

