The Utilization of Contexts in Limited Domain Speech Synthesis
JŮZOVÁ, M. The Utilization of Contexts in Limited Domain Speech Synthesis. In 22nd Czech-German Workshop on Speech Communication - Book of Abstract. 2014. pp. 17-18.
|Type:||PROCEEDINGS PAPER|
|English title:||The Utilization of Contexts in Limited Domain Speech Synthesis|
|Authors:||Ing. Markéta Jůzová|
|Abstract (EN):||When a sentence is synthesized, segments of real recorded speech (speech units) are concatenated. When preparing a text corpus for a general text-to-speech (TTS) system, we try to select texts containing as many speech units (e.g. diphones) as possible, taking their prosodic and phonetic contexts into account. For limited-domain text-to-speech synthesis (LDTS), however, corpus preparation is different, because a limited-domain (LD) corpus should ensure 100% coverage of the given domain. During the synthesis itself, longer speech units (such as words or phrases) are concatenated. The concatenations are usually made in pauses with which the two concatenated units were extended, but this is unnatural and the synthesized sentences do not sound fluent. In our research, we try to use word contexts in an LDTS system to improve the synthesis, and we compare its quality with that of a general TTS system.|
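
The corpus preparation for a general TTS system mentioned in the abstract (selecting texts that contain as many speech units as possible) can be illustrated with a greedy selection. This is only a minimal sketch under stated assumptions, not the author's method: the function names `diphones` and `greedy_select`, the input format (a sentence id mapped to a list of phone symbols) and the sentence budget are all hypothetical, and prosodic and phonetic contexts are ignored for brevity.

```python
def diphones(phones):
    """Return the set of adjacent phone pairs (diphones) in a phone sequence."""
    return {(a, b) for a, b in zip(phones, phones[1:])}


def greedy_select(sentences, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered diphones.

    `sentences` maps a sentence id to its phone sequence (a list of strings).
    This input format is an assumption made for the illustration.
    """
    covered = set()
    selected = []
    remaining = {sid: diphones(ph) for sid, ph in sentences.items()}
    while remaining and len(selected) < max_sentences:
        # Sentence contributing the largest number of new diphones.
        best = max(remaining, key=lambda sid: len(remaining[sid] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining sentence adds anything new
        covered |= gain
        selected.append(best)
        del remaining[best]
    return selected, covered
```

For a limited-domain corpus the goal differs: instead of maximizing the variety of units within a budget, the selection would have to continue until every word or phrase of the domain is covered.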