Czech Speech Synthesizer Popokatepetl Based On Word Corpus

Gaura, Pavel

Authors	GAURA Pavel
Year of publication	2003
Type	Article in Proceedings
Conference	Proceedings of EC-VIP-MC 2003
MU Faculty or unit	Faculty of Informatics
Field	Informatics
Keywords	speech synthesis; Popokatepetl; speech corpus; speech segmentation; speech processing; Audis; REMathEx
Description	At the Faculty of Informatics, Masaryk University, Brno, we recently developed the AUDIS system, primarily as multimodal support to help visually impaired students study various materials. For the system's inputs and outputs to function properly, we also need high-quality speech synthesis. Unfortunately, none is available for the Czech language. We are therefore developing a speech engine that produces high-quality Czech speech for some limited domains, together with average-quality general Czech speech synthesis (where average means easily comprehensible). Limited-domain speech synthesis will be used for frequently used speech outputs (e.g. navigation in a document, control of the system), while general speech synthesis will be available for common text. For these purposes we have developed an automatic recording system that allows us to collect and process large amounts of speech data. The first part of the paper describes the basic principles of our speech synthesis, the recording system, and the selection and processing of speech segments. The second part deals with methods for choosing the best set of speech data to record into the corpus and with the segmentation of the speech data.
Related projects:	Human-computer interaction, dialog systems and assistive technologies