Vystadial

Description

Vystadial 2013 is a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts. The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions. The scripts implement data pre-processing and acoustic model building with the HTK and Kaldi toolkits.

Tool type

Resources

Tool task

data, corpus

Key words

data, spoken corpus, speech processing, multi-lingual

Research domain

Computational Linguistics, Linguistics, Speech Recognition

Language

Czech, English

Country

Czech Republic

CLARIN centre

Charles University in Prague

Contact person

Matěj Korvas

URL

https://ufal.mff.cuni.cz/grants/vystadial

Similar to

Spokes