The NCHLT Speech Corpus of the South African languages

Barnard, Etienne; Davel, Marelie H.; van Heerden, Charl; De Wet, Febe; Badenhorst, Jaco

Files

barnard-2014-speech-corpus.pdf (652.21 KB)

Date

2014

Authors

Barnard, Etienne

Davel, Marelie H.

van Heerden, Charl

De Wet, Febe

Badenhorst, Jaco

Publisher

Workshop Spoken Language Technologies for Under-resourced Languages (SLTU)

Abstract

The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were undertaken in order to develop the corpus, and report on associated materials such as orthographic transcriptions and pronunciation dictionaries that were released as part of the corpus. In order to benchmark speech recognition performance on the corpus, we have also developed both phone-recognition and word-recognition systems for all eleven languages; we find that high accuracies can be achieved for these speaker-independent but vocabulary-dependent recognition tasks in all languages.

Description

This work was supported by the Department of Arts and Culture.

Keywords

Speech Corpus, South African languages, Speech recognition, word-recognition, phone-recognition

Citation

E. Barnard, M. H. Davel, C. van Heerden, F. de Wet and J. Badenhorst, “The NCHLT Speech Corpus of the South African languages”, in Proc. Int. Workshop Spoken Language Technologies for Under-resourced Languages (SLTU), pp. 194-200, St Petersburg, Russia, 2014. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications]

URI

https://researchspace.csir.co.za/dspace/handle/10204/7549
http://mica.edu.vn/sltu2014/proceedings/28.pdf
http://hdl.handle.net/10394/26493

Collections

Faculty of Engineering
Faculty of Natural and Agricultural Sciences
