Language identification of individual words with joint sequence models

Giwa, Oluwapelumi; Davel, Marelie H.

dc.contributor.author	Giwa, Oluwapelumi
dc.contributor.author	Davel, Marelie H.
dc.date.accessioned	2018-03-05T08:00:27Z
dc.date.available	2018-03-05T08:00:27Z
dc.date.issued	2014
dc.description.abstract	Within a multilingual automatic speech recognition (ASR) system, knowledge of the language of origin of unknown words can improve pronunciation modelling accuracy. This is of particular importance for ASR systems required to deal with code-switched speech or proper names of foreign origin. For words that occur in the language model but not in the pronunciation lexicon, text-based language identification (T-LID) of a single word in isolation may be required. This is a challenging task, especially for short words. We motivate the importance of accurate T-LID in speech processing systems and introduce a novel way of applying Joint Sequence Models (JSMs) to the T-LID task. We obtain competitive results on a real-world 4-language task: for our best JSM system, an F-measure of 97.2% is obtained, compared to an F-measure of 95.2% obtained with a state-of-the-art Support Vector Machine (SVM).	en_US
dc.description.sponsorship	This work was supported by the South African Department of Arts and Culture (DAC) and the National Research Foundation (NRF). Any opinion, findings and conclusions or recommendations expressed in this material are those of the author(s) and therefore neither DAC nor the NRF accepts any liability in regard thereto.	en_US
dc.identifier.citation	Oluwapelumi Giwa and Marelie H. Davel, “Language identification of individual words with joint sequence models”, in Proc. Interspeech, pp. 1400-1404, Singapore, 2014. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications]	en_US
dc.identifier.uri	http://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1400.pdf
dc.identifier.uri	https://www.semanticscholar.org/paper/Language-identification-of-individual-words-with-j-Giwa-Davel/81b920eab1cb2f82e63d2c0980e8225f241e2cab
dc.identifier.uri	http://hdl.handle.net/10394/26497
dc.language.iso	en	en_US
dc.publisher	Interspeech 2014	en_US
dc.subject	Text-based language identification	en_US
dc.subject	Joint sequence models	en_US
dc.subject	Multilingual speech recognition	en_US
dc.title	Language identification of individual words with joint sequence models	en_US
dc.type	Presentation	en_US

Files

Original bundle

Name: giwa-2014-language-identification.pdf
Size: 345.39 KB
Format: Adobe Portable Document Format
Description: giwa-2014-language-identification

Collections

Faculty of Engineering