dc.contributor.author | Van Heerden, Carel J. | |
dc.contributor.author | Barnard, Etienne | |
dc.contributor.author | Davel, Marelie H. | |
dc.date.accessioned | 2015-03-31T06:06:46Z | |
dc.date.available | 2015-03-31T06:06:46Z | |
dc.date.issued | 2012 | |
dc.identifier.citation | Davel, M.H. & Van Heerden, C.J., et al. 2012. Validating smartphone-collected speech corpora. In: International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012. | en_US |
dc.identifier.isbn | 978-1-86822-615-3 | |
dc.identifier.uri | http://hdl.handle.net/10394/13631 | |
dc.description | International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), Cape Town, South Africa, 7-9 May 2012 | en_US |
dc.description.abstract | We investigate the effectiveness with which the accuracy of a prompted speech corpus can be validated when minimal additional speech resources are available, and specifically when a language model in the target language is not available. We compare a word-based variant of Goodness of Pronunciation (GOP) with a phone-based dynamic programming (PDP) scoring technique. The first technique uses the acoustic likelihood ratio and the second the optimal alignment between an observed phone string (generated by a speech recogniser) and a reference phone string (obtained from a dictionary) to generate validation scores. We define a new technique to obtain a PDP scoring matrix in a data-driven fashion, examine different ways of using GOP for word scoring, and find that variants of both techniques provide results that are effective for corpus validation. | en_US |
dc.description.uri | http://www.mica.edu.vn/sltu | |
dc.description.uri | http://www.mica.edu.vn/sltu2012/index.php?pid=l2#listOfPapers | |
dc.language.iso | en | en_US |
dc.publisher | SLTU | en_US |
dc.subject | Speech corpora | en_US |
dc.subject | Corpus validation | en_US |
dc.subject | Goodness of pronunciation | en_US |
dc.subject | Phone-based dynamic programming scores | en_US |
dc.title | Validating smartphone-collected speech corpora | en_US |
dc.type | Other | en_US |
dc.contributor.researchID | 23607955 - Davel, Marelie Hattingh | |
dc.contributor.researchID | 11539151 - Van Heerden, Carel Jacobus | |
dc.contributor.researchID | 21021287 - Barnard, Etienne | |