Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments
| dc.contributor.author | De Villiers, Pieter | |
| dc.contributor.author | Barnard, Etienne | |
| dc.contributor.author | van Heerden, Charl J. | |
| dc.contributor.author | Jooste, Petri | |
| dc.date.accessioned | 2018-03-05T07:50:11Z | |
| dc.date.available | 2018-03-05T07:50:11Z | |
| dc.date.issued | 2014 | |
| dc.description.abstract | We present a study where standard semi-supervised training methods are applied in a resource-scarce environment to build lecture transcription systems. Experiments are conducted on two different corpora which one can expect to be available in resource-scarce environments. These include 1) speaker- and domain-specific data where a single South African English lecturer presents the “Operating Systems” course, and 2) Afrikaans speaker-independent and domain non-specific data collected from science and law courses. Different amounts of acoustic and language model data are used for training the respective models. We find that lecture transcription systems in resource-scarce environments can benefit substantially from semi-supervised training methods. We also describe a small, new corpus of spoken lectures which is freely available in the public domain. | en_US |
| dc.description.sponsorship | Multilingual Speech Technologies Group, North-West University, Vanderbijlpark 1900, South Africa | en_US |
| dc.identifier.citation | Pieter De Villiers, Etienne Barnard, Charl Van Heerden and Petri Jooste, “Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments”, in Proc. Annual Symp. Pattern Recognition Association of South Africa (PRASA), pp. 7-12, Cape Town, South Africa, 2014. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications] | en_US |
| dc.identifier.uri | https://www.semanticscholar.org/paper/Semi-Supervised-Training-for-Lecture-Transcription-Villiers-Barnard/410b436b1193c5b394f7905e3f0ebcda06edd19e | |
| dc.identifier.uri | http://hdl.handle.net/10394/26496 | |
| dc.language.iso | en | en_US |
| dc.publisher | Pattern Recognition Association of South Africa and Mechatronics International Conference | en_US |
| dc.subject | Lecture Transcription | en_US |
| dc.subject | Kaldi | en_US |
| dc.subject | Semi-supervised | en_US |
| dc.subject | Language Model | en_US |
| dc.subject | Resource-scarce | en_US |
| dc.title | Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments | en_US |
| dc.type | Presentation | en_US |
Files
Original bundle
- Name: devilliers-2014-lecture-transcription.pdf
- Size: 95.42 KB
- Format: Adobe Portable Document Format
- Description: devilliers-2014-lecture-transcription