NWU Institutional Repository

Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments

dc.contributor.author: De Villiers, Pieter
dc.contributor.author: Barnard, Etienne
dc.contributor.author: van Heerden, Charl J.
dc.contributor.author: Jooste, Petri
dc.date.accessioned: 2018-03-05T07:50:11Z
dc.date.available: 2018-03-05T07:50:11Z
dc.date.issued: 2014
dc.description.abstract: We present a study where standard semi-supervised training methods are applied in a resource-scarce environment to build lecture transcription systems. Experiments are conducted on two different corpora which one can expect to be available in resource-scarce environments. These include 1) speaker- and domain-specific data where a single South African English lecturer presents the “Operating Systems” course, and 2) Afrikaans speaker-independent and domain non-specific data collected from science and law courses. Different amounts of acoustic and language model data are used for training the respective models. We find that lecture transcription systems in resource-scarce environments can benefit substantially from semi-supervised training methods. We also describe a small, new corpus of spoken lectures which is freely available in the public domain. [en_US]
dc.description.sponsorship: Multilingual Speech Technologies Group, North-West University, Vanderbijlpark 1900, South Africa [en_US]
dc.identifier.citation: Pieter De Villiers, Etienne Barnard, Charl Van Heerden and Petri Jooste, “Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments”, in Proc. Annual Symp. Pattern Recognition Association of South Africa (PRASA), pp 7-12, Cape Town, South Africa, 2014. [http://engineering.nwu.ac.za/multilingual-speech-technologies-must/publications] [en_US]
dc.identifier.uri: https://www.semanticscholar.org/paper/Semi-Supervised-Training-for-Lecture-Transcription-Villiers-Barnard/410b436b1193c5b394f7905e3f0ebcda06edd19e
dc.identifier.uri: http://hdl.handle.net/10394/26496
dc.language.iso: en [en_US]
dc.publisher: Pattern Recognition Association of South Africa and Mechatronics International Conference [en_US]
dc.subject: Lecture Transcription [en_US]
dc.subject: Kaldi [en_US]
dc.subject: Semi-supervised [en_US]
dc.subject: Language Model [en_US]
dc.subject: Resource-scarce [en_US]
dc.title: Semi-Supervised Training for Lecture Transcription in Resource-Scarce Environments [en_US]
dc.type: Presentation [en_US]

Files

Original bundle

Name: devilliers-2014-lecture-transcription.pdf
Size: 95.42 KB
Format: Adobe Portable Document Format
Description: devilliers-2014-lecture-transcription

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission
Description: