DOI:
10.1609/hcomp.v1i1.13102
Abstract:
Large speech corpora with word-level transcriptions annotated for noises and disfluent speech are necessary for training automatic speech recognisers. Crowdsourcing is a lower-cost, faster-turnaround, highly scalable alternative for expert transcription and annotation. In this paper, we showcase our three-step crowdsourcing approach motivated by the importance of accurate transcriptions and annotations.