Title: Automatic Correction of i/y Spelling in Czech ASR Output
Authors: Švec, Jan
Lehečka, Jan
Šmídl, Luboš
Ircing, Pavel
Citation: ŠVEC, J. LEHEČKA, J. ŠMÍDL, L. IRCING, P. Automatic Correction of i/y Spelling in Czech ASR Output. In: Text, Speech, and Dialogue: 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings. Cham: Springer, 2020. pp. 321-330. ISBN 978-3-030-58322-4, ISSN 0302-9743.
Issue Date: 2020
Publisher: Springer
Document type: conference paper
conferenceObject
URI: 2-s2.0-85091182120
http://hdl.handle.net/11025/43118
ISBN: 978-3-030-58322-4
ISSN: 0302-9743
Keywords in different language: Grammatical error correction, ASR, BERT
Abstract in different language: This paper presents the design and evaluation of a method that automatically corrects the spelling of i/y in Czech words at the output of an ASR decoder. After analyzing both the Czech grammar rules and the data, we decided to deal only with endings consisting of one of the consonants b/f/l/m/p/s/v/z followed by i/y, in both short and long forms. The correction is framed as a classification task in which each word belongs to the “i” class, the “y” class, or the “empty” class. Using the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) architecture, we were able to substantially improve the correctness of i/y spelling on both simulated and real ASR output. Since the majority of native Czech speakers see the misspelling of i/y in Czech texts as a blatant error, the corrected output greatly improves the perceived quality of the ASR system.
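The three-class framing described in the abstract can be illustrated with a small sketch. This record does not give the paper's actual labeling procedure, so the code below is only an assumption about how reference labels might be derived from correctly spelled text: a word whose ending is one of the listed consonants followed by i/y (short) or í/ý (long) gets the "i" or "y" label, and every other word gets "empty". The names `CANDIDATE_ENDING`, `label_word`, and `label_sentence` are hypothetical, and the BERT classifier itself is not shown.

```python
import re
from typing import List, Tuple

# Assumption: a candidate word ends in one of b/f/l/m/p/s/v/z followed by
# i/y in short (i, y) or long (í, ý) form. The paper may define the
# candidate set differently; this regex is illustrative only.
CANDIDATE_ENDING = re.compile(r"[bflmpsvz]([iyíý])$", re.IGNORECASE)


def label_word(word: str) -> str:
    """Return the hypothetical class of a correctly spelled word."""
    match = CANDIDATE_ENDING.search(word)
    if match is None:
        return "empty"  # word is outside the set of handled endings
    vowel = match.group(1).lower()
    return "i" if vowel in ("i", "í") else "y"


def label_sentence(sentence: str) -> List[Tuple[str, str]]:
    """Pair each whitespace-separated word with its class label."""
    return [(word, label_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    for word, label in label_sentence("ženy byly doma a muži byli venku"):
        print(word, label)
```

Under these assumptions, a classifier would be trained to predict such labels from context, and a predicted "i" or "y" would then be used to rewrite the final vowel of the recognized word.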
Rights: Full text is not accessible.
© Springer
Appears in Collections: Konferenční příspěvky / Conference papers (NTIS)
Konferenční příspěvky / Conference Papers (KKY)
OBD

Files in This Item:
File: Švec2020_Chapter_AutomaticCorrectionOfIYSpellin.pdf
Size: 251,46 kB
Format: Adobe PDF


Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/43118

