Full metadata record
DC field | Value | Language
dc.contributor.author | Kolář, Jáchym
dc.contributor.author | Švec, Jan
dc.date.accessioned | 2016-01-08T07:03:40Z
dc.date.available | 2016-01-08T07:03:40Z
dc.date.issued | 2009
dc.identifier.citation | KOLÁŘ, Jáchym; ŠVEC, Jan. The Czech broadcast conversation corpus. In: Text, speech and dialogue. Berlin: Springer, 2009, p. 101-108. (Lecture notes in computer science; 5729). ISBN 978-3-642-04207-2. | en
dc.identifier.isbn | 978-3-642-04207-2
dc.identifier.uri | http://www.kky.zcu.cz/cs/publications/JachymKolar_2009_TheCzechBroadcast
dc.identifier.uri | http://hdl.handle.net/11025/17175
dc.format | 9 s. | cs
dc.format.mimetype | application/pdf
dc.language.iso | en | en
dc.publisher | Springer | en
dc.relation.ispartofseries | Lecture notes in computer science; 5729 | en
dc.rights | © Jáchym Kolář - Jan Švec | cs
dc.subject | rozhlasové zprávy | cs
dc.subject | rozpoznávání řeči | cs
dc.subject | lingvistická analýza | cs
dc.title | The Czech broadcast conversation corpus | en
dc.type | článek | cs
dc.type | article | en
dc.rights.access | openAccess | en
dc.type.version | publishedVersion | en
dc.description.abstract-translated | This paper presents the final version of the Czech Broadcast Conversation Corpus that will shortly be released at the Linguistic Data Consortium (LDC). The corpus contains 72 recordings of a radio discussion program, which yield about 33 hours of transcribed conversational speech from 128 speakers. The release includes not only verbatim transcripts and speaker information, but also structural metadata (MDE) annotation that involves labeling of sentence-like unit boundaries, marking of non-content words such as filled pauses and discourse markers, and annotation of speech disfluencies. The MDE annotation is based on the LDC's annotation standard for English, with changes applied to accommodate phenomena that are specific to Czech. In addition to its importance to speech recognition, speaker diarization, and structural metadata extraction research, the corpus is also useful for linguistic analysis of conversational Czech. | en
dc.subject.translated | broadcast news | en
dc.subject.translated | speech recognition | en
dc.subject.translated | linguistic analysis | en
dc.type.status | Peer-reviewed | en
Appears in collections: Články / Articles (KKY)

Files in this item:
File | Description | Size | Format
JachymKolar_2009_TheCzechBroadcast.pdf | Full text | 179.85 kB | Adobe PDF


Use this identifier to cite or link to this item: http://hdl.handle.net/11025/17175

All items in DSpace are protected by copyright, with all rights reserved.