Score normalization methods applied to topic identification

Skorkovská, Lucie; Zajíc, Zbyněk

Full metadata record

DC field	Value	Language
dc.contributor.author	Skorkovská, Lucie
dc.contributor.author	Zajíc, Zbyněk
dc.date.accessioned	2015-12-17T10:53:38Z
dc.date.available	2015-12-17T10:53:38Z
dc.date.issued	2014
dc.identifier.citation	SKORKOVSKÁ, Lucie; ZAJÍC, Zbyněk. Score normalization methods applied to topic identification. In: Text, speech and dialogue. Berlin: Springer, 2014, p. 133-140. (Lecture notes in computer science; 8655). ISBN 978-3-319-10815-5.	en
dc.identifier.isbn	978-3-319-10815-5
dc.identifier.uri	http://www.kky.zcu.cz/cs/publications/LucieSkorkovska_2014_ScoreNormalization
dc.identifier.uri	http://hdl.handle.net/11025/17046
dc.format	8 s.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer	en
dc.relation.ispartofseries	Lecture notes in computer science; 8655	en
dc.rights	© Lucie Skorkovská - Zbyněk Zajíc	cs
dc.subject	identifikace tématu	cs
dc.subject	multi-label klasifikace textu	cs
dc.subject	naivní bayesovská klasifikace	cs
dc.subject	normalizace skóre	cs
dc.title	Score normalization methods applied to topic identification	en
dc.title.alternative	Metody normalizace skóre použité pro identifikaci tématu	cs
dc.type	článek	cs
dc.type	article	en
dc.rights.access	openAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	Multi-label classification plays a key role in modern categorization systems. Its goal is to find the set of labels belonging to each data item. Unlike multi-class classification, where only the best topic is chosen, in multi-label document classification the classifier must decide whether a document does or does not belong to each topic in a predefined topic set. We use a generative classifier to tackle this task, but the problem with this approach is that a threshold for positive classification must be set. This threshold can vary from document to document depending on its content (words used, length of the document, ...). In this paper we use Unconstrained Cohort Normalization, originally proposed for the speaker identification/verification task, to robustly find the threshold defining the boundary between the correct and the incorrect topics of a document. In our earlier experiments we proposed a method for finding this threshold inspired by another normalization technique called World Model score normalization. A comparison of these normalization methods shows that better results can be achieved with Unconstrained Cohort Normalization.	en
dc.subject.translated	topic identification	en
dc.subject.translated	multi-label text classification	en
dc.subject.translated	naive bayes classification	en
dc.subject.translated	score normalization	en
dc.identifier.doi	10.1007/978-3-319-10816-2_17
dc.type.status	Peer-reviewed	en
Appears in collections:	Články / Articles (NTIS)
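
The abstract describes using Unconstrained Cohort Normalization to separate correct from incorrect topics, but this record does not give the formula. The sketch below is a hedged illustration only: it assumes the form of the normalization commonly used in speaker verification, where each log-score is reduced by the mean log-score of its best-scoring competitors. The function names, the cohort size of 3, the threshold of 0 and the example scores are all hypothetical and are not taken from the paper.

```python
def ucn_normalize(log_scores, cohort_size=3):
    """Normalize each topic's log-score by the mean of its top rivals.

    Assumed form (not taken from this record):
        normalized(T) = score(T) - mean(scores of the N best other topics)
    """
    if len(log_scores) < 2:
        raise ValueError("at least two topics are needed to form a cohort")
    normalized = {}
    for topic, score in log_scores.items():
        # Cohort = the best-scoring competing topics, excluding the topic itself.
        rivals = sorted(
            (s for t, s in log_scores.items() if t != topic), reverse=True
        )
        cohort = rivals[:cohort_size]
        normalized[topic] = score - sum(cohort) / len(cohort)
    return normalized


def select_topics(log_scores, cohort_size=3, threshold=0.0):
    """Keep the topics whose normalized score exceeds the (assumed) threshold."""
    normalized = ucn_normalize(log_scores, cohort_size)
    return sorted(t for t, s in normalized.items() if s > threshold)


# Hypothetical log-likelihoods of one document under five topic models.
example_scores = {
    "politics": -120.0,
    "economy": -122.0,
    "sport": -150.0,
    "culture": -155.0,
    "weather": -160.0,
}
selected = select_topics(example_scores)
```

With these made-up scores, only "economy" and "politics" stay above zero after normalization. Because the normalization is relative to the competing topics, one fixed threshold can be applied to documents of different lengths, which is the motivation stated in the abstract.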

Files attached to this record:

File	Description	Size	Format
LucieSkorkovska_2014_ScoreNormalization.pdf	Full text	188.76 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/17046

All records in DSpace are protected by copyright, all rights reserved.
