Automatická klasifikace textových dokumentů

Černá, Veronika

Full metadata record

DC field	Value	Language
dc.contributor.advisor	Král, Pavel
dc.contributor.author	Černá, Veronika
dc.date.accepted	2012-06-07
dc.date.accessioned	2013-06-19T06:42:08Z	-
dc.date.available	2012-02-03	cs
dc.date.available	2013-06-19T06:42:08Z	-
dc.date.issued	2012
dc.date.submitted	2012-05-11
dc.identifier	49895
dc.identifier.uri	http://hdl.handle.net/11025/5497
dc.description.abstract	Tato práce se zabývá automatickou klasifikací textových dokumentů, jejímž cílem je přiřadit dokumentům kategorii z nějaké předdefinované množiny kategorií. Pro experimenty byly vybrány tři známé klasifikační techniky: naivní Bayesův klasifikátor, support vector machines a maximální entropie. K předzpracování dat byly použity lemmatizátor a POS-tagger a na základě různých kritérií pro výběr příznaků pak byly vytvořeny 4 sady dokumentů. Všechny experimenty byly prováděny na českém korpusu nástrojem MinorThird.	cs
dc.format	47 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	cs	cs
dc.publisher	Západočeská univerzita v Plzni	cs
dc.relation.isreferencedby	https://portal.zcu.cz/StagPortletsJSR168/CleanUrl?urlid=prohlizeni-prace-detail&praceIdno=49895	-
dc.rights	The full text of the thesis is accessible without restriction.	cs
dc.subject	klasifikace dokumentů	cs
dc.subject	naivní Bayesův klasifikátor	cs
dc.subject	support vector machines	cs
dc.subject	maximální entropie	cs
dc.title	Automatická klasifikace textových dokumentů	cs
dc.title.alternative	Automatic Text Document Classification	en
dc.type	bachelor's thesis	cs
dc.thesis.degree-name	Bc.	cs
dc.thesis.degree-level	Bachelor's	cs
dc.thesis.degree-grantor	Západočeská univerzita v Plzni. Fakulta aplikovaných věd	cs
dc.description.department	Katedra informatiky a výpočetní techniky	cs
dc.thesis.degree-program	Inženýrská informatika	cs
dc.description.result	Defended	cs
dc.rights.access	openAccess	en
dc.description.abstract-translated	This work deals with automatic text document classification. Text classification is a process of labelling documents with thematic categories from a predefined set of categories. Three known classification techniques were chosen for experiments in this work: naive Bayes, support vector machines and maximum entropy. A lemmatizer and a POS-tagger were used for the text pre-processing. Four sets of documents were created based on the different feature selection criteria. All experiments were performed on the Czech corpus using the MinorThird toolkit.	en
dc.subject.translated	document classification	en
dc.subject.translated	naive Bayes	en
dc.subject.translated	support vector machines	en
dc.subject.translated	maximum entropy	en
Appears in collections:	Bakalářské práce / Bachelor´s works (KIV)

Files attached to this record:

File	Description	Size	Format
Automaticka klasifikace textovych dokumentu.pdf	Full text of the thesis	462.87 kB	Adobe PDF
A10B0834Pposudek.pdf	Supervisor's review	438.15 kB	Adobe PDF
A10B0834Pprubeh.pdf	Record of the thesis defense	53.01 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/5497

All records in DSpace are protected by copyright, all rights reserved.
