Automatická identifikace revizí textových dokumentů

Kupilík, Filip

Full metadata record

DC field	Value	Language
dc.contributor.advisor	Konopík Miloslav, Ing. Ph.D.
dc.contributor.author	Kupilík, Filip
dc.contributor.referee	Král Pavel, Doc. Ing. Ph.D.
dc.date.accepted	2017-8-29
dc.date.accessioned	2018-01-15T15:04:45Z	-
dc.date.available	2016-10-10
dc.date.available	2018-01-15T15:04:45Z	-
dc.date.issued	2017
dc.date.submitted	2017-6-29
dc.identifier	71953
dc.identifier.uri	http://hdl.handle.net/11025/27699
dc.description.abstract	Cílem této práce je navrhnout, vytvořit a otestovat algoritmus pro identifikaci revizí v množině textových dokumentů. V první části práce jsou zmapovány současné přístupy ve vyhledávání dokumentů a popsány stávající algoritmy pro identifikaci podobných dokumentů. Druhá část se zabývá návrhem a implementací algoritmu zaměřeného na detekci revizí, jehož úspěšnost je ověřena na vytvořené kolekci testovacích dokumentů. Výsledky získané z provedených experimentů jsou porovnány s výsledky vybraných stávajících algoritmů.	cs
dc.format	49 s. (71189 znaků)	cs
dc.format.mimetype	application/pdf
dc.language.iso	cs	cs
dc.publisher	Západočeská univerzita v Plzni	cs
dc.rights	Plný text práce je přístupný bez omezení.	cs
dc.subject	revize	cs
dc.subject	duplikáty	cs
dc.subject	apache lucene	cs
dc.subject	vyhledávání informací	cs
dc.subject	vektorový model	cs
dc.subject	kullback-leiblerova divergence	cs
dc.subject	rozpoznávání pojmenovaných entit	cs
dc.title	Automatická identifikace revizí textových dokumentů	cs
dc.title.alternative	Automated Identification of Revisions of Text Documents	en
dc.type	bakalářská práce	cs
dc.thesis.degree-name	Bc.	cs
dc.thesis.degree-level	Bakalářský	cs
dc.thesis.degree-grantor	Západočeská univerzita v Plzni. Fakulta aplikovaných věd	cs
dc.thesis.degree-program	Inženýrská informatika	cs
dc.description.result	Obhájeno	cs
dc.rights.access	openAccess	en
dc.description.abstract-translated	The goal of this thesis is to design, implement, and test an algorithm that identifies revisions within a set of text documents. The first part surveys current approaches to document retrieval and describes existing algorithms for identifying highly similar documents (near-duplicates). The second part covers the design and implementation of a new algorithm aimed at revision detection, whose success rate is verified on a purpose-built collection of test documents. The experimental results are compared with those of selected existing algorithms.	en
dc.subject.translated	revision	en
dc.subject.translated	duplicates	en
dc.subject.translated	apache lucene	en
dc.subject.translated	information retrieval	en
dc.subject.translated	vector space model	en
dc.subject.translated	kullback-leibler divergence	en
dc.subject.translated	named-entity recognition	en
Appears in collections:	Bakalářské práce / Bachelor´s works (KIV)

Files attached to this record:

File	Description	Size	Format
bakalarka.pdf	Full text of the thesis	532.26 kB	Adobe PDF
A14B0296P-hodnoceni.pdf	Supervisor's review	620.39 kB	Adobe PDF
A14B0296P-posudek.pdf	Opponent's review	435.01 kB	Adobe PDF
A14B0296P-obhajoba.pdf	Record of the thesis defense	200.48 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/27699

All records in DSpace are protected by copyright, all rights reserved.
