Full metadata record
DC Field | Value | Language
dc.contributor.author | Munk, Michal
dc.contributor.author | Benko, Ľubomír
dc.contributor.author | Gangur, Mikuláš
dc.contributor.author | Turčáni, Milan
dc.identifier.citation | E+M. Ekonomie a Management = Economics and Management. 2015, č. 3, s. 144-159. | cs
dc.identifier.issn | 2336-5604 (Online)
dc.identifier.issn | 1212-3609 (Print)
dc.format | 16 s.
dc.publisher | Technická univerzita v Liberci | cs
dc.relation.ispartofseries | E+M. Ekonomie a Management = Economics and Management | cs
dc.rights | © Technická univerzita v Liberci | cs
dc.rights | CC BY-NC 4.0 | cs
dc.subject | web usage mining | cs
dc.subject | předzpracování dat | cs
dc.subject | pomocné stránky | cs
dc.subject | referenční délka | cs
dc.subject | zaevidované soubory | cs
dc.subject | obchodní inteligence | cs
dc.subject | data mining | cs
dc.title | Influence of ratio of auxiliary pages on the pre-processing phase of web usage mining | en
dc.description.abstract-translated | Data mining is one of the important tools of Business Intelligence and a means of increasing a company's competitiveness. Web usage mining applies data mining to web server log files in order to analyze users' behavior on a web site. The first step of the web usage mining process is pre-processing the data obtained from a web log file. Because the discovery of visitors' behavior patterns depends on the quality of the pre-processing phase, it is important to understand the methods used. This paper summarizes the pre-processing phases, with particular attention to session identification. Two algorithms are introduced for data cleaning and session identification using the Reference Length method. The main aim of the paper is to compare ways of calculating the cutoff time, an important parameter of session identification with the Reference Length method, and their influence on the discovered useful, trivial and inexplicable rules. The ratio of auxiliary pages was determined in two ways, from a sitemap and by subjective estimation, and its influence on the calculation was compared. Statistical methods were used to determine the difference between these two approaches. The portion of rules found was examined in terms of quantity and quality. The ratio of auxiliary pages affects the quantity of extracted rules only in the files with path completion. It has no impact on the portion of extracted useful rules; on the other hand, an inappropriate estimate of the ratio of auxiliary pages may lead to an increase in trivial and inexplicable rules. | en
dc.subject.translated | web usage mining | en
dc.subject.translated | data pre-processing | en
dc.subject.translated | auxiliary pages | en
dc.subject.translated | reference length | en
dc.subject.translated | log files | en
dc.subject.translated | business intelligence | en
dc.subject.translated | data mining | en
Appears in Collections: Číslo 3 (2015)
Články / Articles (KEM)

Files in This Item:
File | Description | Size | Format
13_INFLUENCE OF RATIO OF AUXILIARY.pdf | Full text | 1.43 MB | Adobe PDF

Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/17632
