Full metadata record
DC Field | Value | Language
dc.contributor.author | Minárik, Matej |
dc.contributor.author | Burget, Radek |
dc.contributor.editor | Steinberger, Josef |
dc.contributor.editor | Zíma, Martin |
dc.contributor.editor | Fiala, Dalibor |
dc.contributor.editor | Dostal, Martin |
dc.contributor.editor | Nykl, Michal |
dc.date.accessioned | 2017-10-09T12:39:34Z |
dc.date.available | 2017-10-09T12:39:34Z |
dc.date.issued | 2017 |
dc.identifier.citation | STEINBERGER, Josef ed.; ZÍMA, Martin ed.; FIALA, Dalibor ed.; DOSTAL, Martin ed.; NYKL, Michal ed. Data a znalosti 2017: sborník konference, Plzeň, Hotel Angelo 5. - 6. října 2017. 1. vyd. Plzeň: Západočeská univerzita v Plzni, 2017, s. 227-231. ISBN 978-80-261-0720-0. | cs
dc.identifier.isbn | 978-80-261-0720-0 |
dc.identifier.uri | https://www.zcu.cz/export/sites/zcu/pracoviste/vyd/online/DataAZnalosti2017.pdf |
dc.identifier.uri | http://hdl.handle.net/11025/26368 |
dc.format | 5 s. | cs
dc.format.mimetype | application/pdf |
dc.language.iso | en | en
dc.publisher | Západočeská univerzita v Plzni | cs
dc.rights | © Západočeská univerzita v Plzni | cs
dc.subject | integrace webových dat | cs
dc.subject | extrakce informací | cs
dc.subject | strukturovaná extrakce záznamů | cs
dc.subject | segmentace stránek | cs
dc.subject | klasifikace obsahu | cs
dc.subject | mapování ontologií | cs
dc.title | Information extraction from the web by matching visual presentation patterns | en
dc.type | konferenční příspěvek | cs
dc.type | conferenceObject | en
dc.rights.access | openAccess | en
dc.type.version | publishedVersion | en
dc.description.abstract-translated | There is a large amount of data available on the Web. Data are often represented as text, enriched with tables, lists, images or other visual structures. These data are usually coded in HTML without any additional semantics, which makes them nigh impossible to automatically process and extract. There are approaches based on top-down document segmentation according to visual information and layout. We present a bottom-up approach which starts with the smallest consistent elements and matches the visual relationships among these elements to a pre-defined ontological structure of extracted records. This method considers not only the visual attributes of a particular segment, but also its position amongst other segments. | en
dc.subject.translated | web data integration | en
dc.subject.translated | information extraction | en
dc.subject.translated | structured record extraction | en
dc.subject.translated | page segmentation | en
dc.subject.translated | content classification | en
dc.subject.translated | ontology mapping | en
dc.type.status | Peer-reviewed | en
Appears in Collections: Data a znalosti 2017

Files in This Item:
File | Description | Size | Format
Minarik.pdf | Full text | 376.7 kB | Adobe PDF


Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/26368

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.