Title: | Training Strategies for OCR Systems for Historical Documents |
Other Titles: | Strategie trénování OCR systému pro historické dokumenty |
Authors: | Martínek, Jiří Lenc, Ladislav Král, Pavel |
Citation: | HRUDA, L., DVOŘÁK, J., VÁŠA, L. On evaluating consensus in RANSAC surface registration. Computer Graphics Forum, 2019, roč. 38, č. 5, s. 175-186. ISSN 1467-8659. |
Issue Date: | 2019 |
Publisher: | Blackwell Publishing |
Document type: | article |
URI: | 2-s2.0-85070444843 http://hdl.handle.net/11025/35853 |
ISSN: | 1868-4238 |
Keywords: | Konvoluční neuronová síť;historické dokumenty;Long Short-Term Memory;Neuronová síť;optické rozpoznávání znaků;syntetická data |
Keywords in different language: | Convolutional Neural Network;Historical documents;Long Short-Term Memory;Neural Network;Optical Character Recognition;Synthetic data |
Abstract: | This article presents an overview of training strategies for an optical character recognition system for historical documents. The main problem is the quality of the annotated data and its scarcity. We also summarize several ways of creating synthetic data. The main goal of the article is to show and compare different options for combining synthetic and annotated data when training a convolutional recurrent neural network. |
Abstract in different language: | Random Sample Consensus is a powerful paradigm that was successfully applied in various contexts, including Location Determination Problem, fundamental matrix estimation and global 3D surface registration, where many previously proposed algorithms can be interpreted as a particular implementation of this concept. In general, a set of candidate transformations is generated by some simple procedure, and an aligning transformation is chosen within this set, such that it aligns the largest portion of the input data. We observe that choosing the aligning transformation may also be interpreted as finding consensus among the candidates, which in turn involves measuring similarity of candidate rigid transformations. While it is not difficult to construct a metric that provides reasonable results, most approaches come with certain limitations and drawbacks. In this paper, we investigate possible means of measuring distances in SE(3) and compare their properties both theoretically and experimentally in a model RANSAC registration algorithm. We also propose modifications to existing measures and propose a novel method of locating the consensus transformation based on Vantage Point Tree data structure. |
Rights: | The full text is accessible within the university to logged-in users. © Blackwell Publishing |
Appears in Collections: | Články / Articles (NTIS) Články / Articles (KIV) OBD |
Files in This Item:
File | Size | Format | |
---|---|---|---|
aiai_2019_jiri_Martinek 2.pdf | 1,52 MB | Adobe PDF | |
Please use this identifier to cite or link to this item:
http://hdl.handle.net/11025/35839
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.