Tiny CNN for feature point description for document analysis: approach and dataset
A. Sheshkus 1,2,3, A. Chirvonaya 3,4, V.L. Arlazarov 2,3

Moscow Institute for Physics and Technology, 141701, Russia, Moscow Region, Dolgoprudny, Institutskiy per., 9;
Institute for Systems Analysis, Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, 117312, Moscow, Russia, pr. 60-letiya Oktyabrya, 9;
Smart Engines Service LLC, 117312, Moscow, Russia, pr. 60-letiya Oktyabrya, 9;
National University of Science and Technology "MISIS", 119049, Moscow, Russia, Leninskiy prospect, 4

DOI: 10.18287/2412-6179-CO-1016

Pages: 429-435.

Article language: English.

In this paper, we study the problem of feature point description in the context of document analysis and template matching. Our study shows that task-specific training data is required, especially when training a lightweight neural network intended for devices with limited computational resources. We construct and provide a dataset of photographed and synthetically generated images, together with a method for generating training patches from it. We demonstrate the effectiveness of this data by training a lightweight neural network and show how it performs in matching both general and document patches. The network was trained on the provided dataset and, for comparison, on the HPatches training dataset. For testing, we solve the tasks of the HPatches testing framework and a template matching task on two publicly available datasets containing various documents photographed against complex backgrounds: MIDV-500 and MIDV-2019.

Keywords:
feature points description, metrics learning, training dataset.

This work was supported by the Russian Foundation for Basic Research (projects 18-29-26033 and 19-29-09064).

Sheshkus A, Chirvonaya A, Arlazarov VL. Tiny CNN for feature point description for document analysis: approach and dataset. Computer Optics 2022; 46(3): 429-435. DOI: 10.18287/2412-6179-CO-1016.


