Instance segmentation of objects in images using deep learning and synthetic data
G.A. Algashev 1, E.V. Gorbunov 1, I.A. Kilbas 1, R.A. Paringer 1, A.V. Kupriyanov 1

Samara National Research University,
443086, Samara, Russia, Moskovskoye Shosse 34

DOI: 10.18287/2412-6179-CO-1656

Pages: 1037-1046.

Full text of article: in Russian.

Abstract:
The paper addresses instance segmentation of objects in images using modern deep learning models and synthetic data. It focuses on how effective synthetic data generated from 3D models is for pre-training segmentation models. The architectures considered are U-Net, DeepLabV3+, Mask R-CNN and YOLOv8. To improve the quality of the synthetic data, several parameters of the automatic generation process were varied: random object positioning, added backgrounds, lighting changes, object texture changes, added blur and added obstacles. The experiments showed that each of these steps contributes significantly to model accuracy, and that their combination gives the best result (mAP 92.1%). The results confirm that combining synthetic and real data makes it possible to bridge the gap between the synthetic and real environments. The best performance was achieved by YOLOv8, which demonstrated both high accuracy and high processing speed. The findings highlight the importance of carefully tuning the parameters of synthetic data generation to improve segmentation in real-world applications.
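
The generation steps listed in the abstract can be pictured as a per-image sampler of scene settings, with each step behind a switch so that its contribution can be measured separately. The following is only a minimal sketch under stated assumptions: the names `SceneParams` and `sample_scene`, the pool sizes and all numeric ranges are hypothetical and are not taken from the paper, and the actual rendering from 3D models is not shown.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SceneParams:
    """Randomized settings for one synthetic image (all fields are illustrative)."""
    position: Tuple[float, float]   # object centre, normalized to [0, 1)
    background_id: Optional[int]    # index into a hypothetical background pool
    light_intensity: float          # relative brightness multiplier
    texture_id: int                 # index into a hypothetical texture pool
    blur_sigma: float               # blur strength in pixels, 0.0 means no blur
    num_occluders: int              # number of obstacles placed over the object


def sample_scene(rng: random.Random, *, backgrounds: bool = True,
                 lighting: bool = True, texture: bool = True,
                 blur: bool = True, occlusion: bool = True,
                 n_backgrounds: int = 10, n_textures: int = 5) -> SceneParams:
    """Draw one scene configuration; a disabled step falls back to a fixed default."""
    return SceneParams(
        # Random positioning is always applied in this sketch.
        position=(rng.random(), rng.random()),
        background_id=rng.randrange(n_backgrounds) if backgrounds else None,
        # The ranges below are assumptions chosen only for illustration.
        light_intensity=rng.uniform(0.5, 1.5) if lighting else 1.0,
        texture_id=rng.randrange(n_textures) if texture else 0,
        blur_sigma=rng.uniform(0.0, 2.0) if blur else 0.0,
        num_occluders=rng.randint(0, 3) if occlusion else 0,
    )


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the generated set is reproducible
    for _ in range(3):
        print(sample_scene(rng))
```

Disabling one keyword at a time, for example `sample_scene(rng, blur=False)`, gives the kind of step-by-step comparison the abstract describes, while the seeded generator keeps each generated set reproducible.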

Keywords:
instance segmentation of objects, object segmentation, deep learning, convolutional neural networks, synthetic data, neural network models, computer vision, learning without manual labeling.

Citation:
Algashev GA, Gorbunov EV, Kilbas IA, Paringer RA, Kupriyanov AV. Instance segmentation of objects in images using deep learning and synthetic data. Computer Optics 2025; 49(6): 1037-1046. DOI: 10.18287/2412-6179-CO-1656.

Acknowledgements:
The research was carried out within the state assignment theme FSSS-2023-0006.
