Reducing background false positives for face detection in surveillance feeds
A.E. Sergeev, A.S. Konushin, V.S. Konushin

National Research University Higher School of Economics, Moscow, Russia,
Video Analysis Technologies LLC, Moscow, Russia

The full text of the article is in Russian.


This paper addresses the problem of filtering false positive detections in surveillance video streams. We propose two methods. The first automatically mines hard negatives from a video stream and uses them to fine-tune the baseline detector. The second filters the detector output by analyzing how frequently visually similar samples are detected. We demonstrate the proposed methods on cascade-based detectors, but they can be applied to any detector that can be trained in a reasonable amount of time. Experimental results show that the proposed methods improve both precision and recall and reduce computation time by 47%.
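The idea behind the second method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how visual similarity is measured or which thresholds are used. The sketch assumes that each detection comes with a descriptor vector, that cosine similarity stands in for "visually similar", and that an appearance detected too often is background. The class name `FrequencyFilter` and the parameters `sim_threshold` and `max_count` are hypothetical.

```python
import numpy as np


class FrequencyFilter:
    """Hypothetical sketch: suppress detections whose appearance recurs too often."""

    def __init__(self, sim_threshold=0.9, max_count=3):
        self.sim_threshold = sim_threshold  # assumed cosine-similarity cutoff
        self.max_count = max_count          # sightings allowed before suppression
        self.prototypes = []                # unit-norm descriptors seen so far
        self.counts = []                    # number of sightings per prototype

    def accept(self, descriptor):
        """Return True to keep the detection, False to treat it as background."""
        v = np.asarray(descriptor, dtype=float)
        norm = np.linalg.norm(v)
        if norm == 0.0:
            raise ValueError("descriptor must be non-zero")
        v = v / norm
        # Look for the first stored prototype that is similar enough.
        for i, p in enumerate(self.prototypes):
            if float(np.dot(v, p)) >= self.sim_threshold:
                self.counts[i] += 1
                return self.counts[i] <= self.max_count
        # Unseen appearance: remember it and let the detection through.
        self.prototypes.append(v)
        self.counts.append(1)
        return True


# A descriptor that keeps reappearing is suppressed after max_count sightings.
flt = FrequencyFilter(sim_threshold=0.9, max_count=2)
recurring = [1.0, 0.0, 0.0]
decisions = [flt.accept(recurring) for _ in range(4)]
print(decisions)  # [True, True, False, False]
```

In this sketch a static background patch that triggers the detector in every frame is rejected after its first few sightings, while a new appearance is passed through. A practical system would also need to expire old prototypes, which the sketch omits.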

detectors, pattern recognition, image analysis, machine vision algorithms.

Sergeev AE, Konushin AS, Konushin VS. Reducing background false positives for face detection in surveillance feeds. Computer Optics 2016; 40(6): 958-967. DOI: 10.18287/2412-6179-2016-40-6-958-967.

