Research on Convolution Neural Network for Object Detection and Recognition
DOI:
https://doi.org/10.54097/LNATo9Lz
Keywords:
Convolution Neural Network, Object Detection and Recognition, Faster RR-CNN, PASCAL VOC 2012
Abstract
State-of-the-art object detection networks for natural images have recently demonstrated impressive performance. However, the complexity of object shapes and orientations exposes the limited capacity of these networks to detect strip-like, rotated, assembled objects, which are common in datasets as well as in real images. In this project, I build on this observation and introduce the Faster Rotated Region-based Convolutional Neural Network (Faster RR-CNN), which can learn and accurately extract features of rotated regions and arbitrarily oriented objects. In comparison with the classic Faster R-CNN, Faster RR-CNN has three important new components: a skew non-maximum suppression, a rotated bounding box regression model, and a rotated region of interest (RRoI) pooling layer. I conduct experiments on the PASCAL VOC 2012 dataset, demonstrating the potential of this network for detecting oriented objects.
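To make the first of these components concrete, the following is a minimal sketch of a greedy skew non-maximum suppression over rotated boxes. It is an illustration under stated assumptions, not the implementation used in this work: the box encoding `(cx, cy, w, h, angle_deg)`, the function names, and the 0.5 threshold are all hypothetical, and the overlap is computed with plain Sutherland-Hodgman polygon clipping rather than any particular library routine.

```python
import math

def box_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated box in counter-clockwise order (assumed encoding)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(poly):
    """Shoelace formula; degenerate polygons have zero area."""
    if len(poly) < 3:
        return 0.0
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping of `subject` by a counter-clockwise convex `clip`."""
    output = subject
    for (ax, ay), (bx, by) in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        inputs, output = output, []

        def side(p):
            # Positive when p lies to the left of the directed edge a -> b.
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        for p, q in zip(inputs, inputs[1:] + inputs[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def rotated_iou(box_a, box_b):
    """Intersection over union of two rotated boxes."""
    pa, pb = box_corners(*box_a), box_corners(*box_b)
    inter = polygon_area(clip_convex(pa, pb))
    union = polygon_area(pa) + polygon_area(pb) - inter
    return inter / union if union > 0 else 0.0

def skew_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it overlaps no higher-scoring kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(rotated_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 4, 2, 0), (0.2, 0, 4, 2, 0), (0, 0, 4, 2, 90)]
scores = [0.9, 0.8, 0.7]
kept = skew_nms(boxes, scores)
```

In this toy input the second box is a slightly shifted copy of the first and is suppressed, while the third box shares the same center but is rotated by 90 degrees, so its overlap with the first stays below the threshold and it survives. Published skew NMS variants may add angle-dependent rules on top of this greedy scheme.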
References
X. Chen and A. Gupta. An implementation of faster rcnn with study for region sampling. arXiv preprint arXiv:1702.02138, 2017.
R. Girshick. Fast r-cnn. In Proceedings of the IEEE international conference on computer vision, pages 1440–1448, 2015.
R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 580–587, 2014.
K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask r-cnn. arXiv preprint arXiv:1703.06870, 2017.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten. Densely connected convolutional networks. arXiv preprint arXiv:1608.06993, 2016.
A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
J. Dai, Y. Li, K. He, and J. Sun. R-fcn: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems, pages 379–387, 2016.
T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie. Feature pyramid networks for object detection. arXiv preprint arXiv:1612.03144, 2016.
W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. Ssd: Single shot multibox detector. In European conference on computer vision, pages 21–37. Springer, 2016.
Z. Liu, J. Hu, L. Weng, and Y. Yang. Rotated region based cnn for ship detection.
J. Ma, W. Shao, H. Ye, L. Wang, H. Wang, Y. Zheng, and X. Xue. Arbitrary-oriented scene text detection via rotation proposals. arXiv preprint arXiv:1703.01086, 2017.
J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.
S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
D. Stutz. Understanding convolutional neural networks. In Seminar Report, Fakultät für Mathematik, Informatik und Naturwissenschaften, Lehr- und Forschungsgebiet Informatik VIII Computer Vision, 2014.
J. R. Uijlings, K. E. Van De Sande, T. Gevers, and A. W. Smeulders. Selective search for object recognition. International journal of computer vision, 104(2):154–171, 2013.
C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. In European Conference on Computer Vision, pages 391–405. Springer, 2014.
License
Copyright (c) 2024 Frontiers in Computing and Intelligent Systems

This work is licensed under a Creative Commons Attribution 4.0 International License.