Improved target detection algorithm based on Faster-RCNN

BAI Chenshuai¹, WU Kaijun¹, WANG Dicong¹,², HUANG Tao¹, TAO Xiaomiao¹

(1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China)

Abstract: An improved target detection algorithm based on Faster-RCNN is proposed, in which the 3×3 convolution kernel of the Faster-RCNN network model is replaced by an asymmetric convolution block of 1×3+3×1+3×3. First, the residual network ResNet is used as the backbone of the algorithm to extract the feature map of the image. The feature map passes through the 1×3+3×1+3×3 convolution block and then through two 1×1 convolution kernels. Second, the regional proposal network (RPN) is used to obtain proposal boxes on the shared feature layer, the proposal boxes are mapped onto the last convolutional feature map, and the region of interest (RoI) pooling layer normalizes the anchor boxes of different sizes to a uniform size. Finally, the network is trained with the detection classification probability (Softmax loss) and the detection bounding-box regression (Smooth L1 loss). Experiments use the PASCAL_VOC data set. In terms of mean average precision (mAP), the proposed algorithm improves on the original Faster-RCNN algorithm by 0.38%, on the RetinaNet algorithm by 2.68%, and on the YOLOv4 algorithm by 3.41%.
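
The 1×3+3×1+3×3 block described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes a single channel, stride 1, zero padding, and no batch normalization, none of which are stated in the abstract, and the function names (`conv2d`, `asymmetric_block`, `fuse_kernels`) are hypothetical. The sketch shows the additivity of convolution on which asymmetric convolution blocks rely: summing the three branch outputs gives the same result as one 3×3 convolution whose middle row and middle column absorb the 1×3 and 3×1 kernels. Whether the authors fuse the branches this way at inference is not stated here.

```python
import random


def conv2d(image, kernel):
    """Stride-1 cross-correlation with zero padding (output size == input size)."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def asymmetric_block(image, k3x3, k1x3, k3x1):
    """Sum the outputs of the 3x3, 1x3 and 3x1 branches (batch norm omitted)."""
    a, b, c = conv2d(image, k3x3), conv2d(image, k1x3), conv2d(image, k3x1)
    h, w = len(image), len(image[0])
    return [[a[i][j] + b[i][j] + c[i][j] for j in range(w)] for i in range(h)]


def fuse_kernels(k3x3, k1x3, k3x1):
    """Fold the 1x3 kernel into the middle row and the 3x1 kernel into the middle column."""
    fused = [row[:] for row in k3x3]
    for j in range(3):
        fused[1][j] += k1x3[0][j]
    for i in range(3):
        fused[i][1] += k3x1[i][0]
    return fused


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equally sized grids."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
    k3x3 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
    k1x3 = [[rng.uniform(-1, 1) for _ in range(3)]]
    k3x1 = [[rng.uniform(-1, 1)] for _ in range(3)]
    branches = asymmetric_block(image, k3x3, k1x3, k3x1)
    fused = conv2d(image, fuse_kernels(k3x3, k1x3, k3x1))
    print("max difference:", max_abs_diff(branches, fused))
```

The two outputs agree up to floating-point rounding, so the extra branches add capacity along the kernel's central row and column during training without requiring a larger kernel afterwards.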

Key words: Faster-RCNN; target detection algorithm; asymmetric convolution block; regional proposal network; regional pooling layer
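
The two training terms named in the abstract, Softmax loss for classification and Smooth L1 loss for bounding-box regression, can be sketched in pure Python as follows. This is only an illustration under stated assumptions: the transition point `beta=1.0`, the balancing `weight`, the convention that class index 0 is background, and all function names are hypothetical choices, not values taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, target_index):
    """Cross-entropy of the softmax probabilities against the true class."""
    return -math.log(softmax(logits)[target_index])


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1: quadratic for small errors, linear for large ones (beta is assumed)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def detection_loss(logits, target_index, box_pred, box_target, weight=1.0):
    """Classification loss plus box regression loss for one region.

    Assumption: class 0 is background, so the box term is skipped for it.
    """
    reg = smooth_l1(box_pred, box_target) if target_index > 0 else 0.0
    return softmax_loss(logits, target_index) + weight * reg


if __name__ == "__main__":
    logits = [0.2, 2.5, -1.0]            # scores for background + 2 classes
    box_pred = [0.10, -0.20, 0.05, 1.80]  # predicted box offsets
    box_target = [0.00, 0.00, 0.00, 0.30]
    print(detection_loss(logits, 1, box_pred, box_target))
```

The quadratic region keeps gradients small near a correct box, while the linear region limits the influence of badly mislocalized proposals.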

Citation format: BAI Chenshuai, WU Kaijun, WANG Dicong, et al. Improved target detection algorithm based on Faster-RCNN. Journal of Measurement Science and Instrumentation, 2023, 14(4): 485-492. DOI: 10.3969/j.issn.1674-8042.2023.04.011
