WANG Xiangjun1,2, LIU Linghao1,2, NI Yubo1, WANG Lin1
(1. State Key Laboratory of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China;
2. MOEMS Education Ministry Key Laboratory, Tianjin University, Tianjin 300072, China)
Abstract: For traffic object detection in foggy environments based on convolutional neural networks (CNNs), datasets collected in fog-free conditions are generally used to train the network directly. As a result, the network cannot learn the characteristics of objects in foggy conditions from the training set, and detection performance is poor. To improve traffic object detection in foggy environments, we propose a method for generating foggy images from fog-free images from the perspective of dataset construction. First, taking the KITTI object detection dataset as the source of fog-free images, we generate the depth image of each original image using an improved Monodepth unsupervised depth estimation method. Then, a geometric prior depth template is constructed and fused with the depth image, with the image entropy taken as the weight. After that, a foggy image is obtained from the depth image based on the atmospheric scattering model. Finally, we take two typical object detection frameworks, namely the two-stage Faster region-based convolutional neural network (Faster R-CNN) and the one-stage network YOLOv4, and train them on the original dataset, the foggy dataset and the mixed dataset, respectively. According to the test results on the RESIDE-RTTS dataset of outdoor natural foggy scenes, the models trained on the mixed dataset show the best performance: the mean average precision (mAP) values increase by 5.6% for YOLOv4 and by 5.0% for Faster R-CNN. This proves that the proposed method can effectively improve object detection ability in foggy environments.
Key words: traffic object detection; foggy image generation; unsupervised depth estimation; YOLOv4 model; Faster region-based convolutional neural network (Faster R-CNN)
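The fog-synthesis step summarized in the abstract rests on the standard atmospheric scattering model, I(x) = J(x)t(x) + A(1 − t(x)) with transmission t(x) = exp(−βd(x)), where d(x) is the scene depth. The following is a minimal sketch of that step plus an image-entropy weight; the parameter values (`beta`, `airlight`) and the exact entropy normalization are illustrative assumptions, not the settings used in the paper:

```python
import numpy as np

def entropy_weight(img_gray, bins=256):
    """Shannon entropy (in bits) of a grayscale image; the paper uses
    image entropy as the fusion weight for its depth template."""
    hist, _ = np.histogram(img_gray, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def add_fog(img, depth, beta=0.05, airlight=0.9):
    """Synthesize fog via the atmospheric scattering model:
    I(x) = J(x) * t(x) + A * (1 - t(x)),  t(x) = exp(-beta * d(x)).
    img: H x W x 3 floats in [0, 1]; depth: H x W depth map."""
    t = np.exp(-beta * depth)[..., None]   # per-pixel transmission
    return img * t + airlight * (1.0 - t)  # blend scene with airlight

# Toy example: a uniform mid-gray scene whose depth grows left to right;
# distant pixels are pulled toward the airlight value.
img = np.full((4, 4, 3), 0.5)
depth = np.tile(np.linspace(0.0, 60.0, 4), (4, 1))
foggy = add_fog(img, depth)
```

A constant image has zero entropy under this definition, and pixels at zero depth pass through unchanged, which gives a quick sanity check on both functions.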
A method to generate foggy images based on unsupervised depth estimation
WANG Xiangjun1,2, LIU Linghao1,2, NI Yubo1, WANG Lin1
(1. State Key Laboratory of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China; 2. MOEMS Education Ministry Key Laboratory, Tianjin University, Tianjin 300072, China)
Abstract: CNN-based visual object detection in hazy environments is usually trained directly on clear, fog-free datasets, so the network cannot acquire the feature weight configuration of objects in hazy images from the training set, and detection performance suffers. To improve object detection in hazy environments, we propose, from the perspective of dataset construction, a method for generating foggy images from fog-free images. First, the KITTI object detection dataset is taken as the source of fog-free images, and depth images of the originals are generated with an improved Monodepth unsupervised depth estimation method. Then a geometric prior depth template is constructed and fused with the depth map using the image entropy as the weight, and a foggy image is obtained from the depth image according to the atmospheric scattering model. Finally, two typical object detection architectures, the two-stage Faster R-CNN and the one-stage YOLOv4, are trained on the original, foggy and mixed datasets, and tested on the RESIDE-OTS dataset of outdoor natural haze scenes. The experimental results show that the models trained on the mixed dataset perform best, with the mAP increasing by 5.6% for YOLOv4 and by 5.0% for Faster R-CNN, effectively improving the object recognition ability of convolutional neural networks in hazy environments.
Key words: traffic object detection; foggy image generation; unsupervised depth estimation; YOLOv4 model; Faster R-CNN
引用格式:WANG Xiangjun, LIU Linghao, NI Yubo, et al. A method to generate foggy optical images based on unsupervised depth estimation. Journal of Measurement Science and Instrumentation, 2021, 12(1): 44-52. DOI: 10.3969/j.issn.1674-8042.2021.01.006