MAAUNet: Exploration of U-shaped encoding and decoding structure for semantic segmentation of medical image

SHAO Shuo1,2, GE Hongwei1,2

(1. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China; 2. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract: Medical image semantic segmentation faces multi-scale variation of segmentation targets, noise interference, coarse segmentation results, and slow training. To address these problems, a multi-scale residual U-shaped attention network with aggregation connections, MAAUNet (MultiRes aggregation attention UNet), is proposed based on UNet and MultiResUNet. Firstly, aggregation connections are introduced. The skip connections, which originally aggregated features only at the same level, are redesigned to aggregate features of different semantic scales at the decoder subnetwork, further narrowing the semantic gap that may exist across skip connections. Secondly, a convolutional block attention module is added after the multi-scale convolution module. It focuses and integrates features along the channel and spatial attention dimensions to adaptively refine the intermediate feature maps. Finally, the original multi-scale convolution block is improved. A serial convolution structure expands the convolution channels so that they complement one another and extract richer spatial features, while the residual connection is retained. The original block thus becomes a multi-channel convolution block, which enables the model to extract multi-scale spatial features. Experimental results show that MAAUNet is highly competitive on challenging datasets and delivers good segmentation performance and stability under multi-scale input and noise interference.

Key words: U-shaped attention network structure of MAAUNet; convolutional neural network; encoding-decoding structure; attention mechanism; medical image; semantic segmentation
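
The channel-then-spatial attention step mentioned in the abstract can be illustrated with a minimal sketch of the general convolutional block attention module (CBAM) design. This is not the authors' implementation: the function names, the reduction ratio `r`, the 3 x 3 spatial kernel, and the random weights are all assumptions chosen for illustration, and only the Python standard library is used so that the data flow stays visible.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(vec, w1, w2):
    """Two-layer perceptron with ReLU, shared by the avg- and max-pooled descriptors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(fmap, w1, w2):
    """Return one weight in (0, 1) per channel of a C x H x W feature map."""
    flat = [[v for row in ch for v in row] for ch in fmap]
    avg = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    mx = [max(ch) for ch in flat]              # global max pooling
    a, m = shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2)
    return [sigmoid(x + y) for x, y in zip(a, m)]

def spatial_attention(fmap, kernel):
    """Return an H x W map of weights in (0, 1).

    kernel has shape [2][k][k]: one k x k filter for the channel-wise
    average map and one for the channel-wise max map (zero padding).
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[p][di][dj] * plane[y][x]
            row.append(sigmoid(s))
        out.append(row)
    return out

def cbam(fmap, w1, w2, kernel):
    """Apply channel attention first, then spatial attention, to a C x H x W map."""
    cw = channel_attention(fmap, w1, w2)
    refined = [[[cw[c] * v for v in row] for row in ch] for c, ch in enumerate(fmap)]
    sw = spatial_attention(refined, kernel)
    return [[[sw[i][j] * ch[i][j] for j in range(len(ch[0]))] for i in range(len(ch))]
            for ch in refined]

# Hypothetical toy input: 4 channels of 5 x 5, reduction ratio 2, random weights.
random.seed(0)
C, H, W, r = 4, 5, 5, 2
fmap = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
kernel = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(2)]
refined = cbam(fmap, w1, w2, kernel)
print(len(refined), len(refined[0]), len(refined[0][0]))  # 4 5 5
```

Because both attention maps lie in (0, 1), the module only rescales the intermediate feature map and leaves its shape unchanged, so it can in principle be placed after a convolution block without altering the surrounding layers. In practice a tensor library with learned weights would replace these nested lists.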

Citation format: SHAO Shuo, GE Hongwei. MAAUNet: Exploration of U-shaped encoding and decoding structure for semantic segmentation of medical image. Journal of Measurement Science and Instrumentation, 2022, 13(4): 418-429. DOI: 10.3969/j.issn.1674-8042.2022.04.005
