A convolutional neural artistic stylization algorithm for suppressing image distortion
SHEN Yu, YANG Qian, ZHANG Hongguo, WANG Lin
(School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)
Abstract: To address the distortion of semantic content and the blurring of foreground and background boundaries that occur during convolutional neural image style transfer, we propose a convolutional neural artistic stylization algorithm that suppresses image distortion. First, the VGG-19 network model is used to extract feature maps from the input content image and style image and to reconstruct the content and style. Then, the transformation from the input content and style images to the output image is constrained to a locally affine transformation in color space, and a Laplacian matting matrix is constructed from the local affine combinations of the RGB channels of the input image. For each output patch, the affine transformation maps the RGB values of the input image to the corresponding output position, which constrains the semantic content and controls the spatial layout. Finally, the synthesized image is superimposed on a white noise image and updated iteratively with the back propagation algorithm until the loss function is minimized, completing the image stylization. Experimental results show that the method generates images with distinct foreground and background edges and clear texture, suppresses semantic content distortion, achieves spatial constraint and color mapping in the transferred images, and produces visually satisfactory stylized images.
Key words: neural network; style transfer; deep learning; affine transformation
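The locally affine color constraint described in the abstract can be illustrated with a small NumPy sketch. It assumes the standard closed-form matting Laplacian built over 3x3 windows of the input RGB image; the abstract does not give the exact formulation, so the function names `matting_laplacian` and `affine_penalty`, the window radius `r`, and the regularizer `eps` are all hypothetical choices made for illustration. The dense matrix is only practical for tiny images.

```python
import numpy as np

def matting_laplacian(img, eps=1e-7, r=1):
    """Dense matting Laplacian of an (h, w, 3) image (assumed formulation).

    Each (2r+1)x(2r+1) window fully inside the image contributes
    I - (1 + d (cov + eps/|w| I)^-1 d^T) / |w| to the entries of its pixels,
    where d holds the mean-centered RGB values of the window.
    """
    h, w, c = img.shape
    n = h * w
    win = (2 * r + 1) ** 2
    idx = np.arange(n).reshape(h, w)
    M = np.zeros((n, n))
    for y in range(r, h - r):
        for x in range(r, w - r):
            ids = idx[y - r:y + r + 1, x - r:x + r + 1].ravel()
            patch = img[y - r:y + r + 1, x - r:x + r + 1].reshape(win, c)
            d = patch - patch.mean(axis=0)           # centered colors
            cov = d.T @ d / win                      # 3x3 window covariance
            inv = np.linalg.inv(cov + (eps / win) * np.eye(c))
            G = (1.0 + d @ inv @ d.T) / win
            M[np.ix_(ids, ids)] += np.eye(win) - G
    return M

def affine_penalty(M, out):
    """Sum over channels of v_c^T M v_c for an (h, w, c) output image."""
    V = out.reshape(-1, out.shape[-1])
    return float(np.trace(V.T @ M @ V))

# Small demonstration on random data (illustrative only).
rng = np.random.default_rng(0)
content = rng.random((6, 6, 3))
M = matting_laplacian(content)

# An output that is an affine function of the input colors is barely penalized,
# while an unrelated noise image is penalized heavily.
A = np.array([[0.6, 0.2, 0.1], [0.1, 0.7, 0.1], [0.2, 0.1, 0.5]])
recolored = content @ A.T + 0.05
noise = rng.random((6, 6, 3))
print(affine_penalty(M, recolored), affine_penalty(M, noise))
```

In a pipeline like the one the abstract describes, a term of this kind would presumably be added to the content and style losses and minimized by back propagation; the relative weighting of the terms is not stated here.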
Citation: SHEN Yu, YANG Qian, ZHANG Hongguo, et al. A convolutional neural artistic stylization algorithm for suppressing image distortion. Journal of Measurement Science and Instrumentation, 2021, 12(3): 287-294. DOI: 10.3969/j.issn.1674-8042.2021.03.006