Semantic segmentation of remote sensing image based on deep fusion networks and conditional random field
2020, Vol. 24, No. 3, pp. 254-264
Print publication date: 2020-03-07
DOI: 10.11834/jrs.20208298
Xiao C J, Li Y, Zhang H Q and Chen J. 2020. Semantic segmentation of remote sensing image based on deep fusion networks and conditional random field. Journal of Remote Sensing (Chinese), 24(3): 254-264
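To make full use of the rich detail and context information in remote sensing images and improve the accuracy of image semantic segmentation, this paper proposes a semantic segmentation method for remote sensing images that combines a deep fusion network with a conditional random field model. The method adds a deconvolutional fusion structure to the fully convolutional network framework to build a Deep Fusion Network (DFN). The deep network extracts multiscale features automatically, which avoids manual feature design and selection and improves the generalisation ability of the model. The deconvolutional fusion structure exploits this multiscale information by fusing shallow detail information with deep semantic information, which improves the processing accuracy of the model. A fully connected conditional random field then introduces spatial context information to locate boundaries better and produce the final semantic segmentation result. Experiments on a remote sensing image dataset show that: (1) as detail information from different scales is fused in, the edge contours of the result become more precise and closer to the label image; (2) after spatial context information is added, the edges of the segmentation result become finer and more accurate, and the accuracy is higher. The experiments indicate that the method effectively improves the accuracy of remote sensing image semantic segmentation and alleviates over-smoothing in the results.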
Image semantic segmentation divides an image into groups of pixel regions, each with a specific semantic meaning, and identifies the category of each region. In recent years, common semantic segmentation methods based on Convolutional Neural Networks (CNN) have realised pixel-to-pixel image semantic segmentation and avoid the manual design and selection of features required by traditional methods. However, because of the pooling operation and the lack of context information, the detailed information of images is neglected, so the precision of the final segmentation result is low and the segmentation edges are inaccurate. Therefore, this study proposes a semantic segmentation method for remote sensing images based on Deep Fusion Networks (DFN) combined with a conditional random field model.
The method first builds a DFN model by adding a deconvolutional fusion structure to a Fully Convolutional Network (FCN) framework. On the one hand, multiscale features are extracted by the deep network, which avoids the manual design and selection of features and improves the generalisation ability of the model. On the other hand, the deconvolutional fusion structure lets the model use this multiscale information: fusing shallow detail information with deep semantic information improves the processing accuracy of the model. Finally, a fully connected conditional random field is introduced to supplement spatial context information, so that boundaries are located precisely and the final semantic segmentation results are obtained.
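The fusion of deep semantic information with shallow detail information can be illustrated with a minimal sketch. It is not the DFN architecture itself: the function names `upsample2x` and `fuse` are hypothetical, nearest-neighbour upsampling stands in for a learned deconvolution layer, and element-wise addition (as in FCN-style skip connections) is assumed as the fusion operation, which the abstract does not specify.

```python
def upsample2x(score):
    """Nearest-neighbour 2x upsampling of a 2-D score map.

    Assumption: this replaces the learned deconvolution (transposed
    convolution) that a real network would use.
    """
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse(deep, shallow):
    """Upsample the coarse (deep) map and add it to the finer (shallow) map.

    Assumption: fusion by element-wise addition, FCN-style.
    """
    up = upsample2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shallow map must be exactly twice the size of the deep map")
    return [[u + s for u, s in zip(up_row, sh_row)]
            for up_row, sh_row in zip(up, shallow)]


# Toy single-class score maps: the deep map is coarse but semantically
# confident, the shallow map is fine-grained but weak.
deep = [[1.0, -1.0]]                    # 1 x 2
shallow = [[0.2, 0.0, 0.0, -0.2],
           [0.1, 0.3, -0.3, -0.1]]      # 2 x 4
fused = fuse(deep, shallow)             # 2 x 4 map at the finer resolution
```

The fused map keeps the sign (class decision) of the deep scores while the shallow scores modulate individual pixels, which is the intuition behind adding detail back at each scale.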
From this study, we can draw the following conclusions:
(1) As the depth of the fusion layer increases, detail information becomes more abundant, the semantic segmentation results become more refined, and the edge contours move closer to the label image;
(2) The fully connected conditional random field model synthesises the global and local information of the remote sensing image and further improves the efficiency and accuracy of the final semantic segmentation results.
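How a fully connected conditional random field brings in context can be sketched with naive mean-field inference on a toy 1-D "image". This is an illustration under stated assumptions, not the authors' implementation: the function `dense_crf_mean_field`, its parameters (`w`, `theta_pos`, `theta_col`), the single Gaussian appearance kernel, and the Potts label compatibility are all choices made here for brevity, and the O(N²) loop is only practical for a handful of pixels.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_crf_mean_field(unary, positions, colors, n_iters=5,
                         w=1.0, theta_pos=2.0, theta_col=0.5):
    """Naive mean-field inference for a fully connected CRF.

    unary[i][l] is the cost (negative log-probability) of label l at pixel i.
    Every pixel pair (i, j) is linked by a Gaussian kernel on position and
    colour; a Potts model penalises pairs that take different labels.
    """
    n, n_labels = len(unary), len(unary[0])
    # Pairwise affinities between all pixel pairs (zero on the diagonal).
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d_pos = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
                d_col = sum((a - b) ** 2 for a, b in zip(colors[i], colors[j]))
                k[i][j] = w * math.exp(-d_pos / (2 * theta_pos ** 2)
                                       - d_col / (2 * theta_col ** 2))
    # Initialise the marginals from the unary costs alone.
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # Affinity-weighted label mass received from all other pixels.
            msg = [sum(k[i][j] * q[j][l] for j in range(n)) for l in range(n_labels)]
            total_msg = sum(msg)
            # Potts penalty for label l: mass that neighbours put on other labels.
            energy = [unary[i][l] + (total_msg - msg[l]) for l in range(n_labels)]
            new_q.append(softmax([-e for e in energy]))
        q = new_q
    return q


# Toy row of 5 pixels: three dark pixels, then two bright ones.
positions = [(0,), (1,), (2,), (3,), (4,)]
colors = [(0.0,), (0.0,), (0.0,), (1.0,), (1.0,)]
# Per-pixel probability of label 0 from a hypothetical network; pixel 1 is noisy.
p0 = [0.9, 0.4, 0.9, 0.1, 0.1]
unary = [[-math.log(p), -math.log(1.0 - p)] for p in p0]

before = [0 if p > 0.5 else 1 for p in p0]
q = dense_crf_mean_field(unary, positions, colors)
labels = [row.index(max(row)) for row in q]
```

With these toy numbers, the isolated label at pixel 1 is pulled towards its similar-coloured neighbours, while the colour change between pixels 2 and 3 keeps the boundary in place. Practical systems use the efficient filtering-based inference of Krähenbühl and Koltun rather than this quadratic loop.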
Keywords: remote sensing image semantic segmentation; fully convolutional networks; conditional random field; fusion structure; deconvolution