Building segmentation in high-resolution remote sensing image through deep neural network and conditional random fields
2019, Vol. 23, No. 6, pp. 1194–1208
Print publication date: 2019-11
Accepted: 2018-6-15
DOI: 10.11834/jrs.20198141
Wang Y, Yang Y, Wang B S, Wang T, Bu X H and Wang C Y. 2019. Building segmentation in high-resolution remote sensing image through deep neural network and conditional random fields. Journal of Remote Sensing, 23(6): 1194–1208
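The essence of building segmentation in high-resolution remote sensing images is to construct a high-dimensional, strongly nonlinear mapping from the input image to the segmentation result. Buildings may be spread across the entire image, however, so during semantic segmentation the current pixel may be directly related to pixels outside its neighborhood. To approximate the true segmentation mapping more accurately, overcome the influence of roads, staggered building floors, and shadows, and improve segmentation accuracy, this paper builds an encoder-decoder deep learning architecture on a deep residual neural network, which extracts building features automatically and learns the high-dimensional, strongly nonlinear segmentation model. In addition, the pairwise potential function of a conditional random field adjusts the relationship between the current pixel and the other pixels, forming a fully connected conditional random field that refines the encoder-decoder output and raises segmentation accuracy. Within the fully connected conditional random field, the mean-field computation is carried out through the operating mechanism of a recurrent neural network. This integrates the conditional random field with the deep neural network and allows the parameters of the encoder-decoder and the fully connected conditional random field to be trained simultaneously. Experimental results show that the proposed deep neural network and conditional random field method effectively overcomes the influence of roads, staggered floors, and shadows and improves the accuracy of building segmentation in high-resolution remote sensing images. Within a certain range, it also generalizes well to remote sensing images of different resolutions.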
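For orientation, a fully connected conditional random field of the kind described above is commonly written in the form below. This is a sketch that assumes the paper follows the standard dense-CRF formulation of Krähenbühl and Koltun (2012), as unrolled into recurrent steps by Zheng et al. (2015). The symbols ($\psi_u$, $\psi_p$, $\mu$, $w^{(m)}$, $k^{(m)}$, $\mathbf{f}_i$, $Q_i$) are notation chosen here and are not taken from the paper.

```latex
\begin{align}
% Gibbs energy over the labeling x = (x_1, ..., x_N): unary terms plus all pixel pairs
E(\mathbf{x}) &= \sum_{i} \psi_u(x_i) + \sum_{i<j} \psi_p(x_i, x_j) \\
% Pairwise potential: label compatibility times a weighted sum of Gaussian kernels
\psi_p(x_i, x_j) &= \mu(x_i, x_j) \sum_{m=1}^{M} w^{(m)}\, k^{(m)}(\mathbf{f}_i, \mathbf{f}_j) \\
% One mean-field update of the marginal Q_i(l); Z_i normalizes over labels
Q_i(l) &\leftarrow \frac{1}{Z_i} \exp\Big\{ -\psi_u(l)
  - \sum_{l'} \mu(l, l') \sum_{m=1}^{M} w^{(m)} \sum_{j \neq i}
  k^{(m)}(\mathbf{f}_i, \mathbf{f}_j)\, Q_j(l') \Big\}
\end{align}
```

In this reading, $-\psi_u$ corresponds to the raw decoder score for each pixel and label, the inner sum over $j$ is the Gaussian filtering step, and repeating the update a fixed number of times is what allows it to be treated as a recurrent layer.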
The core of building segmentation in high-resolution remote sensing images is to establish a high-dimensional, strongly nonlinear mapping from the image feature space to the segmentation result. In a high-resolution remote sensing image, a building may appear at any location, so pixels outside the local neighborhood may be related to the pixel currently being labeled. Adopting a Deep Neural Network (DNN) to extract features and learn the nonlinear mapping significantly improves segmentation precision and generalization. However, a DNN cannot directly extract non-neighborhood features. This study presents an encoder-decoder deep learning architecture that combines ResNet with a Conditional Random Field (CRF) for semantic segmentation of buildings in high-resolution remote sensing images, with the aim of obtaining high segmentation precision and reducing interference from roads, staggered floors, and shadows.
In the DNN, ResNet is used as the encoder to extract building features automatically; ResNet avoids the vanishing and exploding gradient problems and accelerates the convergence of the DNN weights. Before each convolution operation, batch normalization is applied to normalize the data and reduce the training difficulty of the DNN. Transposed convolution is then used to build the decoder, which reconstructs the image while segmenting the buildings. At the end of the DNN, the CRF adjusts the raw segmentation produced by the decoder. The unary potential of the CRF is given by the raw decoder output, and the pairwise potential describes the features of pixel pairs across the entire image, which makes the model a fully connected CRF (FCCRF). Because the FCCRF is computationally expensive, a mean field algorithm is used to approximate the pairwise potential computation: the pairwise term is obtained by convolution, and a high-dimensional Gaussian filter implements that convolution. The mean field algorithm is realized through a recurrent neural network (RNN) mechanism, so the FCCRF becomes part of the DNN and the CRF parameters are trained simultaneously with the encoder and decoder.
Experiments are conducted on the Inria Aerial Image Labeling Dataset to validate the proposed method. A total of 4500 samples are used, each with 1000×1000×3 pixels at a resolution of 0.3 m. Typical kinds of buildings, such as orderly arranged buildings, single buildings with complicated roofs, and disorderly arranged buildings, are segmented with VGG, ResNet, and the proposed method (denoted ResNetCRF). The results show that ResNetCRF overcomes interference from roads whose color features are similar to those of buildings and effectively reduces the disturbance from staggered floors and shadows, so it achieves the best segmentation precision of the three methods. The multi-resolution experiment shows that ResNetCRF generalizes well within a limited range of resolution change.
By introducing the CRF into the ResNet-based encoder-decoder, an accurate mapping for building segmentation is established that reduces the disturbance from roads, shadows, and staggered floors in high-resolution remote sensing images. In future work, we will investigate reducing the computational cost of the FCCRF, recovering small buildings that are currently missed, and reducing segmentation errors for buildings whose color is similar to the background and that lack a noticeable edge.
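The mean-field refinement described above can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes a single Gaussian kernel over position and intensity, a Potts label compatibility, and a dense N×N kernel matrix that is practical only for toy images, whereas the abstract refers to a high-dimensional Gaussian filter. The function names, the bandwidths, and the 4×4 test image are all hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def dense_gaussian_kernel(features, bandwidth):
    """Gaussian affinity k(f_i, f_j) for every pixel pair, with k(i, i) = 0."""
    scaled = features / bandwidth
    sq_dist = ((scaled[:, None, :] - scaled[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-0.5 * sq_dist)
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself
    return kernel


def mean_field(unary_logits, kernel, compat, weight=1.0, n_iters=5):
    """Run n_iters mean-field updates; each iteration acts like one recurrent step.

    unary_logits: (N, L) raw decoder scores, i.e. negative unary potentials
    kernel:       (N, N) pairwise affinities
    compat:       (L, L) label compatibility penalty mu(l, l')
    """
    q = softmax(unary_logits)
    for _ in range(n_iters):
        message = weight * kernel @ q      # (N, L): filtered marginals
        penalty = message @ compat.T       # (N, L): sum over l' of mu(l, l') * message
        q = softmax(unary_logits - penalty)
    return q


def demo():
    """Toy 4x4 image: the left two columns are building (label 1)."""
    height, width = 4, 4
    rows, cols = np.mgrid[0:height, 0:width]
    truth = (cols < 2).astype(int)
    intensity = truth.astype(float)        # buildings are bright in this toy image

    # Stand-in for decoder output: correct everywhere except one building pixel.
    logits = np.where(truth[..., None] == 1, [-1.0, 1.0], [1.0, -1.0])
    logits[1, 1] = [0.5, -0.5]

    features = np.stack([rows, cols, intensity], axis=-1).reshape(-1, 3)
    kernel = dense_gaussian_kernel(features, bandwidth=np.array([2.0, 2.0, 0.2]))
    compat = 1.0 - np.eye(2)               # Potts model: penalize differing labels

    q = mean_field(logits.reshape(-1, 2), kernel, compat)
    before = logits.argmax(axis=-1)
    after = q.argmax(axis=-1).reshape(height, width)
    return before, after, truth


if __name__ == "__main__":
    before, after, truth = demo()
    print("mislabeled pixels before:", int((before != truth).sum()))
    print("mislabeled pixels after: ", int((after != truth).sum()))
```

In the toy image, the pixel at row 1, column 1 starts out labeled as background by the stand-in decoder scores. Because it shares intensity and position with its building neighbors, the pairwise messages outweigh its weak unary score and the refined labeling assigns it to the building class. Trainable kernel weights and compatibility entries would take the place of `weight` and `compat` if the loop were embedded in a network.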
Keywords: high-resolution remote sensing image; deep neural network; conditional random fields; building segmentation