Image segmentation models of remote sensing using full residual connection and multiscale feature fusion
2020, Vol. 24, No. 9, pp. 1120-1133
Received: 2018-10-12
Published in print: 2020-09-07
DOI: 10.11834/jrs.20208365
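Remote sensing image data are large in scale, with complex illumination and occlusion, densely packed targets of varying scale, and a shortage of annotated images for training deep networks; these characteristics make complete and correct segmentation of remote sensing images more challenging. To address the significant loss of resolution caused by repeated convolution in deep convolutional networks, which lowers pixel-level class prediction accuracy, this paper designs an end-to-end remote sensing image segmentation model with full residual connections and multiscale feature fusion, built on a deep convolutional encoder-decoder network. The model has two advantages. First, the long-range and short-range full residual connections both simplify the training of the deep network and feed the original input information into the end of each layer, strengthening feature fusion. Second, feature fusion at different scales and in different forms lets the network extract rich context information, cope with changes in target scale, and improve segmentation performance. Experiments were carried out on the ISPRS Vaihingen and Road Detection datasets after data augmentation, and the model was evaluated by mean IOU and mean F1 score. Comparison with current state-of-the-art models and with results reported in the literature shows that the proposed model outperforms the comparison models: mean IOU reaches 85% and 84% on the two datasets, and mean F1 reaches 92% and 93%, respectively, effectively improving the completeness and correctness of target segmentation in remote sensing images.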
Remote sensing images have many characteristics that make complete and accurate segmentation difficult: large scale, complex illumination and occlusion, dense targets with multiple scales and varied poses, and a lack of large numbers of labeled images for training deep networks. In deep convolutional segmentation networks, repeated pooling significantly reduces resolution, which in turn lowers the accuracy of per-pixel class prediction.
Building on the deep convolutional encoder-decoder network, this paper proposes an end-to-end remote sensing image segmentation model with full residual connections and multiscale feature fusion. First, the features in the encoder are merged into the corresponding layers of the decoder, and a residual unit is added to each corresponding convolution layer. The full residual connection constructed in this way strengthens feature fusion across the whole model and makes it easier to train. Second, a feature pyramid module, which aggregates multiscale context information, is applied to the high-level feature map of the fifth encoder stage before feature fusion, enabling the model to handle multiscale changes of the target and improving segmentation performance.
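The two mechanisms described above can be illustrated with a small, dependency-free Python sketch. It is a toy on a single-channel 4x4 feature map and not the authors' network: the function names, the use of addition for merging encoder and decoder features, the pooling scales (2, 4), and the stand-in transform `halve` are all assumptions made for illustration, since the abstract does not specify layer details.

```python
def add_maps(a, b):
    """Element-wise sum of two equally sized 2D feature maps."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def residual_unit(x, transform):
    """Short-range residual connection: output = x + F(x)."""
    return add_maps(x, transform(x))


def avg_pool(x, k):
    """Average over non-overlapping k x k windows (sides must be divisible by k)."""
    size = k * k
    return [
        [sum(x[i * k + a][j * k + b] for a in range(k) for b in range(k)) / size
         for j in range(len(x[0]) // k)]
        for i in range(len(x) // k)
    ]


def upsample(x, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in x for _ in range(k)]


def pyramid_context(x, scales=(2, 4)):
    """Return the input map plus one pooled-and-upsampled context map per scale."""
    return [x] + [upsample(avg_pool(x, k), k) for k in scales]


def fuse_skip(decoder_map, encoder_map):
    """Long-range connection: merge an encoder map into the decoder map (assumed additive)."""
    return add_maps(decoder_map, encoder_map)


def halve(m):
    """Hypothetical stand-in for the learned transform F inside a residual unit."""
    return [[0.5 * v for v in row] for row in m]


# Toy 4x4 single-channel feature map with values 0..15.
feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
refined = residual_unit(feature, halve)   # equals 1.5 * feature
context = pyramid_context(feature)        # three 4x4 maps: input, scale 2, scale 4
fused = fuse_skip(refined, context[2])    # add the coarsest context map
```

The coarsest context map holds the global mean of the input at every position, which is one simple way a module can expose scene-level context to every pixel; a real model would apply learned convolutions to each branch.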
Experiments were conducted on the ISPRS Vaihingen and Road Detection datasets. The proposed model was evaluated by average IOU and average F1-score. Comparison with current advanced models and with results in the literature shows that the proposed model outperforms the comparison models. On the two datasets, the average IOU is 85% and 84%, and the average F1 value is 92% and 93%, respectively.
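For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute per-class IoU and F1 from a confusion matrix and then average them over classes. Averaging over classes is an assumption here; the abstract does not say how the averages were taken, and the 2x2 confusion matrix is made-up toy data, not a result from the paper.

```python
def per_class_scores(confusion):
    """Return (IoU list, F1 list) from a square confusion matrix.

    confusion[t][p] counts pixels of true class t predicted as class p.
    """
    n = len(confusion)
    ious, f1s = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp     # pixels wrongly labeled c
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)            # IoU = TP / (TP + FP + FN)
        f1s.append(2 * tp / (2 * tp + fp + fn) if union else 0.0)
    return ious, f1s


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Toy two-class example (hypothetical pixel counts).
confusion = [[8, 2],
             [1, 9]]
ious, f1s = per_class_scores(confusion)
mean_iou, mean_f1 = mean(ious), mean(f1s)
```

For a single class F1 is never lower than IoU, since F1 = 2·IoU / (1 + IoU); this is consistent with the reported F1 values being higher than the reported IOU values.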
An end-to-end remote sensing image segmentation model with full residual connection and multiscale feature fusion is proposed in this paper. The proposed model achieves better results than current advanced image semantic segmentation models on the two datasets. The segmented targets are more complete and continuous, with fewer misclassifications and omissions. The proposed model also outperforms the comparison models in road segmentation of remote sensing images from different sources, further verifying its robustness.
