Image segmentation models of remote sensing using full residual connection and multiscale feature fusion
2020, Vol. 24, No. 9, pp. 1120-1133
Print publication date: 2020-09-07
DOI: 10.11834/jrs.20208365
ZHANG Xiaojuan, WANG Xili. 2020. Image segmentation models of remote sensing using full residual connection and multiscale feature fusion. Journal of Remote Sensing (Chinese), 24(9): 1120-1133 [DOI: 10.11834/jrs.20208365]
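Remote sensing images are large in scale, subject to complex illumination and occlusion, and contain dense targets of varying scale, while large sets of labeled images for training deep networks are scarce. These characteristics make complete and correct segmentation of remote sensing images especially challenging. Repeated convolution in deep convolutional networks significantly reduces resolution and lowers the accuracy of pixel-level class prediction. To address this problem, this paper builds on the deep convolutional encoder-decoder network and designs an end-to-end remote sensing image segmentation model that uses full residual connections and multiscale feature fusion. The model has two advantages. First, the long-range and short-range full residual connections simplify the training of the deep network and also feed the original input information into the end of each layer, which strengthens feature fusion. Second, fusing features at different scales and in different ways lets the network extract rich context information, cope with changes in target scale, and improve segmentation performance. Experiments were carried out on the ISPRS Vaihingen and Road Detection datasets after data augmentation, and the model was evaluated by average IOU and average F1 score. Comparison with current advanced models and with results reported in the literature shows that the proposed model outperforms the comparison models: on the two datasets it reaches an average IOU of 85% and 84% and an average F1 score of 92% and 93%, respectively, effectively improving the completeness and correctness of target segmentation in remote sensing images.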
Remote sensing images have many characteristics, such as large scale, complex illumination and occlusion, dense targets with multiple scales and varied poses, and a lack of large numbers of labeled images for training deep networks, that pose great challenges to the integrity and accuracy of remote sensing image segmentation. In deep convolutional networks for segmentation, resolution is significantly reduced by repeated pooling, which reduces the prediction accuracy of pixel classes.
On the basis of the deep convolutional encoder-decoder network, this paper proposes an end-to-end remote sensing image segmentation model with full residual connections and multiscale feature fusion. First, the features in the encoder are merged into the corresponding layers of the decoder, and a residual unit is added to each corresponding convolution layer. The full residual connection constructed in this way enhances feature fusion across the whole model and makes it easier to train. Second, the feature pyramid module, which aggregates multiscale context information, is applied to the high-level feature map of the fifth stage of the encoder before feature fusion, enabling the model to handle multiscale changes of the target and improve segmentation performance.
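The three ideas in this paragraph (a short-range residual unit, a long-range encoder-to-decoder connection, and multiscale context pooling) can be illustrated with a toy sketch on single-channel feature maps stored as nested lists. This is not the authors' network. The abstract does not give layer details, so every function name here is hypothetical, the "convolution" is replaced by a scalar weight, the long-range fusion is assumed to be element-wise addition, and the pyramid module is assumed to be average pooling at a few grid sizes followed by nearest-neighbor upsampling.

```python
def add_maps(a, b):
    """Element-wise sum of two equally sized 2D feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def relu(fmap):
    """Clamp negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def scale(fmap, w):
    """Stand-in for a learned convolution (assumption): multiply by weight w."""
    return [[w * v for v in row] for row in fmap]


def residual_unit(x, w1, w2):
    """Short-range residual connection: output = F(x) + x."""
    fx = scale(relu(scale(x, w1)), w2)
    return add_maps(fx, x)


def fuse_long_skip(encoder_feat, decoder_feat):
    """Long-range connection (assumed additive): encoder map + decoder map."""
    return add_maps(encoder_feat, decoder_feat)


def avg_pool(fmap, bins):
    """Average-pool an HxW map to bins x bins (H and W divisible by bins)."""
    h, w = len(fmap), len(fmap[0])
    bh, bw = h // bins, w // bins
    out = []
    for i in range(bins):
        row = []
        for j in range(bins):
            block = [fmap[r][c]
                     for r in range(i * bh, (i + 1) * bh)
                     for c in range(j * bw, (j + 1) * bw)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(fmap, h, w):
    """Nearest-neighbor upsampling of a square map back to h x w."""
    bins = len(fmap)
    return [[fmap[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_context(fmap, bin_sizes=(1, 2)):
    """Return the input map plus one pooled-and-upsampled map per bin size."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for b in bin_sizes:
        channels.append(upsample_nearest(avg_pool(fmap, b), h, w))
    return channels


x = [[1.0, -2.0], [3.0, 4.0]]
short = residual_unit(x, 1.0, 0.5)   # F(x) + x
context = pyramid_context(x)         # original map + two pooled scales
fused = fuse_long_skip(x, short)     # encoder feature added to decoder feature
```

Because each residual path adds the input back unchanged, gradients have a direct route through the network, which is the usual argument for why such connections ease training; the pooled channels give each position a summary of progressively larger neighborhoods.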
Experiments were conducted on the ISPRS Vaihingen and Road Detection datasets. The proposed model was evaluated on two metrics, average IOU and average F1 score. Comparison with current advanced models and with results in the literature shows that the proposed model outperforms the comparison models. On the two datasets, the average IOU is 85% and 84% and the average F1 score is 92% and 93%, respectively.
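For readers unfamiliar with the two metrics, the sketch below computes per-class IOU and F1 from pixel labels and averages them over classes. It assumes the "average" is taken over classes, which the abstract does not state; the function names and the toy labels are illustrative only.

```python
def confusion_counts(pred, truth, cls):
    """Count true positives, false positives, false negatives for one class."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
    return tp, fp, fn


def iou(tp, fp, fn):
    """Intersection over union: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    # A class absent from both maps scores 0.0 here (a design choice).
    return tp / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1 score: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_scores(pred, truth, classes):
    """Average IOU and F1 over the given classes (assumed averaging scheme)."""
    ious, f1s = [], []
    for c in classes:
        counts = confusion_counts(pred, truth, c)
        ious.append(iou(*counts))
        f1s.append(f1(*counts))
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy flattened label maps for a two-class problem.
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
mean_iou, mean_f1 = mean_scores(pred, truth, classes=[0, 1])
```

For a single class the two metrics are tied by F1 = 2·IOU / (1 + IOU), so F1 is never lower than IOU, which is consistent with the reported F1 values exceeding the IOU values.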
This paper proposes an end-to-end remote sensing image segmentation model with full residual connections and multiscale feature fusion. The proposed model achieves better results than current advanced image semantic segmentation models on the two datasets. The segmented targets are more complete and continuous, with fewer misclassifications and omissions. The proposed model also achieves better results than the comparison models in road segmentation of remote sensing images from different sources, which further verifies its robustness.
remote sensing image segmentation, deep convolutional neural network, full residual connection, multiscale feature fusion, ISPRS Vaihingen dataset, Road Detection dataset