A dual-attention capsule network for building extraction from high-resolution remote sensing imagery

Xu Z S (许正森); Guan H Y (管海燕); Peng D F (彭代锋); Yu Y T (于永涛); Lei X D (雷相达); Zhao H H (赵好好)

doi:10.11834/jrs.20221577

Remote Sensing Intelligent Interpretation


National Remote Sensing Bulletin, 2022, Vol. 26, No. 8, pp. 1636-1649
Print publication date: 2022-08-07



Xu Z S, Guan H Y, Peng D F, Yu Y T, Lei X D and Zhao H H. 2022. A dual-attention capsule network for building extraction from high-resolution remote sensing imagery. National Remote Sensing Bulletin, 26(8): 1636-1649 [DOI: 10.11834/jrs.20221577]

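Abstract (translated from the Chinese)

Automatic extraction of buildings from high-resolution remote sensing imagery is important for disaster prevention and mitigation, disaster loss estimation, urban planning, and topographic mapping. However, the conventional convolutional neural network models in common use suffer from strong variability and weak equivariance. To address this problem, this paper proposes DA-CapsNet (dual-attention capsule encoder-decoder network), a general building extraction model based on channel and spatial dual attention. The model uses capsule convolutions and a spatial-channel dual-attention module to strengthen the representation of high-order building features in high-resolution remote sensing imagery, so that occluded building parts are extracted accurately and buildings are distinguished from non-building impervious surfaces. The model first uses a capsule encoder-decoder structure to extract and fuse multiscale building capsule features, yielding a high-quality building feature representation. Channel and spatial attention feature modules are then designed to further enhance the contextual semantic information of buildings and improve model performance. Experiments on three high-resolution building datasets give an average precision, recall, and F1-score of 92.15%, 92.07%, and 92.18%, respectively. The results show that the proposed DA-CapsNet effectively overcomes the effects of spatial heterogeneity, the same object showing different spectra, different objects showing the same spectrum, and shadow occlusion in high-resolution remote sensing imagery, achieving high-accuracy automatic building extraction in complex environments.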

Abstract

Automatic extraction of buildings from high-resolution remote sensing images is important for disaster prevention and mitigation, disaster loss estimation, urban planning, and topographic map making. With advances in the resolution and quality of optical remote sensing imagery, remote sensing images have become an important data source for rapidly updating building footprint databases. Although many algorithms with improved performance have been proposed, highly accurate and fully automated extraction of buildings from remote sensing images remains difficult because of challenging building scenarios, such as color diversity, topology variations, occlusions, and shadow cover. Thus, advanced, high-performance techniques that further improve the accuracy and automation level of building extraction are meaningful and urgently required by a wide variety of applications.

To overcome the strong variability and weak homogeneity of traditional convolutional neural networks, we propose a novel dual-attention capsule encoder–decoder network, DA-CapsNet, for extracting buildings. In this network, a deep capsule encoder–decoder network, along with channel-spatial attention blocks, is developed to enhance the extraction of high-level feature information from very-high-resolution remote sensing images. The model can therefore extract buildings covered by shadows and discriminate buildings from non-building impervious surfaces. Specifically, we first employ a deep capsule encoder–decoder network to extract and fuse multiscale building capsule features, resulting in a high-quality building feature representation. Spatial attention and channel attention modules are then designed to further rectify and enhance the captured contextual information, giving competitive performance on buildings in diverse challenging scenarios. The contributions are as follows: (1) a deep capsule encoder–decoder network is designed to generate a high-quality feature representation; (2) channel and spatial feature attention modules are designed to highlight channel-wise salient features and focus on class-specific spatial features.
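The building blocks named above can be illustrated with a minimal NumPy sketch. It is not the authors' DA-CapsNet: the abstract does not specify layer shapes or module internals, so everything here is an assumption. `squash` is the capsule non-linearity from Sabour et al. (2017), `channel_attention` is a squeeze-and-excitation style gate in the spirit of Hu et al. (2018), and `spatial_attention` is a simplified gate that combines channel-wise mean and max maps with two hypothetical scalar weights instead of the convolution used in CBAM (Woo et al., 2018). All function names, shapes, and weights are illustrative.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule squashing: keeps the vector's direction, maps its length into [0, 1)."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel gate (assumed form).

    feat: (C, H, W) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    pooled = feat.mean(axis=(1, 2))         # global average pooling -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)   # bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)             # per-channel weight in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat, w_avg=1.0, w_max=1.0):
    """Simplified spatial gate (assumed form): scalar mix of mean and max maps."""
    avg_map = feat.mean(axis=0)             # (H, W)
    max_map = feat.max(axis=0)              # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * gate[None, :, :]

# Toy usage with random inputs (hypothetical sizes: 8 channels, reduction r = 4).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
out = spatial_attention(channel_attention(feat, w1, w2))
caps = squash(rng.standard_normal((4, 16)))  # four 16-D capsule vectors
```

Both gates only rescale the input, so the output keeps the input shape and each value shrinks toward zero by a factor in (0, 1). A trained model would learn `w1`, `w2`, and the spatial weights rather than sample them.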

The proposed DA-CapsNet was evaluated on three datasets: one Google Building Dataset and two publicly available datasets (Wuhan and Massachusetts). The experimental results showed competitive performance, with an average precision, recall, and F1-score of 92.15%, 92.07%, and 92.18%, respectively, on buildings in varied challenging scenarios. In terms of F1-score, DA-CapsNet achieved 92.70%, 94.01%, and 89.84% on the Google, WUH, and MA datasets, respectively. Comparative studies also confirmed the robust applicability and superior performance of DA-CapsNet in building extraction tasks.
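The reported metrics can be made concrete with a short sketch. It assumes pixel-wise evaluation on binary building masks, which the abstract does not state explicitly, and the function name and toy masks are hypothetical. The last lines check that the mean of the three per-dataset F1-scores quoted above reproduces the reported average of 92.18%.

```python
import numpy as np

def precision_recall_f1(pred, truth):
    """Pixel-wise precision, recall, and F1 for binary masks (1 = building)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # building pixels correctly detected
    fp = np.sum(pred & ~truth)   # background pixels labeled as building
    fn = np.sum(~pred & truth)   # building pixels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return float(precision), float(recall), float(f1)

# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
p, r, f1 = precision_recall_f1(pred, truth)

# Per-dataset F1-scores as quoted in the abstract.
reported_f1 = {"Google": 92.70, "WUH": 94.01, "MA": 89.84}
mean_f1 = sum(reported_f1.values()) / len(reported_f1)
print(round(mean_f1, 2))
```

The averaged F1 is therefore a mean over datasets, not the harmonic mean of the averaged precision and recall.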

Keywords

building extraction; deep learning; channel feature attention; spatial feature attention; encoder-decoder network; capsule network

