基于特征注意力金字塔的遥感图像目标检测方法
Feature attention pyramid-based remote sensing image object detection method
2023, Vol. 27, No. 2, Pages 492-501
Print publication date: 2023-02-07
DOI: 10.11834/jrs.20235011
汪西莉,梁正印,刘涛.2023.基于特征注意力金字塔的遥感图像目标检测方法.遥感学报,27(2): 492-501
Wang X L, Liang Z Y and Liu T. 2023. Feature attention pyramid-based remote sensing image object detection method. National Remote Sensing Bulletin, 27(2): 492-501
遥感图像场景复杂、目标大小不一、分布不均衡等特点增加了目标检测的难度,而适于检测不同尺度目标的特征金字塔融合不同深度的特征图时,没有考虑特征图各自的重要性,没有强调目标区域的特征,为此本文提出基于特征注意力金字塔的遥感图像目标检测方法FAPNet(Feature Attention Pyramid Network)。首先,使用通道拼接方式融合不同深度的特征图,给用于检测的特征图提供不同大小感受野的特征,并基于通道注意力对融合的特征图在通道维度重标定,根据特征图所负责检测目标的尺度自适应地调整不同大小感受野特征的权重,强化感受野大小与待检测目标尺度匹配度较高的特征,弱化匹配度较低的特征。其次,使用叠加的扩张空间金字塔池化结构,结合弱监督分割网络建模位置注意力,强化目标区域特征,弱化背景区域特征,进一步提升目标检测方法的性能。实验结果表明,相较于RetinaNet,针对汽车目标,所提方法在UCAS-AOD数据集和RSOD数据集上检测精度AP分别提升了3.41%和2.26%,针对多类目标所提方法在各目标上取得了较优的AP结果,且mAP结果优于其他比较方法。
The characteristics of remote sensing images, such as complex scenes, targets of varied sizes, and unbalanced distributions, increase the difficulty of object detection. However, feature pyramids, which are suited to detecting targets of different scales, do not consider the importance of the individual feature maps when fusing them, let alone emphasize the features of target areas. To address this, this paper proposes a feature attention pyramid-based remote sensing image object detection method, namely the Feature Attention Pyramid Network (FAPNet). First, feature maps of different depths are fused by channel concatenation, providing the feature maps used for detection with features of receptive fields of different sizes. A channel attention mechanism then recalibrates the fused feature maps in the channel dimension, adaptively adjusting the weights of features from different depths according to the scale of the objects each level is responsible for detecting: features whose receptive field size closely matches the scale of the objects to be detected are strengthened, and poorly matched features are weakened.
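As a rough illustration of the fusion step just described, the sketch below concatenates feature maps of different depths along the channel dimension and recalibrates the result with an SE-style channel attention block before projecting it back to the pyramid width. It is a minimal PyTorch sketch written for this summary; the module name, channel widths, reduction ratio, and the nearest-neighbor resizing are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttentionFusion(nn.Module):
    # Illustrative stand-in for the channel-attention fusion described above.
    def __init__(self, in_channels_list, out_channels, reduction=16):
        super().__init__()
        total = sum(in_channels_list)
        # Squeeze-and-excitation: global average pooling followed by a bottleneck MLP
        self.fc = nn.Sequential(
            nn.Linear(total, total // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(total // reduction, total),
            nn.Sigmoid(),
        )
        # 1x1 convolution projects the recalibrated stack to the detection-head width
        self.proj = nn.Conv2d(total, out_channels, kernel_size=1)

    def forward(self, feats):
        # Resize every feature map to the spatial size of the first (shallowest) one
        size = feats[0].shape[-2:]
        feats = [f if f.shape[-2:] == size else F.interpolate(f, size=size, mode='nearest')
                 for f in feats]
        x = torch.cat(feats, dim=1)              # fuse by channel concatenation
        w = self.fc(x.mean(dim=(2, 3)))          # per-channel weights in (0, 1)
        x = x * w.unsqueeze(-1).unsqueeze(-1)    # emphasize channels whose receptive field fits the target scale
        return self.proj(x)

# Example: fuse three backbone stages for one pyramid level (hypothetical channel widths)
c3, c4, c5 = torch.randn(1, 256, 64, 64), torch.randn(1, 512, 32, 32), torch.randn(1, 1024, 16, 16)
fused = ChannelAttentionFusion([256, 512, 1024], out_channels=256)([c3, c4, c5])
print(fused.shape)  # torch.Size([1, 256, 64, 64])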
Second, a weakly supervised attention module uses a stacked atrous spatial pyramid pooling structure together with a convolutional segmentation module to model spatial attention weights that adjust the feature distribution of the feature maps used for prediction, strengthening object-area features and weakening background-area features, which further improves detection performance.
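A minimal sketch of such a spatial-attention branch is given below, assuming stacked dilated 3x3 convolutions as a stand-in for the stacked atrous spatial pyramid pooling structure and a 1x1 segmentation head whose sigmoid output gates the feature map; the dilation rates, the sigmoid gating, and training the segmentation logits with box-derived masks are assumptions made for illustration, not details taken from the paper.

import torch
import torch.nn as nn

class SpatialAttentionBranch(nn.Module):
    # Illustrative stand-in for the weakly supervised spatial-attention module.
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        # Stacked dilated convolutions enlarge the receptive field at constant resolution
        self.aspp = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=r, dilation=r),
                          nn.ReLU(inplace=True))
            for r in rates
        ])
        # Per-pixel foreground score, usable both as a coarse segmentation output
        # (trained with weak, e.g. box-derived, supervision) and as an attention map
        self.seg_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        seg_logits = self.seg_head(self.aspp(x))   # (B, 1, H, W)
        attention = torch.sigmoid(seg_logits)      # foreground probability in (0, 1)
        out = x * attention                        # strengthen object areas, suppress background
        return out, seg_logits

# Example usage on a pyramid feature map
feat = torch.randn(2, 256, 64, 64)
out, seg = SpatialAttentionBranch(256)(feat)
print(out.shape, seg.shape)  # torch.Size([2, 256, 64, 64]) torch.Size([2, 1, 64, 64])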
The experimental results show that, compared with RetinaNet, the proposed method improves the detection accuracy (AP) for car targets by 3.41% and 2.26% on the UCAS-AOD and RSOD datasets, respectively. For multiclass targets, it achieves better AP results on each class and outperforms the other compared methods on the mAP indicator.
A feature attention pyramid-based remote sensing image object detection method is proposed in this paper. Its contribution lies in the designed feature attention pyramid module and weakly supervised attention module. With these modules, the proposed method extracts target features more accurately in complex scenes containing targets of different sizes through channel and spatial attention, thereby improving detection performance. The experimental results show that the proposed method is superior to the RetinaNet and FAN methods and is better suited to remote sensing image object detection tasks with complex scenes and multiscale targets.
遥感图像; 目标检测; 弱监督分割; 注意力机制; 特征金字塔
remote sensing image; object detection; weakly supervised segmentation; attention mechanism; feature pyramid
Chen L C, Papandreou G, Kokkinos I, Murphy K and Yuille A L. 2018. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4): 834-848 [DOI: 10.1109/TPAMI.2017.2699184]
Dai J F, Li Y, He K M and Sun J. 2016. R-FCN: object detection via region-based fully convolutional networks//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc.: 379-387
Dalal N and Triggs B. 2005. Histograms of oriented gradients for human detection//Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE: 886-893 [DOI: 10.1109/CVPR.2005.177]
Fu C Y, Liu W, Ranga A, Tyagi A and Berg A C. 2017. DSSD: deconvolutional single shot detector. arXiv preprint arXiv: 1701.06659 [DOI: 10.48550/arXiv.1701.06659]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE: 770-778 [DOI: 10.1109/CVPR.2016.90]
Hu J, Shen L, Albanie S, Sun G and Wu E H. 2020. Squeeze-and-excitation networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(8): 2011-2023 [DOI: 10.1109/TPAMI.2019.2913372]
Hu P Y and Ramanan D. 2017. Finding tiny faces//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 1522-1530 [DOI: 10.1109/CVPR.2017.166]
Huang G, Liu Z, Van Der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 2261-2269 [DOI: 10.1109/CVPR.2017.243]
Lin T Y, Dollár P, Girshick R, He K M, Hariharan B and Belongie S. 2017. Feature pyramid networks for object detection//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 936-944 [DOI: 10.1109/CVPR.2017.106]
Lin T Y, Goyal P, Girshick R, He K M and Dollár P. 2020. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2): 318-327 [DOI: 10.1109/TPAMI.2018.2858826]
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y and Berg A C. 2016. SSD: single shot MultiBox detector//Proceedings of the 2016 14th European Conference on Computer Vision. Amsterdam: Springer: 21-37 [DOI: 10.1007/978-3-319-46448-0_2]
Long Y, Gong Y P, Xiao Z E and Liu Q. 2017. Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing, 55(5): 2486-2498 [DOI: 10.1109/TGRS.2016.2645610]
Redmon J and Farhadi A. 2017. YOLO9000: better, faster, stronger//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 6517-6525 [DOI: 10.1109/CVPR.2017.690]
Ren S Q, He K M, Girshick R and Sun J. 2017. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1137-1149 [DOI: 10.1109/TPAMI.2016.2577031]
Singh B and Davis L S. 2018. An analysis of scale invariance in object detection-SNIP//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE: 3578-3587 [DOI: 10.1109/CVPR.2018.00377]
Tayara H and Chong K T. 2018. Object detection in very high-resolution aerial images using one-stage densely connected feature pyramid network. Sensors, 18(10): 3341 [DOI: 10.3390/s18103341]
Wang F, Jiang M Q, Qian C, Yang S, Li C, Zhang H G, Wang X G and Tang X O. 2017a. Residual attention network for image classification//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE: 6450-6458 [DOI: 10.1109/CVPR.2017.683]
Wang J F, Yuan Y and Yu G. 2017b. Face attention network: an effective face detector for the occluded faces. arXiv preprint arXiv: 1711.07246 [DOI: 10.48550/arXiv.1711.07246]
Woo S, Park J, Lee J Y and Kweon I S. 2018. CBAM: convolutional block attention module//Proceedings of the 2018 15th European Conference on Computer Vision. Munich: Springer: 3-19 [DOI: 10.1007/978-3-030-01234-2_1]
Yang A P, Lu L Y and Ji Z. 2020. Multi-feature concatenation network for object detection. Journal of Tianjin University (Natural Science and Engineering), 53(6): 647-652
杨爱萍, 鲁立宇, 冀中. 2020. 多层特征图堆叠网络及其目标检测方法. 天津大学学报(自然科学与工程技术版), 53(6): 647-652 [DOI: 10.11784/tdxbz201904007]
Yu Y, Hua A, He X J, Yu S H, Zhong X and Zhu R F. 2020. Attention-based feature pyramid networks for ship detection of optical remote sensing image. Journal of Remote Sensing, 24(2): 107-115
于野, 艾华, 贺小军, 于树海, 钟兴, 朱瑞飞. 2020. A-FPN算法及其在遥感图像船舶检测中的应用. 遥感学报, 24(2): 107-115 [DOI: 10.11834/jrs.20208264]
Zhou K B, Zhang Z X, Gao C X and Liu J. 2021. Rotated feature network for multiorientation object detection of remote-sensing images. IEEE Geoscience and Remote Sensing Letters, 18(1): 33-37 [DOI: 10.1109/LGRS.2020.2965629]
Zhou P, Ni B B, Geng C, Hu J G and Xu Y. 2018. Scale-transferrable object detection//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE: 528-537 [DOI: 10.1109/CVPR.2018.00062]
Zhu H G, Chen X G, Dai W Q, Fu K, Ye Q X and Jiao J B. 2015. Orientation robust object detection in aerial images using deep convolutional neural network//Proceedings of the 2015 IEEE International Conference on Image Processing. Quebec City, QC, Canada: IEEE: 3735-3739 [DOI: 10.1109/ICIP.2015.7351502]