Refined multi-scale feature-oriented object detection of remote sensing images
2022, Vol. 26, No. 12, pp. 2616-2628
Print publication date: 2022-12-07
DOI: 10.11834/jrs.20221801
Zhang S, Li S S, Wei G F, Zhang X N and Gao J W. 2022. Refined multi-scale feature-oriented object detection of remote sensing images. National Remote Sensing Bulletin, 26(12): 2616-2628
Object detection in remote sensing images describes the visual features of objects and expresses prior knowledge about the image, and the information obtained through interpretation is widely used in both military and civilian fields. This work proposes an oriented object detection algorithm for remote sensing images based on refined multi-scale features. It targets four problems in complex scenes: insufficient feature extraction for objects, large variation in object scale, arbitrarily oriented and densely packed objects, and the difficulty of localizing oriented objects accurately with the horizontal boxes used in traditional object detection.
First, a contextual attention network based on dilated convolution is designed. It captures local and global semantic information using convolution kernels with different dilation rates and integrates that information into the original features through an attention mechanism, which strengthens feature extraction. Second, a refined feature pyramid network is proposed. It uses pixel shuffling to reduce the loss of channel information in the feature pyramid and strengthens the network's ability to represent multi-scale objects whose sizes differ widely. Finally, oriented rectangular boxes are regressed with gliding vertices, which better represents the locations of oriented objects in remote sensing images.
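Two of the building blocks named above can be illustrated in a few lines. The sketch below is not the paper's network. It is a plain-Python toy, with the hypothetical helper names `dilated_conv1d` and `pixel_shuffle`, that shows how a larger dilation rate widens the input span covered by the same kernel and how pixel shuffling trades channels for spatial resolution. The channel layout used for pixel shuffling is an assumption that follows the common sub-pixel convolution convention; the attention step and the pyramid wiring are omitted.

```python
from typing import List

Grid = List[List[float]]


def dilated_conv1d(signal: List[float], kernel: List[float], rate: int) -> List[float]:
    """'Valid' 1-D dilated convolution (cross-correlation, as in deep learning).

    Kernel taps are spaced `rate` samples apart, so a kernel of size k covers
    (k - 1) * rate + 1 input samples without adding any parameters.
    """
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def pixel_shuffle(channels: List[Grid], r: int) -> List[Grid]:
    """Rearrange C*r*r maps of size H x W into C maps of size (H*r) x (W*r).

    Assumed layout: out[c][h*r + i][w*r + j] = in[c*r*r + i*r + j][h][w].
    """
    if len(channels) % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    height, width = len(channels[0]), len(channels[0][0])
    out = []
    for c in range(len(channels) // (r * r)):
        grid = [[0.0] * (width * r) for _ in range(height * r)]
        for i in range(r):
            for j in range(r):
                src = channels[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        grid[h * r + i][w * r + j] = src[h][w]
        out.append(grid)
    return out
```

For example, `dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 1, 1], rate=2)` sums samples two positions apart and returns `[9, 12, 15]`, while `pixel_shuffle` with `r=2` turns four 1 x 1 maps into a single 2 x 2 map, so no channel is discarded when the resolution is doubled.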
The algorithm is validated on the public object detection datasets DOTA and HRSC2016, with Fast R-CNN OBB as the baseline. On DOTA, it improves the mean average precision (mAP) by 22.65% over the baseline and reaches a final mAP of 76.78%. On HRSC2016, the final mAP reaches 89.95%. The algorithm also improves on a number of advanced algorithms.
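The abstract does not say whether the 22.65% gain on DOTA is an absolute difference in mAP (percentage points) or a relative improvement. As a quick arithmetic check, the sketch below computes the baseline mAP implied by each reading. The function name `implied_baselines` is hypothetical, and only the two figures quoted above are used as inputs.

```python
def implied_baselines(final_map: float, gain: float) -> dict:
    """Baseline mAP (in %) implied by a reported gain under two readings."""
    return {
        # Reading 1: the gain is an absolute difference in percentage points.
        "absolute": final_map - gain,
        # Reading 2: the gain is relative to the baseline value.
        "relative": final_map / (1.0 + gain / 100.0),
    }


baselines = implied_baselines(final_map=76.78, gain=22.65)
for reading, value in baselines.items():
    print(f"{reading}: baseline mAP is about {value:.2f}%")
```

Under the absolute reading the baseline would be about 54.13% mAP, and under the relative reading about 62.60%.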
Conclusion
First, the contextual attention network with dilated convolution strengthens object features, which improves the convolutional neural network's ability to discriminate objects from background in remote sensing images. Second, the refined feature pyramid addresses the large variation in object scale in remote sensing images. Finally, the direction factor of gliding vertices is introduced to represent oriented objects, which mitigates the boundary problem that angle regression can cause.
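The gliding-vertex idea in the conclusion can be sketched as follows. An oriented box is described by its enclosing horizontal box plus four ratios that say how far each vertex has slid along one side of that box, so no angle is regressed directly. The code is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the vertex ordering (top, right, bottom, left), the tie-breaking rules, and the image-style coordinates (y grows downward) are all choices made here for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def encode_gliding_vertex(quad: List[Point]):
    """Encode a convex quadrilateral as (horizontal box, 4 ratios, obliquity)."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    w, h = x_max - x_min, y_max - y_min
    if w <= 0 or h <= 0:
        raise ValueError("degenerate quadrilateral")
    # Pick the vertex touching each side; ties make an axis-aligned box map to zeros.
    top = min(quad, key=lambda p: (p[1], p[0]))
    right = max(quad, key=lambda p: (p[0], -p[1]))
    bottom = max(quad, key=lambda p: (p[1], p[0]))
    left = min(quad, key=lambda p: (p[0], -p[1]))
    alphas = (
        (top[0] - x_min) / w,      # slide along the top side
        (right[1] - y_min) / h,    # slide along the right side
        (x_max - bottom[0]) / w,   # slide along the bottom side
        (y_max - left[1]) / h,     # slide along the left side
    )
    # Shoelace area of the quadrilateral, relative to the horizontal box area.
    ordered = [top, right, bottom, left]
    twice_area = sum(
        ordered[i][0] * ordered[(i + 1) % 4][1] - ordered[(i + 1) % 4][0] * ordered[i][1]
        for i in range(4)
    )
    obliquity = abs(twice_area) / 2.0 / (w * h)
    return (x_min, y_min, x_max, y_max), alphas, obliquity


def decode_gliding_vertex(hbb: Box, alphas: Tuple[float, float, float, float]) -> List[Point]:
    """Recover the four vertices (top, right, bottom, left) from the encoding."""
    x_min, y_min, x_max, y_max = hbb
    w, h = x_max - x_min, y_max - y_min
    a1, a2, a3, a4 = alphas
    return [
        (x_min + a1 * w, y_min),
        (x_max, y_min + a2 * h),
        (x_max - a3 * w, y_max),
        (x_min, y_max - a4 * h),
    ]
```

With this encoding, the rectangle with vertices (1, 0), (5, 2), (4, 4), (0, 2) maps to the horizontal box (0, 0, 5, 4) and ratios (0.2, 0.5, 0.2, 0.5). All regression targets lie in [0, 1], which is one way to see why the representation avoids the periodic jump of an angle target. An obliquity close to 1 indicates a nearly horizontal box.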
Keywords: remote sensing; deep learning; object detection; feature extraction; multi-scale feature pyramid; oriented bounding box