Remote sensing image target detection based on a multi-scale deep feature fusion network (多尺度深度特征融合网络的遥感图像目标检测)

Fan Xinnan (范新南); Yan Wei (严炜); Shi Pengfei (史朋飞); Zhang Xuewu (张学武)

doi:10.11834/jrs.20210170

Remote Sensing Intelligent Interpretation


National Remote Sensing Bulletin, 2022, Vol. 26, No. 11, pp. 2292-2303
Print publication date: 2022-11-07


Fan X N, Yan W, Shi P F and Zhang X W. 2022. Remote sensing image target detection based on a multi-scale deep feature fusion network. National Remote Sensing Bulletin, 26(11): 2292-2303 [DOI: 10.11834/jrs.20210170]

Abstract (translated from Chinese)

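To address the low accuracy of existing methods for target detection in remote sensing images, this paper improves the Faster Region Convolutional Neural Networks (Faster R-CNN) algorithm and proposes a new remote sensing image target detection algorithm. The algorithm replaces the Visual Geometry Group (VGG) feature extraction network in Faster R-CNN with a Residual Network (ResNet), adds a feature pyramid network on top of it to fully express semantic and location information, and uses the focal loss function in place of the cross-entropy loss function of Faster R-CNN to address how hard and easy samples are weighted in the total loss. Finally, data augmentation is applied to the NWPU VHR-10 and RSOD datasets to address the small number of image samples they contain. Two sets of comparative experiments were carried out to verify the algorithm. The first is an ablation study of the proposed improvements on the NWPU VHR-10 and RSOD datasets; the second compares the proposed algorithm with other algorithms on the NWPU VHR-10 dataset. The results show that the proposed algorithm reaches a mean Average Precision of 93.4% on NWPU VHR-10 and 93.0% on RSOD, which is 10.6% and 7.8% higher than Faster R-CNN, respectively, and also higher than several other existing algorithms.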
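The effect of swapping cross-entropy for focal loss can be illustrated with a minimal per-sample sketch in plain Python. It follows the binary focal loss form from Lin et al. (2017b), FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from that reference, not settings reported on this page, and the sketch is not the authors' implementation.

```python
import math

EPS = 1e-12  # assumed clamp to avoid log(0); not from the source


def cross_entropy(p, y):
    """Binary cross-entropy for one sample.

    p is the predicted probability of the positive class, y is 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, EPS))


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample (illustrative defaults).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    (easy) samples, so hard samples dominate the total loss.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, EPS))


if __name__ == "__main__":
    for p in (0.95, 0.6, 0.3):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With `gamma=0` and `alpha=0.5` the function reduces to half the cross-entropy, which is a quick way to check the implementation. Larger `gamma` values widen the gap between the loss of hard and easy samples.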

Abstract

Some existing target detection algorithms extract features from remote sensing images insufficiently. They cannot handle the large differences in target scale in remote sensing images, especially when detecting small targets, which results in low average detection accuracy. In response to these problems, this paper takes the Faster Region Convolutional Neural Network (Faster R-CNN) algorithm as the base algorithm, improves it according to the characteristics of targets in remote sensing images, and proposes a new remote sensing image target detection algorithm. First, we replace the Visual Geometry Group (VGG) network in the original algorithm with a Residual Network (ResNet), which has more powerful feature extraction capabilities, to remedy the original algorithm's insufficient feature extraction from remote sensing images. The deep residual network adopts identity mapping, which not only ensures that network performance does not degrade as the network deepens but also allows deeper features to be extracted. Second, we add a feature pyramid network to the algorithm to fully integrate feature maps of different scales. The resulting feature maps carry both high-level semantic information and low-level detail information, so they take both category and location information into account. This approach largely alleviates the problem of large target scale differences in remote sensing images and improves the detection accuracy of small targets to a certain extent. In addition, we replace the cross-entropy loss function in the original algorithm with the focal loss function to address how hard and easy samples are weighted in the total loss. Finally, because the datasets used contain a small number of images, we use data augmentation to expand them. Two sets of comparative experiments are carried out to verify the algorithm. The first set consists of ablation experiments on the NWPU VHR-10 dataset and RSOD-Dataset for the improved modules proposed in this paper. The second set compares the proposed algorithm with other algorithms on the NWPU VHR-10 dataset. The ablation results show that each of the proposed modules helps improve the accuracy of target detection in remote sensing images. On the NWPU VHR-10 dataset, adding the feature pyramid network, focal loss function, and data augmentation strategy improves mean Average Precision by 2.6%, 4.8%, and 0.8%, respectively. On the RSOD-Dataset, the corresponding improvements in mean Average Precision are 0.6%, 1.6%, and 0.9%, respectively. Accordingly, the mean Average Precision of the proposed algorithm reaches 93.4% and 93.0% on the NWPU VHR-10 dataset and RSOD-Dataset, respectively. The results of the second set of experiments show that the target detection accuracy of the proposed algorithm is better than that of the comparison algorithms, further demonstrating that it performs well in remote sensing image target detection. Compared with BOW, COPD, RICNN, the original Faster R-CNN, ODDP, and Mask R-CNN, the proposed algorithm improves mean Average Precision by 68.8%, 12.7%, 20.8%, 10.6%, 6.7%, and 9.5%, respectively. The proposed algorithm better handles the large differences in target scale in remote sensing images and improves target detection accuracy, especially for small targets.
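The multi-scale fusion attributed to the feature pyramid network can be sketched, under simplifying assumptions, as a top-down pass that upsamples each coarser map by a factor of two and adds it to the next finer map. The sketch below uses single-channel maps stored as nested Python lists, and hypothetical helper names (`upsample2x`, `add_maps`, `top_down_fuse`). It omits the 1x1 lateral convolutions and 3x3 smoothing convolutions of the original feature pyramid network (Lin et al., 2017a), and it is not the authors' implementation.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map given as nested lists."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise sum of two maps of identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(laterals):
    """Fuse maps ordered fine -> coarse, each half the size of the previous.

    The coarsest map is kept as is; every finer map receives the
    upsampled fused map from the level above it.
    """
    fused = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        fused.append(add_maps(lateral, upsample2x(fused[-1])))
    fused.reverse()  # return in the same fine -> coarse order
    return fused


if __name__ == "__main__":
    c3 = [[1] * 4 for _ in range(4)]   # fine, detail-rich level
    c4 = [[10] * 2 for _ in range(2)]
    c5 = [[100]]                       # coarse, semantics-rich level
    p3, p4, p5 = top_down_fuse([c3, c4, c5])
    print(p3[0], p4[0], p5[0])
```

In this toy example every cell of the finest fused map contains contributions from all three levels, which is the sense in which the fused maps carry both high-level semantic and low-level detail information.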

Keywords

remote sensing image; object detection; feature extraction network; feature pyramid network; loss function; data augmentation

