ZHAO Lijun, TANG Ping. Scalability analysis of typical remote sensing data classification methods: A case of remote sensing image scene[J]. Journal of Remote Sensing, 2016, 20(2): 157-171. DOI: 10.11834/jrs.20164279.
The classification of remote sensing data plays an important role in all stages of remote sensing data processing and analysis. With the increase in the volume of remote sensing data, new problems concerning remote sensing big data classification tasks arise. Currently, the commonly used classifiers are usually designed to provide satisfactory results on simple tasks. However, for the processing of large volumes of remote sensing data, the scalability of classification efficiency and precision should be further investigated. Therefore, this study compares the scalability of typical remote sensing data classification methods.

Method: This study takes remote sensing image scene classification as an example and selects four well-known classification methods for comparison, namely K-Nearest Neighbor (KNN), Random Forest (RF), Support Vector Machine (SVM), and Sparse Representation-based Classifier (SRC), to conduct a scalability analysis. The comparisons are conducted in terms of parameter sensitivity, the effect of training sample data volume on classifier performance, the effect of testing sample data volume on classifier performance, and the effect of feature dimension on classifier performance.

Results: The experimental results are as follows: (1) The KNN, RF, and L0-SRC classifiers are less parameter-sensitive than the RBF-SVM, Linear-SVM, and L1-SRC classifiers. (2) In cases where the samples to be classified are fixed, the accuracies of all the classifiers tend to increase as the number of training samples increases. The SRC-type classification methods have the highest accuracy, followed by the SVM-type classification methods, the RF, and the KNN classifiers. In terms of overall classification time, the results show that the methods can be ranked as follows: L0-SRC > L1-SRC > RF > RBF-SVM/Linear-SVM > KNN/L0-SRC-Batch. (3) In cases where the training samples are fixed, the classification accuracies of all the classifiers are largely unaffected by the number of samples to be classified, which may be due to the learning abilities of the different classifiers. (4) The feature dimension affects the efficiency and accuracy of the different classifiers: SRC and KNN can obtain satisfactory results without high feature dimensions. SVM tolerates high feature dimensions and learns well from them. By contrast, RF is insensitive to the increase in feature dimensions, and higher feature dimensions do not contribute much to the improvement of classification performance. Under such circumstances, the RBF-SVM exhibits the best performance, followed by the L1-SRC classifier, the Linear-SVM classifier, and the RF and L0-SRC/L0-SRC-Batch classifiers. In terms of overall classification time, the L1-SRC and L0-SRC classifiers are the most time-consuming, whereas the other classifiers are relatively more efficient.

Conclusion: Different classification methods have different advantages and disadvantages. In tasks that classify a large volume of remote sensing data, the selection of classifiers should be balanced and based on their characteristics and the practical application. Generally, a classifier that is less parameter-sensitive, less time-consuming during classification, and more accurate is preferable.