HIGHLIGHTS

Estimation of soil organic carbon content using Vis-NIR spectroscopy.

Comparison of different feature band selection methods.

Improved coding scheme for the genetic algorithm.

ABSTRACT

Soil organic carbon (SOC) content is a critical parameter for evaluating soil health. However, highly redundant and uninformative bands in soil hyperspectral data can reduce the accuracy and stability of SOC prediction models. This study developed a global partial least squares regression (PLSR) model and a local PLSR model for agricultural soils in the LUCAS 2015 database. Several variable selection methods were combined with the regression models, and their effects on prediction accuracy were explored. In addition, when the genetic algorithm (GA) was used for spectral feature selection, a novel coding approach yielded a more representative spectral subset. The results showed that the best SOC estimation accuracy was achieved by the local PLSR combined with the coding-improved GA, with R² of 0.71, RMSEP of 5.7 g kg⁻¹, and RPD of 1.87. Appropriate spectral band selection only slightly enhanced the performance of both global and local regressions, as PLSR models using the full spectrum performed similarly. Local PLSR models consistently outperformed global ones, whether using the full spectrum or variable selection algorithms.
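
For readers unfamiliar with the three accuracy metrics reported above, the following is a minimal sketch of how R², RMSEP, and RPD are commonly computed from observed and predicted values. The abstract does not give the exact formulas used in the study, so the definitions here are assumptions: R² as the coefficient of determination, RMSEP as the root mean squared error on the prediction set, and RPD as the sample standard deviation of the observed values divided by RMSEP. The function name `regression_metrics` and the example values are hypothetical and are not taken from the study.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R2, RMSEP, and RPD under commonly used (assumed) definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                  # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares

    r2 = 1.0 - ss_res / ss_tot                              # coefficient of determination
    rmsep = float(np.sqrt(np.mean(residuals ** 2)))         # root mean squared error of prediction
    # Assumption: RPD uses the sample standard deviation (ddof=1) of the observed values.
    rpd = float(np.std(y_true, ddof=1)) / rmsep

    return {"R2": r2, "RMSEP": rmsep, "RPD": rpd}


if __name__ == "__main__":
    # Illustrative values only (not data from the study), in g/kg.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print(regression_metrics(observed, predicted))
```

Under these definitions, RPD is approximately 1 / sqrt(1 - R²) for large prediction sets, which is in line with the reported pairing of R² = 0.71 and RPD = 1.87.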
