Inversion of Hg content in reed leaf using continuous wavelet transformation and random forest
Abstract
Heavy metal pollution of plants is one of the most important eco-environmental problems in the world. Rapid, large-scale monitoring of heavy metal content in plants remains a difficult international problem and a key research topic. With its high resolution, many bands and abundant data, hyperspectral technology can offer rapid and accurate determination of heavy metal pollution in plants. It can detect the absorption, reflection and transmission characteristics of the spectral bands that correspond to phytochemical components, and it can quantify weak spectral differences for large-scale assessment of plant growth and health. However, researchers mostly construct sensitive spectral parameters (e.g., vegetation indices) through simple spectral transformations and continuum removal. Most inversion models are univariate regression, multiple stepwise regression, principal component regression and other empirical or semi-empirical models. Artificial neural networks and support vector machines have also been used, but these models require larger training sets and over-fit easily. This study therefore uses continuous wavelet transform (CWT) and random forest (RF) algorithms to build more accurate models for inverting heavy metal pollution in plants. CWT characterizes spectral signals more clearly, while RF has strong fitting ability and short iteration time; RF is also computationally efficient for large datasets such as hyperspectral data, which is an advantage in model construction. The heavy metal mercury (Hg) and the wetland plant reed (Phragmites communis) were used to test the effectiveness of the CWT and RF models. CWT was used to decompose, at different scales, the continuous spectra of the original spectral reflectance (R), first-order derivative reflectance (FD) and continuum-removed reflectance (CR).
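The decomposition step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the Mexican hat mother wavelet, the dyadic scales 2^1 to 2^8, the 350-2500 nm wavelength grid and the synthetic spectrum are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet, unnormalised (assumed choice)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)


def cwt_spectrum(reflectance, scales):
    """Continuous wavelet transform of one spectrum.

    Returns an array of shape (len(scales), len(reflectance)); row i holds
    the wavelet coefficients at scales[i] for every band position.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    n = reflectance.size
    coeffs = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Truncate the wavelet at +/- 5 scales, but never wider than the signal.
        half = int(min(5 * s, (n - 1) // 2))
        t = np.arange(-half, half + 1) / s
        kernel = mexican_hat(t) / np.sqrt(s)
        # The wavelet is symmetric, so convolution equals correlation here.
        coeffs[i] = np.convolve(reflectance, kernel, mode="same")
    return coeffs


# Synthetic spectrum (hypothetical): flat reflectance with one absorption
# feature near 680 nm, sampled at 1 nm from 350 to 2500 nm.
wavelengths = np.arange(350, 2501)
reflectance = 0.4 - 0.2 * np.exp(-0.5 * ((wavelengths - 680) / 15.0) ** 2)

scales = 2 ** np.arange(1, 9)  # assumed dyadic scales: 2, 4, ..., 256
coeffs = cwt_spectrum(reflectance, scales)
print(coeffs.shape)
```

The same function would be applied to each of R, FD and CR to obtain the corresponding wavelet coefficient matrices.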
Correlation analysis against leaf total Hg content was used to determine the sensitive bands of R, FD and CR and of their continuous wavelet transforms (R-CWT, FD-CWT and CR-CWT). The sensitive bands and the RF algorithm were then used to establish inversion models of reed leaf total Hg content. The results showed that the bands sensitive to leaf total Hg content were mainly distributed in the visible regions of 419-522 nm, 664-695 nm and 724-876 nm, and in the near-infrared regions of 1450-1558 nm and 1972-2500 nm. After CWT, the absolute value of the correlation coefficient between the wavelet coefficients and leaf total Hg content increased by 0.04-0.18, the goodness of fit (R²) of the prediction models increased by 0.107-0.177 and their accuracy (RMSE) improved by 0.008-0.013. The RF model built on continuum-removed reflectance after wavelet transformation (CR-CWT) had the best inversion precision and fit (R²=0.713, RMSE=0.127). Retrieval of leaf total Hg content with the CR-CWT RF model was more accurate and reliable when soil total Hg content was about 20 mg·kg⁻¹ (R²=0.825, RMSE=0.051). It is therefore feasible to use the RF algorithm to retrieve heavy metal content in plants, and inversion models constructed with CWT have greater reference value for monitoring heavy metal content in plants. The approach is widely applicable and provides methodological support for non-destructive, rapid monitoring of heavy metal pollution in ecosystems.
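The band selection and modelling steps summarised above can be sketched as follows. This is a hedged illustration under stated assumptions, not the study's code: it uses scikit-learn's RandomForestRegressor, synthetic data in place of measured spectra and Hg contents, and hypothetical settings (20 selected bands, 200 trees, a 70/30 split) that the abstract does not give.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def select_sensitive_bands(features, target, top_k=20):
    """Return the indices of the top_k features with the largest |Pearson r|
    against the target, together with the full vector of r values."""
    x = features - features.mean(axis=0)
    y = target - target.mean()
    denom = np.sqrt((x**2).sum(axis=0) * (y**2).sum())
    r = (x.T @ y) / np.where(denom == 0, np.inf, denom)
    return np.argsort(-np.abs(r))[:top_k], r


# Synthetic stand-in data (hypothetical): 120 leaf samples, 300 spectral
# features (e.g. wavelet coefficients); features 10 and 50 drive the target.
rng = np.random.default_rng(0)
features = rng.normal(size=(120, 300))
hg = 2.0 * features[:, 10] - 1.5 * features[:, 50] + 0.1 * rng.normal(size=120)

x_train, x_test, y_train, y_test = train_test_split(
    features, hg, test_size=0.3, random_state=0
)

# Select bands on the training set only, to avoid leaking test information.
bands, r = select_sensitive_bands(x_train, y_train, top_k=20)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(x_train[:, bands], y_train)
pred = model.predict(x_test[:, bands])

r2 = r2_score(y_test, pred)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"R2={r2:.3f}, RMSE={rmse:.3f}")
```

Selecting the bands inside the training split is a design choice made here for the sketch; the abstract does not say how the study partitioned its samples.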