This article takes navel oranges as the experimental object. The collected hyperspectral data are preprocessed and reduced in dimensionality, and their characteristic wavelengths are extracted. Mathematical models are then built on the dimensionality-reduced hyperspectral data to distinguish the pesticide residue concentrations of the samples, and the performance of the different models is analyzed.
Pyridaben was selected as the test pesticide. The 80 selected navel orange samples were washed and left in a ventilated place until completely dry. They were then randomly divided into 4 groups of 20 samples, one group for each treatment (1:400, 1:800, 1:1500, and without pesticide). Similarly, a region of interest was identified on each navel orange, and the average spectrum within that region was taken as the raw spectral data, as shown in the figure.
Extracting the region-of-interest spectrum of each sample yields 20 original spectra for each of the four sample groups. The following figure shows one set of original spectra, for the 1:400 concentration. The average spectral curves, obtained by applying the standard normal variate (SNV) transformation and averaging the 20 original spectra of each of the four groups, are shown in the figure.
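The SNV step and the group averaging described above can be sketched as follows. This is a minimal illustration, not the processing chain actually used in the experiment: the reflectance values are hypothetical, the function names `snv` and `mean_spectrum` are introduced here for illustration, and the order of operations (SNV on each spectrum first, then band-wise averaging) is an assumption.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own standard deviation."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(value - mu) / sigma for value in spectrum]


def mean_spectrum(spectra):
    """Band-wise average of several spectra of equal length."""
    return [mean(band) for band in zip(*spectra)]


# Hypothetical reflectance values: three samples, five bands each.
# The second row is a scaled copy of the first and the third is an
# offset copy, mimicking scatter effects that SNV is meant to remove.
group = [
    [0.20, 0.25, 0.30, 0.28, 0.22],
    [0.40, 0.50, 0.60, 0.56, 0.44],
    [0.10, 0.15, 0.20, 0.18, 0.12],
]

corrected = [snv(spectrum) for spectrum in group]
group_mean = mean_spectrum(corrected)
```

Because SNV removes a per-spectrum offset and scale, the three hypothetical rows collapse onto the same corrected curve, which is why averaging after SNV gives a cleaner group curve than averaging raw reflectance.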
From the figure, it can be seen that the spectral reflectance curves of navel oranges with different concentrations of pesticide residues differ noticeably. The average spectral reflectance of the pesticide groups and the distilled water group shows a similar trend over the range 400-1000 nm. The reflectance differs markedly in some bands, while in others the curves overlap, which makes the differences in spectral reflectance hard to identify. There are reflectance valleys at 670 nm and 980 nm and a smaller reflectance peak at 700 nm. The reflectance curves of all groups largely overlap at 400-650 nm, and the curves of the distilled water and pyrimethanil groups overlap at 900-1000 nm. At 650-890 nm, however, the reflectance curves of the groups differ significantly.
2.1 Feature Extraction Based on Principal Component Analysis
By applying principal component analysis to the preprocessed spectral data, the principal component images and the cumulative contribution rate of each principal component are obtained. Generally speaking, PC-1 carries the most information from the original images, and the first few principal components contain about 99% of the spectral information. As shown in the table, the cumulative contribution rate of the first four principal components reaches 99.87%.
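As a reminder of how a cumulative contribution rate is derived, the sketch below computes it from a list of PCA eigenvalues (component variances). The eigenvalues are hypothetical and are not the values behind the 99.87% figure; the function names `contribution_rates` and `components_needed` are illustrative.

```python
def contribution_rates(eigenvalues):
    """Return (individual, cumulative) contribution rates.

    Each rate is the share of total variance explained by one
    principal component, with components sorted by decreasing variance.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [value / total for value in ordered]
    cumulative = []
    running = 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def components_needed(cumulative, threshold):
    """Smallest number of leading components whose cumulative
    contribution rate reaches the threshold."""
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(cumulative)


# Hypothetical eigenvalues of the band covariance matrix.
eigenvalues = [80.0, 12.0, 5.0, 2.5, 0.3, 0.2]
rates, cumulative = contribution_rates(eigenvalues)
k = components_needed(cumulative, 0.99)
```

With these hypothetical values the first four components are the fewest that reach a 99% cumulative contribution rate, which mirrors the kind of cut-off described in the text.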
ENVI software was used to extract the PCA images of the experimental samples. Each principal component image is, in theory, a linear combination of the grayscale images (digital matrices) at each wavelength in the original data. ENVI also provides the statistics of each principal component, including the covariance, correlation coefficients, and eigenvectors. In general, PC-1 contains the most original information; the first four principal component images are shown in the figure.
By extracting the eigenvector elements (weight coefficients) of all bands for the PC-2 and PC-3 images, the weight coefficient curves of the hyperspectral principal component images PC-2 and PC-3 can be plotted, as shown in Figure 4.5. On these weight curves, the wavelengths corresponding to the peaks and valleys are taken to be the characteristic wavelengths carrying the most information. The characteristic wavelengths of PC-2 are 500 nm, 680 nm, and 980 nm; those of PC-3 are 500 nm, 580 nm, 850 nm, and 930 nm. Accordingly, 500 nm, 580 nm, 680 nm, 850 nm, 930 nm, and 980 nm are selected as the characteristic wavelengths.
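The peak-and-valley selection can be illustrated with a short sketch. It assumes a simple rule (a band is picked when its weight is strictly greater or strictly smaller than both neighbours), which may differ from how the curves in Figure 4.5 were actually read. The wavelength grid and weight values are hypothetical toy numbers, chosen only so that the extrema fall at the wavelengths named in the text; `local_extrema` and `merge_wavelengths` are illustrative names.

```python
def local_extrema(wavelengths, weights):
    """Return (wavelength, kind) pairs for strict local peaks and valleys
    of a weight coefficient curve. End points are never selected."""
    picks = []
    for i in range(1, len(weights) - 1):
        left, mid, right = weights[i - 1], weights[i], weights[i + 1]
        if mid > left and mid > right:
            picks.append((wavelengths[i], "peak"))
        elif mid < left and mid < right:
            picks.append((wavelengths[i], "valley"))
    return picks


def merge_wavelengths(*selections):
    """Union of the wavelengths picked from several components, sorted."""
    return sorted({wavelength for sel in selections for wavelength, _ in sel})


# Hypothetical coarse wavelength grid (nm) and toy weight curves.
wavelengths = [400, 500, 580, 680, 850, 930, 980, 1000]
pc2_weights = [0.0, 0.3, 0.1, -0.4, -0.1, 0.2, 0.5, 0.3]
pc3_weights = [0.1, -0.3, 0.4, 0.2, -0.2, 0.3, 0.1, 0.0]

pc2_picks = local_extrema(wavelengths, pc2_weights)
pc3_picks = local_extrema(wavelengths, pc3_weights)
selected = merge_wavelengths(pc2_picks, pc3_picks)
```

On real weight curves with hundreds of bands, a smoothing step or a minimum-prominence threshold would usually be needed so that noise does not produce spurious extrema.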
This article performs spectral extraction and preprocessing on four groups of navel orange samples whose surfaces were sprayed with different concentrations of pyrimethanil pesticide. Principal component analysis (PCA) and the successive projections algorithm (SPA) are used to reduce the dimensionality of the full-band hyperspectral data and extract characteristic wavelengths. Support vector machine (SVM), back-propagation (BP) neural network, and extreme learning machine (ELM) classification models are then established on the extracted characteristic wavelengths. The results obtained with the different dimensionality reduction methods and classification models are analyzed to select the optimal model for distinguishing different pesticide concentrations on the surface of navel oranges.