Classification method for identifying rapeseed and weeds using hyperspectral imaging

2024-12-04 13:39


Rapeseed is an important source of edible oil in China, ranking first among the country's five major oil crops. China is a major world producer of rapeseed, and its rapeseed planting area and yield rank first in the world. Weeds, however, cause serious damage to rapeseed during growth. Traditional chemical weeding pollutes the agricultural environment, and herbicide use efficiency is relatively low. Correctly identifying weeds is therefore the key to precise herbicide spraying.


Hyperspectral imaging is a new technology that integrates image processing and spectral analysis. Image data show the surface damage and external characteristics of crops, while spectral data reflect their internal structure and composition. In recent years, hyperspectral imaging has therefore been used increasingly for weed classification and identification and for non-destructive testing of agricultural product quality.


This work applies several preprocessing methods and characteristic wavelength extraction methods to hyperspectral image data of rapeseed and weed canopies, and builds classification models on both the full spectrum and the characteristic wavelengths. Comparing the results of the different models shows how spectral acquisition time and rapeseed variety affect weed classification and identification.


1 Experimental part


1.1 Samples


The experiment used rapeseed samples and four weed species: Echinochloa crus-galli, Hemeng, Ramulus velutipes and Bidens pilosa. All four are common weeds with a large impact in rapeseed fields, and their growth cycle is similar to that of rapeseed. Figure 1 shows images of the samples used in the experiment.



1.2 Spectral image acquisition


A hyperspectral camera covering 400-1000 nm is used; the FS13 from Hangzhou Caipu Technology Co., Ltd. is suitable for this kind of research. Its spectral range is 400-1000 nm, its wavelength resolution is better than 2.5 nm, and it provides up to 1200 spectral channels. The acquisition speed reaches 128 FPS over the full spectrum and up to 3300 Hz after band selection (multi-region band selection is supported).


Hyperspectral data of rapeseed and weeds were collected on three occasions, referred to as experiments 1, 2 and 3. Because the rapeseed and weeds were growing between collections, internal parameters such as camera exposure time and collection height had to be adjusted each time to obtain hyperspectral images with the least distortion. Table 1 lists the internal parameters of the hyperspectral imager for the three experiments.

1.3 Data feature extraction


For each hyperspectral image of rapeseed and weeds, a region of interest (ROI) is extracted for each plant: the entire sample area that remains after the background is removed. The reflectance in each band is averaged over the ROI pixels to give one mean spectrum per sample for subsequent data processing. A guiding principle of ROI extraction is to remove as much of the area outside the plant canopy as possible, so that the canopy and the background are cleanly separated. The canopy is segmented as follows:


(1) Analyze and compare the spectra of pixels inside and outside the ROI to find a band that separates them. The 800 nm band image was selected, and a threshold was applied to binarize it and construct the mask;


(2) In the mask image, the background is set to 0 and the sample area to 1. Applying the mask to the original hyperspectral image removes the background. Figure 2 shows the ROI acquisition process on an RGB pseudo-color image built from the 662, 554 and 450 nm bands.
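The two masking steps and the ROI averaging can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function name `mean_roi_spectrum`, the threshold value of 0.3 and the assumption that canopy pixels are the brighter ones at 800 nm are all choices made for the example, since the text does not state the threshold.

```python
import numpy as np


def mean_roi_spectrum(cube, wavelengths, mask_nm=800.0, threshold=0.3):
    """Return (mask, mean spectrum) for a reflectance cube of shape (rows, cols, bands)."""
    # Pick the band closest to the mask wavelength (800 nm in the text).
    band = int(np.argmin(np.abs(np.asarray(wavelengths) - mask_nm)))
    # Binarize: assumed canopy pixels become 1, background pixels become 0.
    # The threshold value is hypothetical and would be tuned per experiment.
    mask = (cube[:, :, band] > threshold).astype(np.uint8)
    # Apply the mask and average the remaining pixels band by band.
    roi_pixels = cube[mask == 1]  # shape (n_canopy_pixels, bands)
    if roi_pixels.size == 0:
        raise ValueError("threshold removed every pixel")
    return mask, roi_pixels.mean(axis=0)


# Tiny synthetic scene: a 4x4 image with a bright 2x2 "canopy" patch.
wavelengths = np.array([450.0, 554.0, 662.0, 800.0])
cube = np.full((4, 4, 4), 0.05)
cube[1:3, 1:3, :] = [0.04, 0.12, 0.06, 0.55]
mask, spectrum = mean_roi_spectrum(cube, wavelengths)
```

Here `spectrum` holds one averaged reflectance value per band, which is the per-sample input that the later preprocessing and modeling steps work on.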



1.4 Data processing method


1.4.1 Preprocessing method


Standard normal variate (SNV) transformation, detrending, multiplicative scatter correction (MSC), moving average smoothing, polynomial convolution (Savitzky-Golay) smoothing, baseline correction and normalization are used to preprocess the hyperspectral data of the samples.
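Three of these preprocessing steps are simple enough to sketch with NumPy. The versions below are common textbook formulations, given only as an illustration; the polynomial degree used for detrending and the use of the mean spectrum as the MSC reference are assumptions, since the text does not give these settings.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def detrend(spectra, degree=2):
    """Subtract a least-squares polynomial baseline fitted to each spectrum.

    degree=2 is an assumed setting, not one taken from the text.
    """
    x = np.linspace(-1.0, 1.0, spectra.shape[1])
    coeffs = np.polynomial.polynomial.polyfit(x, spectra.T, degree)
    baseline = np.polynomial.polynomial.polyval(x, coeffs)  # (n_samples, n_bands)
    return spectra - baseline


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row = slope * ref + intercept, then undo both effects.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic spectra: one underlying shape with different gains and offsets.
x = np.linspace(0.0, 1.0, 50)
base = np.sin(3 * x) + 1.5
spectra = np.vstack([1.0 * base + 0.1, 1.4 * base - 0.2, 0.8 * base + 0.3])
snv_spectra = snv(spectra)
msc_spectra = msc(spectra)
flat = detrend(np.array([x**2 + 2 * x + 1]))
```

In this synthetic case both SNV and MSC collapse the three rows onto a single curve, because the rows differ only by gain and offset; real canopy spectra would retain their compositional differences.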


1.4.2 Characteristic wavelength extraction method


Four methods are used to extract characteristic wavelengths: principal component analysis (PCA) loadings, x-loading weights (x-LW), regression coefficients (RC) and the successive projections algorithm (SPA). PCA loadings can be used for selection because the loading values under each principal component indicate the relative importance of the wavelengths, so the peaks and valleys of the loading plot are chosen as characteristic wavelengths. The x-LW method selects wavelengths from the x-LW curve, taking those at peaks with a large absolute value. The RC method selects the peaks, troughs and inflection points of the RC curve [12]. SPA performs successive projections of the vectors in the data set to reduce redundancy in the data and improve the efficiency and speed of computation.
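As an illustration of the PCA-loadings approach only, the sketch below computes the loading vectors with an SVD and picks the wavelengths where a loading curve turns. The function name `loading_extrema` and the choice to treat every turning point as a candidate are assumptions; in practice peaks and valleys are often screened by eye or by magnitude.

```python
import numpy as np


def loading_extrema(spectra, wavelengths, n_components=2):
    """Return wavelengths at local peaks/valleys of the first PCA loading vectors."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal-component loading vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    selected = set()
    for loading in vt[:n_components]:
        diff = np.diff(loading)
        # A sign change in the first difference marks a local peak or valley.
        turning = np.where(diff[:-1] * diff[1:] < 0)[0] + 1
        selected.update(turning.tolist())
    return np.asarray(wavelengths)[sorted(selected)]


# Synthetic data: all variance comes from one broad band centered at 700 nm.
wavelengths = np.arange(450, 950, 10)
bump = np.exp(-(((wavelengths - 700) / 100.0) ** 2))
amounts = np.array([1.0, 2.0, 3.5, 0.5])
spectra = amounts[:, None] * bump[None, :]
picked = loading_extrema(spectra, wavelengths, n_components=1)
```

Because the sign of a loading vector is arbitrary, a peak may appear as a valley; selecting both kinds of turning point, as the text describes, makes the result independent of that sign.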


2 Results and discussion


2.1 Average spectral curves of rapeseed and weeds


Spectral data of 512 bands in the visible and near-infrared range of 380-1034 nm were collected. Because noise clearly affects both ends of the spectrum, the noisy bands at the two ends were removed and the 380 bands between 453 and 934 nm were used for analysis. Figure 3 shows the average reflectance spectra of rapeseed and weeds. Rapeseed and the four weeds follow similar trends; the weed curves are more scattered, and the Bidens pilosa curve overlaps that of rapeseed, although the two differ clearly in reflectance at the 550 nm peak.



The samples were divided into a modeling set and a prediction set at a ratio of 3:1, giving 72 rapeseed and 48 weed samples in the modeling set and 24 rapeseed and 16 weed samples in the prediction set. Each of the four weeds contributes 12 samples to the modeling set and 4 to the prediction set.
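A per-class 3:1 split of this kind can be sketched with the standard library alone. The text does not say how samples were assigned to the two sets, so the random shuffle, the seed and the placeholder labels `weed_1` to `weed_4` below are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratio=3, seed=0):
    """Split sample indices per class so that modeling:prediction = ratio:1."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    modeling, prediction = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment is an assumption
        cut = len(indices) * ratio // (ratio + 1)
        modeling.extend(indices[:cut])
        prediction.extend(indices[cut:])
    return sorted(modeling), sorted(prediction)


# 96 rapeseed samples and 16 samples of each of four weeds, as in the text.
weeds = ("weed_1", "weed_2", "weed_3", "weed_4")  # placeholder names
labels = ["rapeseed"] * 96 + [w for w in weeds for _ in range(16)]
modeling, prediction = stratified_split(labels)
```

Splitting within each class keeps the 72/24 and 12/4 proportions reported above, so neither set is dominated by a single weed species.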


2.2 Qualitative analysis by principal component analysis


Principal component analysis of the spectral data of rapeseed and weeds showed that the cumulative contribution rate of PC1 and PC2 in the three experiments was 99%, so PC1 and PC2 explain most of the variance [4]. Figure 4 shows the distribution of the principal component scores for the three experiments. Rapeseed and weed samples each cluster in the score plot, which indicates that rapeseed and weeds can be effectively distinguished. The spectral data are therefore analyzed further below.
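The cumulative contribution rate of the first two components can be computed as sketched below. The data here are synthetic and the function name `cumulative_contribution` is hypothetical; the snippet only shows how such a figure is obtained, not how the reported 99% was produced.

```python
import numpy as np


def cumulative_contribution(spectra, n_components=2):
    """Return PCA scores and the variance fraction explained by the first components."""
    centered = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # contribution rate of each component
    scores = u[:, :n_components] * s[:n_components]
    return scores, float(explained[:n_components].sum())


# Synthetic spectra driven by two latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
basis = rng.normal(size=(2, 30))
spectra = latent @ basis + 0.001 * rng.normal(size=(40, 30))
scores, contribution = cumulative_contribution(spectra)
```

Plotting the two columns of `scores` against each other, colored by class, gives the kind of score plot described for Figure 4.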




2.3 Classification and recognition results based on the whole band

The spectra were preprocessed by detrending, and full-spectrum models were built with three algorithms: PLS, SVM and ELM. Table 3 lists the recognition accuracy. The PLS algorithm did not reach 90.00% in the second experiment, while the SVM and ELM algorithms performed better. ELM gave the best classification accuracy, reaching 100.00% in all three experiments.
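For readers unfamiliar with the extreme learning machine (ELM), a bare-bones version is sketched below: a hidden layer with random, fixed weights followed by output weights solved by least squares. The class name `ELMClassifier`, the sigmoid activation, the number of hidden nodes and the synthetic data are all assumptions; the text does not describe the configuration used in the study.

```python
import numpy as np


class ELMClassifier:
    """Minimal single-hidden-layer ELM with a sigmoid activation."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        targets = (np.asarray(y)[:, None] == self.classes_[None, :]).astype(float)
        # Input weights and biases are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights are the least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]


# Synthetic two-class data: 0 stands in for rapeseed, 1 for weed.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.2, 0.05, (40, 6)), rng.normal(0.6, 0.05, (40, 6))])
y = np.array([0] * 40 + [1] * 40)
train = np.arange(80) % 4 != 0  # 3:1 modeling/prediction split
model = ELMClassifier().fit(X[train], y[train])
accuracy = float(np.mean(model.predict(X[~train]) == y[~train]))
```

Because only the output weights are fitted, training reduces to a single linear solve, which is one reason ELM is often quick to train on spectral data.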



2.4 Classification and recognition results of characteristic wavelengths


With detrending as the preprocessing step, the reflectance values at the characteristic wavelengths extracted by the four methods above are used as input variables. Because the full-spectrum ELM model discriminated better across the different batches of experiments, the ELM discriminant analysis model is also built on the characteristic wavelengths. Table 4 lists the classification results.


As Table 4 shows, all four extraction methods achieved relatively good classification results. The recognition models built on the characteristic wavelengths extracted by PCA loadings, x-loading weights and SPA performed very well: classification accuracy reached 100.00% for both the modeling set and the prediction set.