Nondestructive detection of blueberry sugar content based on hyperspectral imaging

2024-11-07 09:28

Blueberries have delicate flesh and a distinctive flavor. Rich in nutrients, they are known as the "Queen of Fruits" and are credited with slowing the aging of brain nerves, protecting eyesight, fighting cancer, and strengthening immunity, so their market prospects are broad. Sugar content is an important indicator of blueberry quality. Traditional sugar content measurement is destructive, which makes nondestructive detection an important development trend.


1. Image data acquisition

Hyperspectral images of blueberry samples



Extract the spectral data from the two hyperspectral images: select different regions of interest (ROIs) on the surface of each sample and obtain the original reflectance spectrum curves.


From the original spectral curves of each region of interest, the average spectral value is extracted, giving three sets of 48×256 spectral data matrices (48 samples by 256 bands).
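The ROI averaging step can be sketched as follows. This is a minimal illustration, assuming each hyperspectral image is available as a NumPy array of shape (rows, columns, bands) and each ROI as a boolean mask; the function names `mean_roi_spectrum` and `build_spectral_matrix` are hypothetical and do not come from the original study.

```python
import numpy as np

def mean_roi_spectrum(cube, roi_mask):
    """Average the reflectance of all ROI pixels, band by band.

    cube:     array of shape (rows, cols, bands), assumed layout
    roi_mask: boolean array of shape (rows, cols), True inside the ROI
    """
    # Boolean indexing yields (n_roi_pixels, bands); average over the pixels.
    return cube[roi_mask].mean(axis=0)

def build_spectral_matrix(cubes, masks):
    """Stack one mean spectrum per sample into a (samples, bands) matrix."""
    return np.vstack([mean_roi_spectrum(c, m) for c, m in zip(cubes, masks)])
```

With 48 samples and 256 bands per image, `build_spectral_matrix` would return a 48×256 matrix, one row per blueberry.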


The hyperspectral images and spectral curves show that Band 1 to Band 50 are noisy and their images are blurred. Only Band 51 to Band 250 (1031.11 nm to 1699.11 nm), 200 bands in total, were therefore used for modeling. The spectral values of the first 36 blueberries were used to build the model, and those of the last 12 were used to test it.
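The band selection and the train/test split described here amount to simple array slicing. The sketch below assumes the spectra are stored as a 48×256 NumPy array whose columns are ordered Band 1 to Band 256; the function name `select_bands_and_split` is hypothetical.

```python
import numpy as np

def select_bands_and_split(spectra, sugar, first_band=51, last_band=250, n_train=36):
    """Keep Band 51 to Band 250 and split samples into training and test sets.

    Band numbers are 1-based, as in the text, so Band 51 is column index 50.
    """
    X = spectra[:, first_band - 1:last_band]   # 200 columns
    X_train, X_test = X[:n_train], X[n_train:]
    y_train, y_test = sugar[:n_train], sugar[n_train:]
    return X_train, y_train, X_test, y_test
```

The split is positional (first 36, last 12), matching the text; whether the samples were shuffled beforehand is not stated in the source.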


2. Model establishment and analysis

The blueberry sugar content prediction models are built mainly with partial least squares regression (PLSR). Different spectral data yield different prediction models, and four modeling approaches were used. First, a model is built directly on the 200 bands that remain after the noisy bands are removed. Second, principal component analysis (PCA) is applied to the 200-band spectral data, the first n principal components whose cumulative contribution rate reaches 99.9% are selected, and a PLSR model is built on them. Third, the successive projections algorithm (SPA) is used to select characteristic bands from the 256 spectral bands of the entire back area, followed by PLSR modeling. Fourth, cyclic band-combination modeling is performed on the 200 bands of the entire back area, first with combinations of two bands and then with combinations of three bands.
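As a rough illustration of the PLSR step, the sketch below fits a single-response PLS model with the NIPALS algorithm in NumPy. The function names `pls1_fit` and `pls1_predict` are hypothetical, and the number of latent components is left as a parameter because the source does not state how many were used or which software performed the fit.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS regression for one response variable (PLS1, NIPALS).

    Returns (intercept, coef) so that y is approximated by intercept + X @ coef.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centered working copies
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))   # weight vectors
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings
    for k in range(n_components):
        w = Xk.T @ yk                          # direction of maximal covariance
        w /= np.linalg.norm(w)
        t = Xk @ w                             # scores
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - q[k] * t                     # deflate y
        W[:, k], P[:, k] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression vector in original X space
    intercept = y_mean - x_mean @ coef
    return intercept, coef

def pls1_predict(X, intercept, coef):
    """Apply the fitted linear model to new spectra."""
    return intercept + np.asarray(X, dtype=float) @ coef
```

In practice the number of components would typically be chosen by cross-validation on the training samples; that choice is an assumption here, not something reported in the text.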


3. Prediction model establishment

PLSR model for the spectral data of the partial area of the front

Prediction model:

$$y = 8.1109 + 0.3989\,x_1 + 0.2848\,x_2 + \dots + 0.809\,x_{200}$$

where $x_1, x_2, \dots, x_{200}$ are the average spectral values of Band 51 to Band 250, and $y$ is the sugar content of the blueberry.
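Applying a model of this form to new spectra is a single dot product plus the intercept. The sketch below uses a hypothetical function `predict_sugar` and a toy three-band example with made-up coefficients; the full set of 200 fitted coefficients is not given in the text, so it is not reproduced here.

```python
import numpy as np

def predict_sugar(intercept, coefficients, spectra):
    """Evaluate y = intercept + sum_i coefficients[i] * x_i for each sample row."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    return intercept + spectra @ np.asarray(coefficients, dtype=float)

# Toy illustration only: three bands and invented coefficients,
# not the values fitted in the study.
toy_coefficients = [2.0, 3.0, -1.0]
toy_spectra = [[0.5, 0.1, 0.2],
               [0.0, 0.0, 0.0]]
toy_predictions = predict_sugar(1.0, toy_coefficients, toy_spectra)
```

For the model in the text, `spectra` would hold one row of 200 average band values per blueberry.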


Substituting the spectral data of the 12 test blueberries into the prediction model gives the predicted sugar content values shown in the following table.


Table 1. Comparison of predicted and actual sugar content values for the partial area of the front of blueberries




Table 2. Predicted sugar content values and true values for the entire area of the front side of blueberries


Table 3. Predicted sugar content values and true values for the entire area on the back of blueberries



Curves of predicted and actual blueberry sugar content values for the prediction models built from the three sets of data


PCA was used to reduce the dimension of the blueberry spectral data, and the reduced data were then used for PLSR modeling. After PCA, the first n principal components with a cumulative contribution rate of 99.9% were kept. For the spectral data from the partial area of the front and from the entire area of the front, seven principal components were selected; for the spectral data from the entire area of the back, the first 10 principal components were selected. PLSR models were built on the selected components, and the predicted sugar content values for the three sets of data were obtained from the resulting prediction model functions.
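The component-selection rule (keep the first n components whose cumulative contribution rate reaches 99.9%) can be sketched with an SVD-based PCA. The function names `pca_fit` and `pca_transform` are hypothetical, and fitting PCA on the training spectra only is an assumption; the source does not say which samples were used to compute the components.

```python
import numpy as np

def pca_fit(X, threshold=0.999):
    """Return (mean, components, n) with n chosen by cumulative explained variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # First position where the cumulative contribution reaches the threshold.
    n = int(np.searchsorted(cumulative, threshold)) + 1
    n = min(n, len(s))
    return mean, Vt[:n], n

def pca_transform(X, mean, components):
    """Project spectra onto the retained principal components."""
    return (np.asarray(X, dtype=float) - mean) @ components.T
```

The resulting score matrix (for example 36×7 for the front-side training data) would then replace the 200-band matrix as the regression input.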


Curves of predicted and actual sugar content values for the three sets of data, obtained by first reducing the dimension with PCA and then building PLSR models


4. Summary


Comparing the prediction models built from the different data shows that the optimal band-combination models selected by cyclic band-combination modeling perform best. The correlation coefficients R between the predicted and true sugar content values for these models are 0.54 and 0.61, respectively, the largest among the models built with other band combinations. Their average relative errors are 12.6% and 11.9%, respectively, the smallest among the models built with other band combinations, and the root mean square error on the test set is small. It can be concluded that the optimal model selected by cyclic band-combination modeling predicts better than the models built with other band combinations.
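The cyclic band-combination search and the three evaluation measures can be sketched as below. Several points are assumptions rather than details from the source: each small model is fitted with ordinary least squares (which matches PLSR only when all components are retained), the average relative error is taken as the mean of |predicted − true| / |true|, and the best combination is the one with the largest R on the test samples. The function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def evaluate(y_true, y_pred):
    """Return (R, average relative error, RMSE) for one set of predictions."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    mre = np.mean(np.abs(y_pred - y_true) / np.abs(y_true))
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return r, mre, rmse

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def search_band_combinations(X_train, y_train, X_test, y_test, size):
    """Try every combination of `size` bands and keep the one with the largest R."""
    best = None
    for bands in combinations(range(X_train.shape[1]), size):
        cols = list(bands)
        beta = fit_linear(X_train[:, cols], y_train)
        y_pred = beta[0] + X_test[:, cols] @ beta[1:]
        r, mre, rmse = evaluate(y_test, y_pred)
        if np.isnan(r):
            continue  # constant predictions have no defined correlation
        if best is None or r > best[1]:
            best = (bands, r, mre, rmse)
    return best
```

With 200 bands there are 19,900 two-band combinations and 1,313,400 three-band combinations, so the three-band search is noticeably slower but still feasible on a desktop machine.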