Research on fading character information extraction and recognition based on spectral imaging

2021-12-27 20:52

This study uses a hyperspectral camera covering 400-1000 nm. The FS13 from Hangzhou CHNSpec Technology Co., Ltd. can be used for related research.


China has a large number of cultural relics, including murals, calligraphy and paintings. These relics record the spiritual and cultural life of the ancients as well as important historical events, and the most direct way to understand the information they carry is through their text. Chinese civilization has a long history. Inscribed symbols that may be precursors of writing date back to prehistoric times, and from ancient oracle bone inscriptions to modern simplified characters, writing has always been an important means for people to record events and express emotions. Many of the relics left to us bear inscriptions or written descriptions, and studying this text helps to reconstruct history more faithfully and to understand the culture of different periods.

However, natural and human factors have caused the surfaces of some cultural relics to fade, which harms their appearance and leaves the characters illegible. Extracting text information is therefore an important part of cultural relics protection. Traditional methods mostly rely on visual inspection and the experience of cultural relics workers, but faded characters are hard to identify by eye, so modern technology is needed to extract faded or hidden text. Hyperspectral imaging is non-contact, acquires image and spectrum together ("image-spectrum integration"), and covers a wide spectral range. It allows cultural relics to be recorded and analyzed in greater depth, and it can capture information that the human eye cannot observe, which is of great significance for text extraction and interpretation.


The wavelength range of the hyperspectral camera used in this study is 400-1000 nm.

The data in this study are hyperspectral images collected from a traditional Chinese painting, a stone carving and the bottom of a tomb. All three relics have been damaged to varying degrees, which makes the information on their surfaces difficult to identify. After data preprocessing, the faded information is extracted, and a convolutional neural network is then used for character recognition to provide a reference for interpreting the text.

The handwriting and background areas of the three hyperspectral data sets are sampled separately. Apart from the handwriting, the main background colors in the traditional Chinese painting and stone carving images are brown and white. Red substances were also observed in the tomb data and are treated as background as well. The experiment therefore extracts the spectral curves of these substances: 10 to 20 points are selected for each substance, saved as ASCII files, and averaged into a mean spectrum. In the stone carving data, which is severely faded, the background and handwriting can hardly be distinguished, so handwriting spectra could be collected only from a few points in the right half of the image. In the other images, the points are selected evenly across the scene.

Figure 5-2 (a), (b) and (c) show the positions of the spectral curves selected from the traditional Chinese painting, stone carving and tomb data respectively. Orange dots mark the selected handwriting areas, and blue, purple and green dots mark the selected brown, white and red background areas. Figure 5-2 (d), (e) and (f) show the averaged spectral curves of the Chinese painting, stone carving and tomb. The figure shows that the spectral reflectance of handwriting is low and changes only slightly as wavelength increases, while the reflectance of the background is generally high and varies strongly with wavelength. After the hyperspectral data are optimized, the recognition results improve significantly.
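The averaging step described above can be sketched as follows. This is a minimal illustration, not the study's code: the cube is synthetic, the reflectance values and point coordinates are made up, and the function name `mean_spectrum` is hypothetical. It assumes the scene is held as a NumPy array of shape (rows, cols, bands).

```python
import numpy as np

def mean_spectrum(cube, points):
    """Average the spectra of the selected pixels of one material.

    cube   : array of shape (rows, cols, bands) holding reflectance values
    points : list of (row, col) pixel coordinates picked for that material
    """
    rows, cols = zip(*points)
    # Fancy indexing gives one spectrum per point: shape (n_points, bands).
    return cube[list(rows), list(cols), :].mean(axis=0)

# Synthetic stand-in for a real scene: 4 x 4 pixels, 5 bands over 400-1000 nm.
wavelengths = np.linspace(400, 1000, 5)
cube = np.empty((4, 4, 5))
cube[:, :2, :] = 0.10                        # dark, flat "handwriting" (left half)
cube[:, 2:, :] = np.linspace(0.30, 0.70, 5)  # brighter "background" rising with wavelength

# Hypothetical sample points (the study uses 10 to 20 per material).
handwriting_pts = [(0, 0), (1, 1), (2, 0), (3, 1)]
background_pts = [(0, 2), (1, 3), (2, 2), (3, 3)]

ink = mean_spectrum(cube, handwriting_pts)
paper = mean_spectrum(cube, background_pts)

for wl, i, p in zip(wavelengths, ink, paper):
    print(f"{wl:6.0f} nm  handwriting={i:.2f}  background={p:.2f}")
```

In a real workflow the columns `wavelengths`, `ink` and `paper` could be written to an ASCII file with `np.savetxt`, which matches the plain-text storage mentioned in the text.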


The research methods were then tested and analyzed. First, the handwriting enhancement index is applied to the three images with different degrees of fading, and it gives a good enhancement effect. Visual interpretation shows that, compared with two information extraction methods commonly used in hyperspectral data processing, the handwriting enhancement index is better suited to enhancing handwriting information. Second, the text information is extracted by density segmentation to obtain a binary image. The stone inscriptions are missing many strokes and the tomb symbols are only suspected to be text, whereas the convolutional neural network targets relatively complete Chinese characters. For the stone inscriptions and the tomb, the extraction results are therefore only morphologically transformed, giving several variants to assist expert recognition. Finally, the traditional Chinese painting image, which contains more characters, is cut into individual characters, morphologically transformed, and input into the convolutional neural network, which returns the top three candidate characters. A total of 17 characters are recognized, for an accuracy of 70.8%. This result shows that the character extraction and recognition method in this study is effective.
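The segmentation, morphology and top-three steps can be illustrated with a small sketch. Everything here is an assumption made for illustration: the source does not define the handwriting enhancement index, the slice limits, the structuring element, or how accuracy is scored, so the code uses a made-up index image, a 3x3 neighbourhood, and reads the score as a top-three hit rate. The function names are hypothetical.

```python
import numpy as np

def density_slice(index_image, low, high):
    """Binary mask: True where the index value lies within [low, high]."""
    return (index_image >= low) & (index_image <= high)

def dilate(mask):
    """3x3 binary dilation: set a pixel if it or any of its 8 neighbours is set."""
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + h, dc:dc + w]
    return out

def erode(mask):
    """3x3 binary erosion: keep a pixel only if its whole 3x3 neighbourhood is set.
    Pixels outside the image count as background, so strokes touching the border shrink."""
    padded = np.pad(mask, 1, constant_values=False)
    out = np.ones_like(mask)
    h, w = mask.shape
    for dr in range(3):
        for dc in range(3):
            out &= padded[dr:dr + h, dc:dc + w]
    return out

def closing(mask):
    """Dilation followed by erosion; bridges small gaps in a stroke."""
    return erode(dilate(mask))

def top3_hit_rate(scores, labels):
    """Fraction of samples whose true class is among the three highest scores."""
    top3 = np.argsort(scores, axis=1)[:, -3:]
    hits = (top3 == np.asarray(labels)[:, None]).any(axis=1)
    return hits.mean()

# Made-up enhancement-index image: a horizontal stroke with a one-pixel gap.
index_image = np.full((7, 7), 0.2)
index_image[3, [1, 2, 4, 5]] = 0.8

mask = density_slice(index_image, 0.5, 1.0)  # stroke pixels only, gap still open
closed = closing(mask)                       # gap at (3, 3) is bridged

# Made-up classifier scores for two character crops over five classes.
scores = np.array([[0.10, 0.20, 0.30, 0.25, 0.15],
                   [0.50, 0.05, 0.10, 0.20, 0.15]])
labels = [2, 1]
rate = top3_hit_rate(scores, labels)

print(closed.astype(int))
print(f"top-3 hit rate: {rate:.2f}")
```

Offering experts several morphological variants, as the text describes, would amount to applying `dilate`, `erode` and `closing` separately to the same mask. As a side note on the reported figure, 17 correct out of 24 characters would give 70.8%, but the source does not state the total, so that denominator is an inference.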