Study on Key Techniques of Dynamic Text Recognition System

This paper studies the key techniques of a dynamic text recognition system. A rectangularization algorithm is adopted for text localization. Vertical projection combined with a priori knowledge is used to perform two-pass character segmentation. Coarse mesh features and 13-point features are extracted, and a support vector machine is adopted as the final classifier. A dynamic text recognition system is developed on the basis of these key techniques and can be used in logistics, banking, postal services and similar areas.


Introduction
Dynamic text recognition is an active research topic in the field of text recognition, drawing on image processing, data mining, pattern recognition and other disciplines. It is applied more and more widely in fields such as logistics, transportation, postal services, banking and office automation. In this paper, the key techniques in the process of dynamic text recognition are studied, and a dynamic text recognition software system is developed, which can be used in logistics, banking, postal services and similar fields.

Key Techniques on Dynamic Text Recognition
Text location, character segmentation, feature extraction and the recognition algorithm are the four important stages of dynamic text recognition. The choice of method at each stage plays a very important role in improving the recognition performance of the system.

Text Location
Commonly used methods for locating text in grayscale images include the Hough transform [1], mathematical morphology [2], edge detection [3] and so on. To increase the reliability of detection, this paper uses the Sobel operator to obtain the edge image. Redundant edges are then removed by morphological operations, and the text location is obtained with the rectangularization algorithm [4].
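As an illustration of the edge detection step, the following is a minimal pure-Python sketch of the Sobel operator applied to a grayscale image stored as a list of rows. The function names, the |Gx| + |Gy| approximation of the gradient magnitude and the binarization threshold of 128 are assumptions made for this example; the paper does not specify them, and the actual system is built on OpenCV.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| with 3x3 Sobel kernels.

    img is a list of rows of gray values (0..255). Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: lower row minus upper row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def edge_image(img, threshold=128):
    """Binarize the Sobel magnitude (threshold is an assumed value)."""
    mag = sobel_magnitude(img)
    return [[255 if v >= threshold else 0 for v in row] for row in mag]
```

The binary edge image produced in this way would then be cleaned by morphological operations before rectangularization.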
The rectangularization transformation converts each irregular connected domain into a rectangular connected domain, so that the largest rectangle of a connected domain can be found. Contour tracking is then used to find the rectangles, and prior knowledge is used to judge whether each rectangular area meets the requirements. The steps of the rectangularization algorithm are as follows:
(1) Read a binary image.
(2) Scan each pixel of the image. If the grayscale value of the current pixel is 255 and the grayscale values of its four neighboring pixels (upper, lower, left and right) are all 0, set the pixel to 0. If the grayscale value of the current pixel is 0 and the sum of the grayscale values of its upper, lower, left and right neighbors is at least 510, set the pixel to 255.
(3) Repeat step (2) until all irregular connected domains have been transformed into rectangular connected domains.
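The steps above can be sketched in pure Python as follows. This is a sketch under stated assumptions rather than the implementation used in the system: the four points are taken to be the upper, lower, left and right neighbors, pixels outside the image are treated as 0, pixels are updated in place during the raster scan, and the loop stops when a pass changes nothing or when a hypothetical `max_passes` limit is reached.

```python
def rectangularize(binary, max_passes=100):
    """Fill and prune a binary image (values 0 or 255) toward rectangular regions."""
    img = [row[:] for row in binary]  # work on a copy, keep the input intact
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # Upper, lower, left and right neighbours; outside the image counts as 0.
        vals = []
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            vals.append(img[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0)
        return vals

    for _ in range(max_passes):
        changed = False
        for y in range(h):
            for x in range(w):
                nb = neighbours(y, x)
                if img[y][x] == 255 and all(v == 0 for v in nb):
                    img[y][x] = 0      # isolated white pixel is removed
                    changed = True
                elif img[y][x] == 0 and sum(nb) >= 510:
                    img[y][x] = 255    # at least two white neighbours: fill
                    changed = True
        if not changed:
            break
    return img
```

For example, an L-shaped region of three white pixels is completed to a 2x2 square, while a single isolated white pixel is erased.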

Character Segmentation
Character segmentation begins with incline correction, followed by edge removal and denoising. Finally, the individual characters are segmented.

Incline Correction
The purpose of incline correction is to adjust the position and orientation of the characters so that the text can be segmented effectively. Correcting a character image involves three stages: incline judgment, incline information extraction and rotation correction. They are carried out as follows:
(1) Obtain the text rectangle with the text localization algorithm described above.
(2) Use contour detection to detect the contour of the four sides, and store the detected contours.
(3) Based on prior knowledge, select from the detected quadrangles the one that best matches the character area, and extract its oblique angle. This angle is the oblique angle of the character area.
(4) Rotate the character image by the extracted oblique angle to complete the correction.
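Steps (3) and (4) can be illustrated with a small geometric sketch. It assumes that the oblique angle is measured from the top edge of the selected quadrangle, given by two hypothetical corner points, and it rotates individual coordinates rather than a whole image; the paper does not state how the angle is extracted or how the rotation is implemented.

```python
import math


def skew_angle(top_left, top_right):
    """Angle in degrees of the top edge of the quadrangle, from (x, y) corners."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(p, centre, angle_deg):
    """Rotate point p about centre by angle_deg (standard rotation matrix)."""
    a = math.radians(angle_deg)
    x, y = p[0] - centre[0], p[1] - centre[1]
    return (centre[0] + x * math.cos(a) - y * math.sin(a),
            centre[1] + x * math.sin(a) + y * math.cos(a))


def correct_points(points, top_left, top_right):
    """Rotate all points by the negative skew angle so the top edge is horizontal."""
    angle = skew_angle(top_left, top_right)
    return [rotate_point(p, top_left, -angle) for p in points]
```

Rotating by the negative of the extracted angle brings the top edge of the quadrangle back to the horizontal; applying the same transform to every pixel coordinate corrects the whole character image.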

Edge Removal and Denoising
The corrected character image still contains outer edges and noise, which should be removed. To improve the speed of the algorithm, a line scan method is used to remove the edges. Line scanning relies on the fact that the character sequence produces a large number of gray-level jumps in each row of the character block area. In the binary image, the exact edge can therefore be determined by checking whether the number of jumps in a row reaches a certain threshold. The detailed steps of this method are as follows:
(1) Scan the image row by row from top to bottom. When the number of jumps in a row is higher than the threshold, the upper edge of the character block area is considered found; stop scanning and record the top row number R1.
(2) Scan the image from bottom to top in the same way as in step (1), and record the bottom row number R2. The height of the character area is then H=R2-R1.
(3) Use the top and bottom rows obtained in the first two steps to produce an image with the edges outside those two rows removed.
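The line scan steps above can be sketched as follows, assuming that a jump is a change between adjacent pixel values in a row of the binary image. The function names and the threshold value in the usage note are illustrative only.

```python
def count_jumps(row):
    """Number of positions where adjacent pixels in a row differ."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def find_text_rows(binary, threshold):
    """Return (R1, R2): first and last rows whose jump count exceeds threshold."""
    jumps = [count_jumps(row) for row in binary]
    r1 = next((i for i, j in enumerate(jumps) if j > threshold), None)
    r2 = next((i for i in range(len(jumps) - 1, -1, -1)
               if jumps[i] > threshold), None)
    return r1, r2


def crop_rows(binary, threshold):
    """Keep only rows R1..R2 (inclusive); return [] if no text row is found."""
    r1, r2 = find_text_rows(binary, threshold)
    if r1 is None:
        return []
    return binary[r1:r2 + 1]
```

A solid border row has no jumps at all, so it falls outside R1 and R2 and is discarded, whereas rows that cross several character strokes exceed the threshold. With H=R2-R1 as defined above, the cropped image contains H+1 rows because both boundary rows are kept.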
To improve the accuracy of segmentation, morphological operations and median filtering should then be applied to smooth the contours of the characters.

Single Character Segmentation
The characters can be segmented after edge removal and denoising. Common character segmentation methods include the vertical projection method [5], template matching [6], the connected area method [7], cluster analysis [8] and so on. In practice, a single segmentation method is generally not enough, and several suitably improved algorithms are often combined for a given application. The system developed in this paper is mainly used in indoor environments such as logistics, banking and postal facilities, where the background is relatively simple. We therefore first obtain a rough position for each character by vertical projection, and then assess each rough position against prior knowledge. Finally, the second-pass segmentation of the characters is carried out by judging, dividing, merging and discarding. The detailed steps are as follows:
(1) Perform vertical projection on the binary image: count the number of white pixels (white characters on a black background) in each column and store it in the array nCount[i]. The length of the array is the number of columns of the image.
(2) Set a threshold T, generally a small integer (such as 1 or 2). Starting from the first element of the array, compare nCount[i] with T. The subscript of the first element that is greater than or equal to T is recorded as Start1=i1. Continue comparing until nCount[i] is less than or equal to T, and record that subscript as End1=i2. The first character area has now been found.
(3) Let Width1=End1-Start1, and let Ratio(i)=Width(i)/H be the width-to-height ratio of the i-th segmented character. From prior knowledge, the standard width-to-height ratio is 1:2.
(4) After the width-to-height ratio of the first character is obtained, the judgment can be made as follows: (1)
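Steps (1) to (3) above can be sketched in pure Python as follows; the judgment rule of step (4) is not reproduced here. The sketch assumes that a character run ends at the first column whose count falls below T, so that a run starting at a column with exactly T white pixels is not closed immediately. The function names are illustrative.

```python
def vertical_projection(binary):
    """nCount[i]: number of white (255) pixels in column i."""
    width = len(binary[0])
    return [sum(1 for row in binary if row[x] == 255) for x in range(width)]


def rough_segments(n_count, t=1):
    """Return (Start, End) column pairs for runs where nCount[i] >= t.

    End is the first column after the run, so Width = End - Start.
    """
    segments = []
    start = None
    for i, count in enumerate(n_count):
        if start is None and count >= t:
            start = i                      # run begins
        elif start is not None and count < t:
            segments.append((start, i))    # run ends (assumed: below T)
            start = None
    if start is not None:                  # run touches the right border
        segments.append((start, len(n_count)))
    return segments


def width_height_ratios(segments, height):
    """Ratio(i) = Width(i) / H for each rough segment."""
    return [(end - start) / height for start, end in segments]
```

The ratios obtained in this way would then be compared with the standard ratio of 1:2 to decide whether a rough segment is kept, divided, merged with a neighbor or discarded.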

Feature Selection
Feature selection is also one of the key steps in dynamic text recognition; the selected features must be stable and easy to separate. Commonly used feature extraction methods include the 13-point feature extraction method [9], the coarse grid feature extraction method [10] and PCA dimension reduction [11].
The coarse grid feature extraction method [10] has been shown to have good stability and separability, and is adopted in our dynamic character recognition system [12].
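As an illustration, a common form of coarse grid feature extraction divides the character image into a grid of cells and uses the proportion of white pixels in each cell as one feature. The sketch below follows that form; the grid size and the use of proportions rather than raw counts are assumptions, since the paper does not give these details.

```python
def coarse_grid_features(binary, grid=4):
    """Return grid*grid features: fraction of white (255) pixels per cell.

    Cells are listed row by row, from top-left to bottom-right.
    """
    h, w = len(binary), len(binary[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * h // grid, (gy + 1) * h // grid
        for gx in range(grid):
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            area = (y1 - y0) * (x1 - x0)
            white = sum(1 for y in range(y0, y1) for x in range(x0, x1)
                        if binary[y][x] == 255)
            # An empty cell can occur if the image is smaller than the grid.
            features.append(white / area if area else 0.0)
    return features
```

Because each feature summarizes a whole cell, small shifts or stroke-width changes in a character alter the feature vector only slightly, which is consistent with the stability noted above.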

Recognition Algorithm
Commonly used text recognition algorithms include the template matching classifier, the Bayes classifier based on probability statistics, the geometric classifier, the neural network classifier and the classifier based on the support vector machine (SVM) [13] [14].
SVM performs well in classifying both characters and numbers [12], especially when the number of samples is limited. SVM is therefore adopted as the classifier in our recognition system.
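To make the classifier concrete, the following sketch trains a two-class linear SVM by subgradient descent on the regularized hinge loss. This is a simplified stand-in chosen for illustration: the paper does not state the kernel, the training method or the multi-class strategy of its SVM, and the hyperparameters `lam`, `lr` and `epochs` are assumed values.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * |w|^2 + hinge loss by per-sample subgradient steps.

    samples: list of feature vectors; labels: +1 or -1 for each sample.
    """
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:
                # Margin violated: regularization plus hinge subgradient.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only the regularization term acts.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

In a character recognition setting, the feature vectors would be the extracted character features, and one such two-class classifier could be trained per character class (one-versus-rest) to obtain a multi-class decision.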

The Hardware Composition of the System
The hardware of the recognition system designed in this paper includes an image acquisition card, a CCD industrial camera and an industrial-grade computer with security protection; the software platform is the Windows operating system.

Developing Environment
OpenCV is an open source computer vision library originally released by Intel [15]. The system in this paper is developed in C/C++ with the VS2005 compiler under the Windows operating system, based on the OpenCV 2.0 open source library.

System Module Composition
The system mainly consists of preprocessing, image manipulation, image transformation, recognition process, character recognition and help modules. The preprocessing module performs image enhancement and denoising. Image manipulation includes edge detection, conversion to a binary image and morphological operations. Image transformation mainly provides homomorphic filtering. The recognition process module includes single-character feature extraction, character segmentation, location matting and so on, and the character recognition module includes template matching recognition, support vector machine recognition and other algorithms. After text location, character segmentation, feature extraction and recognition, the recognized characters are output in text format so that the operator can further manage the product information.
The system designed in this paper can be used for dynamic text recognition in fields such as logistics, banking, postal services and oil pipe pole identification, and it has broad application prospects.