Image analysis – Pattern recognition – On-line recognition of handwritten characters
Reexamination Certificate
1996-05-23
2003-04-29
Chang, Jon (Department: 2623)
active
06556712
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to the field of handwriting recognition systems and methods for handwriting recognition. More particularly, in one implementation, the present invention relates to recognition of on-line cursive handwriting for ideographic scripts.
BACKGROUND OF THE INVENTION
The Chinese and Japanese languages use ideographic scripts, in which there are several thousand characters. This large number of characters makes entering a character into a computer system with a typical computer keyboard cumbersome and slow. A more natural way of entering ideographic characters into a computer system would be to use handwriting recognition, and particularly automatic recognition of cursive style handwriting in an “on-line” manner. However, prior on-line handwriting recognition methods have concentrated on print style handwritten ideographic characters; the requirement that the handwriting be printed is still too slow for a typical user of a computer system. These prior methods have not been successful at adapting to on-line cursive style handwriting character recognition.
The complexity of the ideographic characters and the character distortion due to non-linear shifting and multiple styles of writing also make character recognition difficult, particularly for on-line systems.
One method which has been used extensively to deal with the types of problems arising from ideographic character recognition is hidden Markov modeling (HMM). HMMs can deal with the problems of segmentation, non-linear shifting and multiple representation of patterns and have been used extensively in speech and more recently character recognition. See, for example, K. Lee, “Automatic Speech Recognition: The Development of The SPHINX System”, Kluwer, Boston, 1989; Nag, R., et al., “Script Recognition Using Hidden Markov Models”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 2071-2074, 1986; and Jeng, B., et al., “On The Use Of Discrete State Markov Process for Chinese Character Recognition”, SPIE, vol. 1360, Visual Communications and Image Processing ′90, pp. 1663-1670, (1990). Jeng used HMMs for off-line recognition of printed Chinese characters. In the system described by Jeng, one HMM is used for every Chinese character, and the HMMs are of fixed topology. The limitation of this approach is that the system can only recognize printed Chinese characters and not cursively written characters. This recognition system also requires a large amount of memory to store the thousands of character level Markov models. Another disadvantage of the system is that a fixed topology is used for every character, so the number of states in a character's hidden Markov model does not depend on the complexity of the character.
In ideographic languages, such as Chinese, the thousands of ideographic characters can be broken down into a smaller set of a few hundred subcharacters (also referred to as radicals). There are several well-known dictionaries which define recognized radicals in the various ideographic languages. Thus, the thousands of ideographic characters may be represented by a smaller subset of the subcharacters or radicals. See, Ng, T. M. and Low, H. B., “Semiautomatic Decomposition and Partial Ordering of Chinese Radicals”, Proceedings of the International Conference on Chinese Computing, pp. 250-254 (1988). Ng and Low designed a semiautomatic method for defining Chinese radicals. However, these radicals are not suitable for on-line handwriting character recognition using hidden Markov models for several reasons. First, to perform on-line character recognition using radical HMMs, a character model based on several radical HMMs should be formed from a time sequence of subcharacters, which was not done by Ng and Low. Secondly, Ng and Low break down the characters into four basic constructs or categories of radicals: vertical division, horizontal division, encapsulation and superimposition, and a radical as defined by Ng and Low can appear in more than one of these categories. As a result, a radical can have up to four different shapes and sizes, which has a detrimental effect on the hidden Markov modeling accuracy because the model has to deal with up to four different basic patterns for the four categories.
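The idea that many characters can be represented by a time-ordered sequence drawn from a smaller radical set can be sketched as follows. This is only an illustration: the ten characters, their decompositions, and the names `CHARACTER_TO_RADICALS`, `radical_inventory` and `characters_sharing` are assumptions chosen for the example and are not taken from the patent or from Ng and Low.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: each character maps to its radicals in writing order.
# The entries are illustrative only and do not come from the patent.
CHARACTER_TO_RADICALS: Dict[str, Tuple[str, ...]] = {
    "好": ("女", "子"),
    "妈": ("女", "马"),
    "吗": ("口", "马"),
    "如": ("女", "口"),
    "明": ("日", "月"),
    "朋": ("月", "月"),
    "昌": ("日", "日"),
    "林": ("木", "木"),
    "森": ("木", "木", "木"),
    "品": ("口", "口", "口"),
}


def radical_inventory(lexicon: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Return the distinct radicals needed to spell every character."""
    return sorted({radical for seq in lexicon.values() for radical in seq})


def characters_sharing(lexicon: Dict[str, Tuple[str, ...]], radical: str) -> List[str]:
    """Return every character whose sequence contains the given radical."""
    return sorted(ch for ch, seq in lexicon.items() if radical in seq)


if __name__ == "__main__":
    print(len(CHARACTER_TO_RADICALS), "characters use",
          len(radical_inventory(CHARACTER_TO_RADICALS)), "radical models")
```

In a radical-based recognizer, one model would be stored per entry of the inventory rather than per character, which is where the memory saving over character-level models comes from.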
While the use of subcharacters or radicals to recognize ideographic characters is in some ways desirable, it does not always accurately recognize characters without also recognizing the geometric layout of the subcharacters relative to each other in a character. In a prior approach by Lyon, the use of a size and placement model for subcharacters in an ideographic script has been suggested. See, U.S. patent application Ser. No. 08/315,886, filed Sep. 30, 1994 by Richard F. Lyon, entitled “System and Method for Word Recognition Using Size and Placement Models.” This method uses the relationship between sequential pairs of subcharacters in a character to create a size and placement model. The subcharacter pair models are created by finding the covariance between bounding box features of subcharacter pairs. This method relies on the pen lift which occurs between subcharacters of ideographic characters and thus is only useful for printed ideographic characters and cannot be used for cursively written ideographic characters where there is usually no pen lift between characters.
Thus the prior art, while providing certain benefits for handwriting recognition, does not efficiently recognize cursively written ideographic characters in an on-line manner (for example, in an interactive manner). Moreover, the use of an HMM for a radical having various categories has a detrimental effect upon the accuracy of the HMM procedures. Thus it is desirable to provide improved on-line recognition of cursive handwriting for ideographic scripts.
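A covariance over bounding box features of subcharacter pairs, as described for the Lyon approach, might be computed roughly as in the sketch below. The particular features (centre offsets and size differences) and the function names are assumptions made for illustration; the cited application may define its features differently.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(points: Sequence[Point]) -> Box:
    """Axis-aligned bounding box of the pen samples of one subcharacter."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def pair_features(first: Box, second: Box) -> List[float]:
    """Hypothetical pair features: centre offset and size difference.

    Returns [dx, dy, dw, dh] of the second box relative to the first.
    """
    dx = (second[0] + second[2]) / 2 - (first[0] + first[2]) / 2
    dy = (second[1] + second[3]) / 2 - (first[1] + first[3]) / 2
    dw = (second[2] - second[0]) - (first[2] - first[0])
    dh = (second[3] - second[1]) - (first[3] - first[1])
    return [dx, dy, dw, dh]


def covariance_matrix(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Unbiased sample covariance of feature vectors (one row per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    dim = len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(dim)]
    return [
        [
            sum((s[i] - means[i]) * (s[j] - means[j]) for s in samples) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
```

Computing a bounding box per subcharacter presupposes that the ink is already split into subcharacters, which is why a method of this kind depends on pen lifts when no other segmentation is available.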
SUMMARY OF THE INVENTION
The present invention, in one embodiment, creates an on-line handwriting recognition system for ideographic characters based on subcharacter hidden Markov models (HMMs) that can successfully recognize cursive and print style handwriting. The ideographic characters are modeled using a sequence of subcharacter models (HMMs) and they are also modeled by using the two dimensional geometric layout of the subcharacters within a character. The system includes, in one embodiment, both recognition of radical sequence and recognition of geometric layout of radicals within a character. The subcharacter HMMs are created by following a set of design rules. The combination of the sequence recognition and the geometric layout recognition of the subcharacter models is used to recognize the handwritten character. Various embodiments of the present invention are described below.
In one embodiment of the present invention, a method of recognizing a handwritten character includes the steps of comparing a handwritten input to a first model of a first portion of the handwritten character and comparing the handwritten input to a second model of a second portion of the character, where the second portion of the character has been defined in a model to follow in time the first portion. In a typical embodiment, the first model is a first hidden Markov model and the second model is a second hidden Markov model where the second model is defined to follow the first model in time; typically the first model is processed (e.g. by a Viterbi algorithm) in the system before the second model such that the system can automatically segment the first portion of the character from the second portion of the character, which is useful in the geometric layout recognition of the present invention. In a typical example, the first portion will include a first portion of a recognized radical and the second portion will include a second portion of the same recognized radical, where the first portion is normally written first and then at least another portion of another recognized radical is written and then finally the second portion is written. In this manner, the radical HMMs are separated and order
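The way a Viterbi pass over two time-ordered models can yield a segmentation without relying on pen lifts can be sketched as follows. This is a generic left-to-right Viterbi alignment under stated assumptions, not the patent's implementation: the discrete direction symbols ("R", "D", "L", "U"), the two-state models `FIRST` and `SECOND`, and all probabilities are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# One state = (self-loop probability, emission probabilities per symbol).
State = Tuple[float, Dict[str, float]]
NEG_INF = float("-inf")


def safe_log(p: float) -> float:
    return math.log(p) if p > 0 else NEG_INF


def viterbi_left_to_right(states: List[State], observations: List[str]) -> List[int]:
    """Best state path that starts in state 0, ends in the last state,
    and at each step either stays or advances by one state."""
    n, t_len = len(states), len(observations)
    score = [[NEG_INF] * n for _ in range(t_len)]
    back = [[0] * n for _ in range(t_len)]
    score[0][0] = safe_log(states[0][1].get(observations[0], 0.0))
    for t in range(1, t_len):
        for s in range(n):
            stay = score[t - 1][s] + safe_log(states[s][0])
            move = (score[t - 1][s - 1] + safe_log(1.0 - states[s - 1][0])
                    if s > 0 else NEG_INF)
            best, prev = (stay, s) if stay >= move else (move, s - 1)
            score[t][s] = best + safe_log(states[s][1].get(observations[t], 0.0))
            back[t][s] = prev
    if score[-1][-1] == NEG_INF:
        raise ValueError("no valid path through the concatenated models")
    path = [n - 1]
    for t in range(t_len - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def segment(first: List[State], second: List[State],
            observations: List[str]) -> Tuple[List[str], List[str]]:
    """Split the input at the first frame aligned to the second model."""
    path = viterbi_left_to_right(first + second, observations)
    boundary = next(t for t, s in enumerate(path) if s >= len(first))
    return observations[:boundary], observations[boundary:]


def emit(main: str) -> Dict[str, float]:
    """Hypothetical emission table favouring one pen direction."""
    return {sym: (0.85 if sym == main else 0.05) for sym in "RDLU"}


FIRST: List[State] = [(0.5, emit("R")), (0.5, emit("D"))]   # assumed model of portion 1
SECOND: List[State] = [(0.5, emit("L")), (0.5, emit("U"))]  # assumed model of portion 2

if __name__ == "__main__":
    print(segment(FIRST, SECOND, ["R", "R", "D", "L", "L", "U"]))
```

Because the second model's states can only be entered after the first model's states, the frame at which the best path crosses from one to the other marks the boundary between the two portions, and each portion can then be passed to a layout step.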
Loudon Gareth H.
Pittman James A.
Wu Yi-Min
Apple Computer Inc.
Blakely, Sokoloff, Taylor & Zafman LLP
Chang Jon