Reexamination Certificate
2001-02-22
2004-06-15
Shah, Sanjiv (Department: 2172)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000
active
06751605
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to a pattern recognition apparatus for recognizing input patterns and displaying recognized results. More particularly, the invention relates to a pattern recognition apparatus into which predetermined character strings, such as addresses and fixed phrases, are input by handwriting.
A majority of applications for processing of slips, invoices and other forms by so-called pen PC's (pen-input computers) primarily involve inputting addresses and fixed phrases to the apparatus. Three representative methods have been proposed to have predetermined character strings, such as addresses and fixed phrases, entered: (1) choose from among candidates in a menu format; (2) in a menu-and-character recognition combination format, input a ZIP code to generate a menu-display of candidate addresses to choose from; (3) write by hand characters to be recognized so that their candidates are optimized by use of a word dictionary.
The method (1) above is disclosed illustratively in “Recognition of Handwritten Addresses in Unframed Setup Allowing for Character Position Displacements” (Periodical D-2 of the Institute of Electronics, Information and Communication Engineers of Japan, January, 1994). The method generally involves, given hierarchical data such as addresses, selecting candidate data successively from the top through the bottom layers of the hierarchy. For example, “Ibaraki-ken” (a prefecture in Japan) may be followed by “Hitachi-shi” (a city), which in turn can be followed by “Oomika-cho” (a town). One disadvantage of this method is that if a user is not certain whether Hitachi-shi is located in, say, Tochigi-ken or Ibaraki-ken (i.e., the prefectural or topmost category), the user has difficulty selecting, for example, Hitachi-shi.
With the method (2) above, the user need only input a ZIP code, and the system will give a menu-display of code-prompted addresses to choose from. The procedure is relatively simple so long as the user remembers all necessary ZIP codes; however, they can be difficult to memorize except probably for the user's own ZIP code.
The method (3) above allows handwritten characters to be recognized and their candidates to be optimized through the use of a word dictionary. How this method works is outlined below with reference to some of the accompanying drawings.
FIG. 3 is a schematic block diagram of a conventional character recognition apparatus. In FIG. 3, a handwritten pattern input through a tablet a1 is pattern-matched with a recognition dictionary a2 in a character recognition process a3. Candidate characters thus obtained are matched in words with a word dictionary a6 in a word correlation process a7. Following the word matching, the applicable words are displayed on an LCD a8.
FIG. 4 is a schematic flow diagram showing how a conventional character recognition apparatus is used to input an address. For example, to input “Ibaraki-ken” (prefecture), “Hitachi-shi” (city), “Oomika-cho” (town), the user writes all of these characters by hand into a predetermined address input area b1. The handwritten characters are then recognized in the process a3. Candidate characters obtained from the recognition process are matched in words with the word dictionary a6, starting from the highest layer category (i.e., prefectural level). The candidate characters are thus optimized and the results are output as candidate characters.
Conventionally, hierarchical data such as addresses are accessed from the highest hierarchical layer down. This is because the higher a layer is in the hierarchy, the smaller the amount of data it stores, so that once the highest-layer candidate is determined, the lower candidates are readily inferred from it. But suppose that the conventional system receives a keyword “Oomika-cho” (a town) for a search through the word dictionary. In that case, the system has no choice but to search through the entire word dictionary, which may be as large as 1.5 MB, because the layer of the input keyword is unknown. This scheme is thus impractical in applications such as online character recognition, where high degrees of responsiveness are required.
A typical word dictionary that stores addresses in Japan may be constituted as follows:
Prefectural names:
about 50 names×about 3 characters per name×2 bytes per character=about 300 B in capacity
Cities and towns:
about 4,000 names×about 3 characters per name×2 bytes per character=about 24 kB in capacity
Subordinate municipalities:
about 160,000 names×about 4 characters per name×2 bytes per character=about 1.3 MB in capacity
The total volume of data in such a representative dictionary is about 1.5 MB.
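The per-layer estimate can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the layer labels and the `layer_bytes` helper are names introduced here, the counts are the approximate figures quoted above, and any index or pointer overhead a real dictionary would carry is ignored.

```python
# Approximate figures quoted in the text: (number of names, characters per name).
LAYERS = {
    "prefectures": (50, 3),
    "cities_and_towns": (4_000, 3),
    "subordinate_municipalities": (160_000, 4),
}
BYTES_PER_CHAR = 2  # two bytes per character, as stated in the text


def layer_bytes(names: int, chars_per_name: int) -> int:
    """Raw character storage for one layer (hypothetical helper, no overhead)."""
    return names * chars_per_name * BYTES_PER_CHAR


sizes = {layer: layer_bytes(*shape) for layer, shape in LAYERS.items()}
total = sum(sizes.values())

for layer, size in sizes.items():
    print(f"{layer}: {size:,} B")
print(f"total: {total:,} B")
```

On these figures the raw character data alone comes to roughly 1.3 MB, the same order as the total quoted above, and the lowest layer accounts for nearly all of it. That is why a search that cannot be confined to an upper layer is costly.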
One disadvantage of the above conventional method is the chore of writing the entire desired address by hand, which can be as long as, say, “Ibaraki-ken, Hitachi-shi, Oomika-cho”.
One problem common to all three methods (1) through (3) outlined above is that in character recognition applications, the user is subject to the tedious task of writing by hand all character strings such as addresses and fixed phrases. Another common problem is that a search through the word dictionary for a word in any layer other than the topmost layer of the hierarchy can take a very long time. A further problem is that in a menu-driven environment with a hierarchical data structure, such as one made up of addresses, lower-layer items cannot be selected unless the items above them in the hierarchy are known.
It is therefore an object of the present invention to provide a pattern recognition apparatus that accepts only key characters written by hand (e.g., “Oomika” or “~Mika-cho”) in order to infer the remaining character string (e.g., “Ibaraki-ken, Hitachi-shi”), whereby the entire recognized character string is output (e.g., “Ibaraki-ken, Hitachi-shi, Oomika-cho”).
SUMMARY OF THE INVENTION
In carrying out the invention and according to one aspect thereof, there is provided a character recognition apparatus having recognition means for recognizing input character strings and display means for displaying recognized results, the character recognition apparatus comprising: a word dictionary storing word identification information and hierarchy information for layering a plurality of words into a hierarchy and for recognizing each of the words within the hierarchy; a character transition probability table storing at least probabilities of transitions from any one character to another, and those pieces of the word identification information which correspond to combinations of characters resulting from the transitions; optimization means for using the character transition probability table in optimizing candidate character strings obtained by the recognition means; and retrieval means for searching through the word dictionary for words defined by those pieces of the word identification information which correspond to the optimized candidate character string, thereby retrieving the searched words which are identified by the applicable pieces of the hierarchy information and which have yet to be input.
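One way to picture the optimization step is as a search for the most probable path through the candidate characters produced for each written position. The sketch below is an assumption-laden illustration, not the claimed implementation: the text does not specify the search algorithm, so a Viterbi-style dynamic program is used here, and the table contents, the romanized stand-ins for characters, the word IDs, the `FLOOR` value, and the `optimize` function are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical table: (previous char, next char) -> (probability, word IDs).
# Romanized syllables stand in for individual handwritten characters.
TRANSITIONS: Dict[Tuple[str, str], Tuple[float, Set[int]]] = {
    ("oo", "mi"): (0.6, {3}),
    ("mi", "ka"): (0.7, {3}),
    ("ka", "cho"): (0.8, {3}),
    ("oo", "me"): (0.1, {7}),
    ("me", "ka"): (0.2, {7}),
}
FLOOR = 1e-6  # assumed probability for transitions missing from the table


def optimize(candidates: List[List[str]]) -> Tuple[List[str], Set[int]]:
    """Pick the most probable path through per-position candidate characters."""
    # best[c] = (probability of the best path ending in c, that path)
    best = {c: (1.0, [c]) for c in candidates[0]}
    for position in candidates[1:]:
        step = {}
        for cur in position:
            step[cur] = max(
                (
                    (prob * TRANSITIONS.get((prev, cur), (FLOOR, set()))[0], path + [cur])
                    for prev, (prob, path) in best.items()
                ),
                key=lambda item: item[0],
            )
        best = step
    _, path = max(best.values(), key=lambda item: item[0])
    # Keep only the word IDs shared by every transition on the winning path.
    ids = [TRANSITIONS.get(pair, (FLOOR, set()))[1] for pair in zip(path, path[1:])]
    word_ids = set.intersection(*ids) if ids else set()
    return path, word_ids


# Each inner list holds the recognizer's candidates for one written character.
path, word_ids = optimize([["oo", "ta"], ["me", "mi"], ["ka", "ga"], ["cho", "machi"]])
print(path, word_ids)
```

Storing word identification information next to each transition means the optimized string arrives with the dictionary entries it can belong to, so the retrieval step does not need to scan the whole dictionary.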
When characters of a low hierarchical level such as “Oomika-cho” alone are input, the inventive character recognition apparatus outlined above first extracts “Oomika-cho” as the candidate character string optimized by the optimization means. The word dictionary is then searched for higher-level words on the basis of the word identification information corresponding to the optimized character string. The search yields yet-to-be input words “Ibaraki-ken, Hitachi-shi,” higher in hierarchy than the input “Oomika-cho.” The recognized result is “Ibaraki-ken, Hitachi-shi, Oomika-cho,” the entire character string made up of the entered and unentered words.
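The retrieval of yet-to-be-input higher-level words can be sketched as a walk up parent links in the word dictionary. This is a minimal illustration under stated assumptions: the `Entry` record, the integer layer encoding, the `parent_id` field, the word IDs, and the `complete` function are hypothetical names chosen here, since the text says only that the dictionary stores word identification information and hierarchy information.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    word: str
    layer: int                # 0 = prefecture, 1 = city, 2 = town (assumed encoding)
    parent_id: Optional[int]  # word ID one layer up; None at the top layer


# Hypothetical word dictionary keyed by word ID.
WORD_DICTIONARY: Dict[int, Entry] = {
    1: Entry("Ibaraki-ken", 0, None),
    2: Entry("Hitachi-shi", 1, 1),
    3: Entry("Oomika-cho", 2, 2),
}


def complete(word_id: int) -> List[str]:
    """Follow parent links upward, then return the words from top layer to bottom."""
    chain: List[str] = []
    current: Optional[int] = word_id
    while current is not None:
        entry = WORD_DICTIONARY[current]
        chain.append(entry.word)
        current = entry.parent_id
    return chain[::-1]


# Only the lowest-level word was written; the upper layers are inferred.
print(", ".join(complete(3)))
```

Because each entry points only to its parent, completing an address takes one lookup per layer, regardless of how large the lowest layer is.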
According to another aspect of the invention, there is provided a character recognition apparatus having recognition means for recognizing input character strings and display means for displaying recognized results,
Gunji Keiko
Katsura Koyo
Kuzunuki Soshiro
Miura Masaki
Yokota Toshimi
Antonelli Terry Stout & Kraus LLP
Hitachi, Ltd.
Shah Sanjiv