Coded data generation or conversion – Digital code to digital code converters – To or from variable length codes
Reexamination Certificate
1999-07-06
2001-01-09
Horabik, Michael (Department: 2735)
C341S023000, C341S106000, C379S368000, C379S052000
Reexamination Certificate
active
06172625
ABSTRACT:
FIELD OF THE INVENTION
This invention relates to methods and apparatus for disambiguation of ambiguous data entry and it relates to compression of dictionary information for efficient storage in data entry devices such as handwriting recognition devices, reduced keyboard disambiguation devices, speech recognition devices and the like.
BACKGROUND OF THE INVENTION
In the field of data entry devices such as devices that use handwriting and speech recognition and other data entry techniques, there is a need to store extensive volumes of data to assist in recognition, disambiguation or word selection and processing. In the world of mobile computing and mobile communications, memory space is very limited or is expensive and there is a need to minimize the space occupied by such data.
In the field of data entry, arrangements are known (for example as described in U.S. patent application Ser. No. 08/754,453 of Balakrishnan, filed on Nov. 21, 1996, assigned to the assignee of the present invention and incorporated herein by reference) in which a reduced keyboard or keypad is used for character entry where each key ambiguously represents more than one character and disambiguation software is used to disambiguate a key entry to identify the probable intended key from the various ambiguous possibilities. In such a scheme, dictionary, word or n-gram data is necessary to perform the disambiguation. Large amounts of data are required to enable satisfactory disambiguation.
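The reduced-keypad arrangement described above can be sketched as follows. This is a minimal illustration, not the method of the referenced application: the 8-key letter layout, the toy dictionary, its frequency counts, and the names `key_sequence`, `build_index`, and `disambiguate` are all assumptions introduced for the example.

```python
# Hypothetical keypad layout (assumed; the text does not specify one).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def key_sequence(word):
    """Map a word to the ambiguous key sequence that would type it."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(dictionary):
    """Group dictionary words by key sequence, most frequent first."""
    index = {}
    for word, freq in dictionary.items():
        index.setdefault(key_sequence(word), []).append((freq, word))
    for candidates in index.values():
        # Sort by descending frequency, then alphabetically for stable ties.
        candidates.sort(key=lambda fw: (-fw[0], fw[1]))
    return index


def disambiguate(keys, index):
    """Return candidate words for a key sequence, most probable first."""
    return [word for _, word in index.get(keys, [])]


# Toy dictionary with made-up frequency counts (illustrative only).
TOY_DICTIONARY = {
    "good": 50, "home": 40, "gone": 30, "hood": 5,
    "cat": 20, "act": 15, "bat": 10,
}
INDEX = build_index(TOY_DICTIONARY)

if __name__ == "__main__":
    print(disambiguate("4663", INDEX))
    print(disambiguate("228", INDEX))
```

Even this toy dictionary shows why the stored data matters: four different words share the key sequence 4663, and only the dictionary and its frequency information can rank them.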
Data compression techniques exist for purposes such as bulk data storage. An example is gzip compression, which is suitable for compression of alphabetical text, and is explained here by way of background. The Roman alphabet comprises 26 letters a through z, which can readily be represented as a byte of eight bits of data. Eight bits of data allow one bit for a start-of-word indicator and 128 characters (2^7). Accordingly, such a scheme has 102 unused byte values (unused in the sense of being unnecessary for coding of 26 characters). In gzip compression, the additional 102 byte values are used to encode character pairs. By way of example, if it is assumed that values 0-25 are used for a through z, value 26 can be assigned to mean “ba”, value 27 to mean “ca” etc., using 102 character pairs selected as the most common character pairs in the language in question (e.g. American English).
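The pair-coding scheme as this paragraph describes it can be sketched as follows. It follows the description given here rather than the DEFLATE algorithm used by the gzip utility. Only the assignments of 26 to "ba" and 27 to "ca" come from the text; the other pairs in `PAIRS`, the use of the top bit (`0x80`) as the start-of-word flag, and the function names are assumptions made for illustration, and the pair list is a placeholder rather than real frequency data.

```python
import string

LETTERS = string.ascii_lowercase          # codes 0..25 stand for a..z
MAX_PAIRS = 128 - len(LETTERS)            # 102 spare 7-bit values
START_OF_WORD = 0x80                      # assumed: top bit flags a new word

# "ba" -> 26 and "ca" -> 27 follow the text; the rest are placeholders.
PAIRS = ["ba", "ca", "th", "he", "in", "er", "an"]
if len(PAIRS) > MAX_PAIRS:
    raise ValueError("at most 102 character pairs fit in the spare codes")

PAIR_CODE = {pair: len(LETTERS) + i for i, pair in enumerate(PAIRS)}
CODE_TEXT = {i: ch for i, ch in enumerate(LETTERS)}
CODE_TEXT.update({code: pair for pair, code in PAIR_CODE.items()})


def encode_words(words):
    """Encode non-empty lowercase a-z words, one byte per letter or pair."""
    out = bytearray()
    for word in words:
        i = 0
        first = True
        while i < len(word):
            pair = word[i:i + 2]
            if pair in PAIR_CODE:         # greedy: prefer a pair code
                code = PAIR_CODE[pair]
                i += 2
            else:
                code = LETTERS.index(word[i])
                i += 1
            if first:
                code |= START_OF_WORD     # mark the first byte of each word
                first = False
            out.append(code)
    return bytes(out)


def decode_words(data):
    """Invert encode_words, splitting words at the start-of-word flag."""
    words = []
    for byte in data:
        text = CODE_TEXT[byte & 0x7F]
        if byte & START_OF_WORD or not words:
            words.append(text)
        else:
            words[-1] += text
    return words


if __name__ == "__main__":
    sample = ["bat", "cat", "the"]
    packed = encode_words(sample)
    print(len(packed), decode_words(packed))
```

With this placeholder table, the nine letters of "bat", "cat", and "the" pack into six bytes, and no separator bytes are needed because the flag bit marks each word boundary.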
There is a need for an improved method of storage of dictionary or other data suitable for data entry disambiguation.
REFERENCES:
patent: 4817129 (1989-03-01), Riskin
patent: 5101487 (1992-03-01), Zalenski
patent: 5128672 (1992-07-01), Kaehler
patent: 5200988 (1993-04-01), Riskin
patent: 5253053 (1993-10-01), Chu
patent: 5488366 (1996-01-01), Wu
patent: 5572208 (1996-11-01), Wu
patent: 5589829 (1996-12-01), Astle
patent: 5703581 (1997-12-01), Matias
patent: 5768445 (1998-06-01), Troeller
patent: 5786776 (1998-07-01), Kisaichi
patent: 5818437 (1998-10-01), Grover
patent: 6034958 (2000-03-01), Wicklund
patent: 6091853 (2000-07-01), Otto
Balakrishnan Sreeram
Jin Guo
Bose Romi N.
Horabik Michael
Motorola Inc.
Wong Albert K.