Methods and apparatus for associating character codes with...

Image analysis – Pattern recognition

Reexamination Certificate


Details

C382S185000

active

06829386

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to coded character sets. More particularly, the present invention provides methods and apparatus for mapping a coded character set to a second coded character set associated with character attributes. The frame of reference for the present invention is a system that accesses character attributes.
The use of different character coding sets in various software environments has caused incompatibility between computer systems and code ambiguity. Different coding sets are required to represent text, mathematical, scientific, and musical symbols. Specialized character coding sets are needed, for example, to represent Chinese or Japanese characters. Furthermore, codes used to represent one character or symbol in a particular coding set often represent a different character or symbol in another coding set. For example, a code that represents a complete character in one coding set may represent the first byte of a two-byte ideograph in a different coding set.
The growth of the Internet and the need for software that can be used in different environments and platforms has created a push for universal character coding sets. These universal coding sets contain a character set standard that can be used in many different software environments. One example of such a universal coding set is Unicode. Unicode allows assignment of characters to codes ranging from code 0x00 to code 0x10FFFF. The coding space under this definition allows Unicode to represent 1,114,112 different characters. Not surprisingly, however, many of the codes allocated in Unicode are not assigned. Unicode is described in ISO/IEC 10646-1, which is hereby incorporated by reference for all purposes. Aspects of Unicode are also described in the Unicode Technical Standard #6, available from Unicode Inc., and in Bits of Unicode by Mark Davis, available from the Unicode Consortium.
Each character in Unicode, and in other universal coding sets, has a character code. Every character code is associated with a set of character attributes. Character attributes include collation weight, whether the character is printable, whether the character is upper or lower case, which character class the character belongs to, etc. The attributes associated with a character are accessed frequently. For example, when a user types a letter “b” into a computer system, the computer system examines the attributes associated with the character code for “b” to determine whether the character should be displayed on the screen. In another example, when a sort function is used to alphabetize a list of words, the attributes for each character in the list of words are examined to determine how the words are sorted alphabetically.
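As an illustration of the kind of attribute lookup described above, the following minimal sketch reads a few attributes of a character using Python's standard unicodedata module and built-in string methods. The helper name describe and the choice of attributes are assumptions made for this example; they are not the attribute set or the storage layout of the invention.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few attributes for a single character (illustrative only)."""
    return {
        "code": hex(ord(ch)),                       # numeric character code
        "name": unicodedata.name(ch, "<unnamed>"),  # fallback for unnamed codes
        "category": unicodedata.category(ch),       # general category, e.g. 'Ll'
        "printable": ch.isprintable(),              # should it be displayed?
        "upper": ch.isupper(),
        "lower": ch.islower(),
    }


info = describe("b")
print(info["code"], info["category"], info["printable"])  # 0x62 Ll True
```

A sort routine or display routine would consult attributes like these for every character it handles, which is why the cost of each lookup matters.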
In Unicode, each character attribute set usually requires approximately 64 bytes of memory. Consequently, a system associating each allocated character code in Unicode with a character attribute set requires 1,114,112 times 64 bytes, or 71,303,168 bytes, of memory space. Due to this large memory requirement, many computer systems attempt to compress the Unicode data, since many of the 1,114,112 possible character codes and character attribute sets are not used. By compressing this data, significant memory space is saved. However, performing decompression and compression each time a character attribute is accessed can be very inefficient. Other numeric mapping schemes can also consume valuable processing resources or additional memory space.
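The figures quoted above can be checked with a few lines of arithmetic. This sketch only reproduces the numbers in the text; the constant names are chosen for the example, and 64 bytes is the approximate attribute set size stated above.

```python
FIRST_CODE = 0x00
LAST_CODE = 0x10FFFF
ATTRIBUTE_SET_BYTES = 64  # approximate size of one attribute set, per the text

# Number of codes in the coding space, counting both endpoints.
code_space = LAST_CODE - FIRST_CODE + 1

# Memory for a flat table holding one attribute set per possible code.
flat_table_bytes = code_space * ATTRIBUTE_SET_BYTES

print(code_space)                 # 1114112
print(flat_table_bytes)           # 71303168
print(flat_table_bytes / 2**20)   # 68.0 (MiB)
```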
Each of the currently available techniques for mapping or compressing character code sets has disadvantages with regard to at least some of the desirable characteristics of accessing character attributes. It is therefore desirable to provide a system for mapping a character coding set (such as Unicode) to an optimized character coding set in which the mapping system exhibits desirable characteristics as well as or better than the technologies discussed above.
SUMMARY OF THE INVENTION
According to the present invention, methods and apparatus are provided to map a character coding set to an optimized character coding set associated with an attribute set.
A system identifies a character code. This character code may be received from keyboard entry, read from memory, or acquired from an external network, for example. This character code comprises an arrangement of bytes. According to specific embodiments, each byte can be identified as a group, plane, row, or cell. The row is mapped to a corresponding row of an optimized character code. The group, plane, or cell of the character code and the optimized character code can be the same. Optionally, the plane, group, and cell are mapped to corresponding planes, groups, and cells of the optimized character code.
Each of the groups, planes, rows, and cells of character codes and optimized character codes can be a value identified by a particular arrangement of bits. In Unicode, the value of each group, plane, row, or cell is equivalent to one byte in a character code. Alternatively, the group, plane, row, or cell can be a value identified by any arrangement of bits in the character code and can be mapped to a different arrangement of bits in the optimized character code.
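The byte-wise decomposition described above can be sketched as follows. The sketch assumes a 32-bit character code whose four bytes are, from most to least significant, the group, plane, row, and cell, as in the four-octet form of ISO/IEC 10646. The names CodeParts, split_code, and join_code are hypothetical and chosen for this example only.

```python
from typing import NamedTuple


class CodeParts(NamedTuple):
    group: int
    plane: int
    row: int
    cell: int


def split_code(code: int) -> CodeParts:
    """Split a 32-bit character code into its four one-byte fields."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("character code must fit in 32 bits")
    return CodeParts(
        group=(code >> 24) & 0xFF,  # most significant byte (assumed order)
        plane=(code >> 16) & 0xFF,
        row=(code >> 8) & 0xFF,
        cell=code & 0xFF,           # least significant byte
    )


def join_code(parts: CodeParts) -> int:
    """Reassemble the four fields into a single character code."""
    return (parts.group << 24) | (parts.plane << 16) | (parts.row << 8) | parts.cell


parts = split_code(0x4E2D)
print(parts)  # CodeParts(group=0, plane=0, row=78, cell=45)
```

An embodiment that uses a different arrangement of bits for each field would change only the shifts and masks.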
One aspect of the invention provides a method for mapping character codes to optimized character codes associated with character attributes. The method may be characterized by the following sequence: (1) receiving a character code having a string of bits; (2) identifying a first subset of bits in the character code, wherein the first subset of bits identifies a first row; and (3) mapping the first row to a second row associated with an optimized character code in an optimized character code index, wherein mapping the first row identifies an optimized character code for the received character code.
A second subset of bits in the character code can be identified as a first plane. The first plane can be mapped to a second plane associated with an optimized character code.
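The sequence above (receive a code, identify the row bits, map the row, and optionally map the plane) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the row and plane tables are plain dictionaries, the bit positions follow the byte layout of ISO/IEC 10646, and the function name map_code and the sample table contents are hypothetical.

```python
from typing import Dict, Optional


def map_code(
    code: int,
    row_map: Dict[int, int],
    plane_map: Optional[Dict[int, int]] = None,
) -> Optional[int]:
    """Map a character code to an optimized code, or None if it is unmapped."""
    plane = (code >> 16) & 0xFF  # second subset of bits: the plane
    row = (code >> 8) & 0xFF     # first subset of bits: the row
    cell = code & 0xFF           # the cell is carried over unchanged

    new_row = row_map.get(row)
    if new_row is None:
        return None              # row is not present in the optimized index

    if plane_map is None:
        new_plane = plane        # plane kept the same
    else:
        new_plane = plane_map.get(plane)
        if new_plane is None:
            return None

    return (new_plane << 16) | (new_row << 8) | cell


# Hypothetical row table: only three rows are in use, packed into rows 0..2.
row_map = {0x00: 0, 0x4E: 1, 0xFF: 2}

print(hex(map_code(0x0062, row_map)))  # 0x62
print(hex(map_code(0x4E2D, row_map)))  # 0x12d
print(map_code(0x1234, row_map))       # None
```

With this hypothetical table, an attribute table indexed by the optimized code needs only 3 rows of 256 cells, which at 64 bytes per attribute set is 49,152 bytes, and each lookup costs one table access rather than a decompression step.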
Another aspect of the invention provides an apparatus for mapping character codes to optimized character codes. The apparatus may be characterized by the following features: (1) memory; (2) an input mechanism for receiving a character code; and (3) one or more processors coupled with the memory, the processors configured to identify a first subset of bits in the character code, wherein the first subset of bits identifies a first row, and to map the first row to a second row associated with an optimized character code in an optimized character code index, wherein mapping the first row identifies an optimized character code for the received character code.
The one or more processors can be further configured to identify a second subset of bits in the character code, wherein the second subset of bits identifies a first plane. The one or more processors can also map the first plane to a second plane associated with an optimized character code in the optimized character code index.
Another aspect of the invention pertains to computer program products including a machine readable medium on which are stored program instructions, tables or lists, and/or data structures for implementing a method as described above. Any of the methods, tables, or data structures of this invention may be represented as program instructions that can be provided on such computer readable media.
A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.


REFERENCES:
patent: 5889481 (1999-03-01), Okada
patent: 6204782 (2001-03-01), Gonzalez et al.
patent: 6422476 (2002-07-01), Ackley
Erickson “Options for presentation of multilingual text: use of the Unicode standard”, digital information associate research project, pp. 1-24, 1997.
