System and method for using a correspondence table to...

Data processing: speech signal processing – linguistics – language – Linguistics

Reexamination Certificate

Details

C704S010000

active

06178397

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to data compression, and more particularly to a system and method using correspondence techniques to compress a pronunciation guide.
2. Description of the Background Art
Computer Random Access Memory (RAM) and disk space are becoming more available and affordable in desktop computer systems. A typical desktop computer system currently provides on the order of sixteen megabytes of RAM and one gigabyte of hard disk memory. This increasing availability allows programmers the freedom to create application programs and data files which occupy several megabytes of computer memory. However, minimizing the size of data files remains important for optimizing system performance and use of memory resources.
To minimize storage requirements, programmers compress large data files. One type of large file is a pronunciation dictionary, which includes dictionary words for a language such as American English and dictionary phonemes (phonetic sounds) representing the pronunciation of each of the dictionary words. A typical uncompressed pronunciation dictionary occupies up to about ten megabytes of memory.
Information such as a pronunciation dictionary can be compressed using certain symbols to replace redundant data. For example, a typical compression technique assigns symbols to represent particular patterns of redundant data such as multiple zeros or ones. Multiple compression techniques may be performed successively to eliminate more redundancies and compress data further. Accordingly, a pronunciation dictionary may be compressed to around thirty percent or less of its original size.
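As an illustration of replacing redundant data with symbols, the following minimal Python sketch applies run-length encoding to a string of zeros and ones. Run-length encoding is chosen here only as a familiar example of the general idea; the source does not say which conventional technique is used, and the function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(bits):
    """Replace each run of identical characters with a (character, run length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(bits)]


def run_length_decode(pairs):
    """Expand (character, run length) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


encoded = run_length_encode("0000000011110000")
restored = run_length_decode(encoded)
```

Sixteen characters are reduced to three pairs here, and decoding restores the input exactly. Real data with short runs would compress less well.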
Previous techniques for compressing pronunciation dictionaries do not take into account redundancies inherent in dictionary words and dictionary phonemes. Therefore, as an addition to other techniques for compressing a pronunciation dictionary, it is desirable to have a system and method for taking advantage of redundancies in pronunciation.
SUMMARY OF THE INVENTION
The present invention overcomes limitations and deficiencies of previous systems by providing a new system and method for compressing a pronunciation guide such as a pronunciation dictionary. The system substitutes a single symbol for some text and its pronunciation, and includes a central processing unit (CPU) and memory. The memory stores a compression system including parsing routines, a correspondence table, a matching system, a decoder table and a decoder system. The parsing routines extract a dictionary entry, which comprises a dictionary word and corresponding dictionary phonemes representing the pronunciation of the dictionary word, from an uncompressed pronunciation dictionary also stored in the memory. The correspondence table is made up of correspondence sets, each of which has a text entry, a phoneme entry representing the pronunciation of the text entry, and a set-identifying symbol (i.e., a number). The matching system attempts to find all correspondence sets that match text and phoneme combinations of the dictionary entry.
If matches are found, then the matching system selects the best matches and adds the corresponding correspondence symbols to a compressed pronunciation dictionary. If a match is not found, then the matching system treats characters as silent and/or phonemes as unmatched, and assigns special symbols to be added to the compressed pronunciation dictionary. The matching system adds decoder code sets to a decoder table for translating the special symbols back to characters or phonemes.
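The matching step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the table contents, the phoneme labels, the cost values, and the name `compress_entry` are all hypothetical, and "best matches" is interpreted here as the segmentation with the lowest total cost, where a correspondence set costs 1 and a special symbol costs 2.

```python
from functools import lru_cache

# Hypothetical correspondence table: symbol -> (text entry, phoneme entry).
# Symbols are assumed to be the consecutive integers 0..len(table)-1.
CORRESPONDENCE_TABLE = {
    0: ("c", ("k",)),
    1: ("a", ("ae",)),
    2: ("t", ("t",)),
    3: ("n", ("n",)),
    4: ("i", ("ay",)),
    5: ("f", ("f",)),
    6: ("at", ("ae", "t")),
}


def compress_entry(word, phonemes, table, decoder_table):
    """Encode one dictionary entry as a list of symbols.

    Unmatched characters are treated as silent and unmatched phonemes get
    their own special symbols; both kinds are recorded in decoder_table.
    """
    phonemes = tuple(phonemes)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Returns (cost, steps) for encoding word[i:] against phonemes[j:].
        if i == len(word) and j == len(phonemes):
            return 0, ()
        options = []
        for symbol, (text, phons) in table.items():
            if word.startswith(text, i) and phonemes[j:j + len(phons)] == phons:
                cost, steps = best(i + len(text), j + len(phons))
                options.append((cost + 1, (("set", symbol),) + steps))
        if i < len(word):  # treat the character as silent
            cost, steps = best(i + 1, j)
            options.append((cost + 2, (("silent", word[i]),) + steps))
        if j < len(phonemes):  # treat the phoneme as unmatched
            cost, steps = best(i, j + 1)
            options.append((cost + 2, (("phoneme", phonemes[j]),) + steps))
        return min(options, key=lambda option: option[0])

    symbols = []
    for kind, value in best(0, 0)[1]:
        if kind == "set":
            symbols.append(value)
            continue
        # Reuse an existing special symbol for this code or allocate a new one.
        code = (kind, value)
        special = next((s for s, c in decoder_table.items() if c == code), None)
        if special is None:
            special = len(table) + len(decoder_table)
            decoder_table[special] = code
        symbols.append(special)
    return symbols


decoder_table = {}
cat = compress_entry("cat", ["k", "ae", "t"], CORRESPONDENCE_TABLE, decoder_table)
knife = compress_entry("knife", ["n", "ay", "f"], CORRESPONDENCE_TABLE, decoder_table)
```

In this sketch "cat" is encoded with two symbols because the longer set ("at", "ae t") is preferred over two single-letter sets, while the "k" and "e" of "knife" have no matching phonemes and are stored as silent characters through new decoder table entries.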
The decoder system uses the compressed pronunciation dictionary and decoder code sets to generate corresponding phonemes for selected text. These phonemes can be used in processes such as speech recognition, speech synthesis, language translation, foreign language learning, spell checking, etc.
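A decoder of this kind might look like the following Python sketch. The data layout is an assumption made for illustration: each compressed entry is a list of integer symbols, ordinary symbols are looked up in a correspondence table, and special symbols are looked up in a decoder table whose code sets are ("silent", character) or ("phoneme", phoneme) pairs. All names and table contents are hypothetical.

```python
# Hypothetical correspondence table: symbol -> (text entry, phoneme entry).
TABLE = {
    0: ("c", ("k",)),
    3: ("n", ("n",)),
    4: ("i", ("ay",)),
    5: ("f", ("f",)),
    6: ("at", ("ae", "t")),
}

# Hypothetical decoder code sets for special symbols.
DECODER_TABLE = {
    7: ("silent", "k"),
    8: ("silent", "e"),
}

# Hypothetical compressed pronunciation dictionary: one symbol list per entry.
COMPRESSED = [[0, 6], [7, 3, 4, 5, 8]]


def decode_entry(symbols, table, decoder_table):
    """Rebuild (word, phonemes) from one compressed entry."""
    text, phonemes = [], []
    for symbol in symbols:
        if symbol in table:
            chars, phons = table[symbol]
            text.append(chars)
            phonemes.extend(phons)
        else:
            kind, value = decoder_table[symbol]
            if kind == "silent":
                text.append(value)      # character with no phoneme
            else:
                phonemes.append(value)  # phoneme with no character
    return "".join(text), phonemes


def phonemes_for(word, compressed, table, decoder_table):
    """Return the phonemes for word, or None if it is not in the dictionary."""
    for symbols in compressed:
        text, phonemes = decode_entry(symbols, table, decoder_table)
        if text == word:
            return phonemes
    return None


result = phonemes_for("knife", COMPRESSED, TABLE, DECODER_TABLE)
```

The linear scan over every entry keeps the sketch short; a practical decoder would presumably index the compressed entries so that a lookup does not decode the whole dictionary.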
The present invention provides a method for compressing a pronunciation dictionary. The method creates a correspondence table composed of correspondence sets, determines which correspondence sets match a dictionary word and its corresponding dictionary phonemes, and adds the matching correspondence symbols as compressed data entries to a compressed pronunciation dictionary. The invention also provides a method for using the compressed dictionary and decoder code sets to generate phonemes from input text.


REFERENCES:
patent: 4779080 (1988-10-01), Coughin et al.
patent: 5333313 (1994-07-01), Heising
patent: 5530645 (1996-06-01), Chu
patent: 5649221 (1997-07-01), Crawford et al.
patent: 5673362 (1997-09-01), Matsumoto
patent: 5799276 (1998-08-01), Komissarchik et al.
patent: 5930756 (1999-07-01), Mackie et al.
