Compact encoding of multi-lingual translation dictionaries

Data processing: speech signal processing – linguistics – language – Linguistics – Multilingual or national language support

Patent

Details

704/2, 704/9, 704/10, 707/101, 707/102, 707/5, 707/532, 707/536, G06F 17/28, G06F 17/30

active

057873860

ABSTRACT:
A computerized multilingual translation dictionary includes a set of words and phrases for each of the languages it contains, plus a mapping that indicates, for each word or phrase in one language, what the corresponding translations in the other languages are. The set of words and phrases for each language is divided among corresponding concept groups based on an abstract pivot language. The words and phrases are encoded as token numbers assigned by a word-number mapper and laid out in a sequence that can be searched fairly rapidly with a simple linear scan. The complex associations of words and phrases to particular pivot-language senses are represented by including a list of pivot-language sense numbers with each word or phrase. The preferred coding of these sense numbers is a bit vector for each word, where each bit corresponds to a particular pivot element in the abstract language and the bit is ON if the given word is a translation of that pivot element. Determining whether a word in language 1 translates to a word in language 2 then requires only a bitwise intersection of their associated bit vectors: the words are translations of each other if the intersection is non-empty. Each word or phrase is prefixed by its bit-vector token number, so the bit-vector tokens do double duty: they also act as separators between the tokens of one phrase and those of another. A pseudo-Huffman compression scheme is used to reduce the size of the token stream. Because of the frequency skew of the bit-vector tokens, this produces a very compact encoding.
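The bit-vector test and the separator role of the bit-vector tokens can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the patent's actual layout: the pivot senses, the sample entries, the tagged-tuple token representation, and the function names sense_vector, encode, scan and translate are all hypothetical. The pseudo-Huffman compression step is left out, and words are stored as plain strings rather than token numbers.

```python
# Hypothetical pivot-language senses; each sense owns one bit position.
PIVOT_SENSES = ["DOG", "HOUSE", "HOME"]
SENSE_BIT = {name: 1 << i for i, name in enumerate(PIVOT_SENSES)}


def sense_vector(*senses):
    """Build a bit vector with one bit ON per pivot sense the entry translates."""
    vec = 0
    for sense in senses:
        vec |= SENSE_BIT[sense]
    return vec


def encode(entries):
    """Flatten (bit vector, words) entries into one token stream.

    Each entry is prefixed by its bit-vector token, so a "V" token also
    marks the boundary between one phrase and the next.
    """
    stream = []
    for vec, words in entries:
        stream.append(("V", vec))
        stream.extend(("W", word) for word in words)
    return stream


def scan(stream):
    """Linear scan over the stream, yielding (bit vector, phrase) pairs."""
    vec, words = None, []
    for kind, value in stream:
        if kind == "V":
            if vec is not None:
                yield vec, " ".join(words)
            vec, words = value, []
        else:
            words.append(value)
    if vec is not None:
        yield vec, " ".join(words)


def translate(phrase, source_stream, target_stream):
    """Return target phrases whose bit vector intersects the source phrase's."""
    source_vec = 0
    for vec, candidate in scan(source_stream):
        if candidate == phrase:
            source_vec |= vec
    return [p for vec, p in scan(target_stream) if vec & source_vec]


# Illustrative sample data (assumed, not taken from the patent).
ENGLISH = encode([
    (sense_vector("DOG"), ["dog"]),
    (sense_vector("HOUSE"), ["house"]),
    (sense_vector("HOME"), ["home"]),
])
FRENCH = encode([
    (sense_vector("DOG"), ["chien"]),
    (sense_vector("HOUSE", "HOME"), ["maison"]),
    (sense_vector("HOME"), ["chez", "soi"]),
])

print(translate("home", ENGLISH, FRENCH))    # ['maison', 'chez soi']
print(translate("maison", FRENCH, ENGLISH))  # ['house', 'home']
```

In this sketch a word with several senses, such as "maison", simply has several bits ON, and a multi-word phrase such as "chez soi" needs no explicit terminator because the next bit-vector token ends it.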
