Translation method and system

Data processing: speech signal processing – linguistics – language – Linguistics – Translation machine

Reexamination Certificate

Details

C704S007000

active

06182027

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to a translation system for which high speed processing is required, and in particular to a translation method and system for improving the accuracy of the selection of an appropriate word in machine translation, without incurring any deterioration of the processing efficiency.
2. Description of the Related Art
As a consequence of the World Wide Web (WWW) expansion of the Internet, opportunities for using documents expressed in foreign languages have increased. Further, since many users desire to scan documents in their native languages, there is a growing demand for low priced machine translation software. However, the quality of the text provided by current machine translation software is unsatisfactory, and there are many translation errors.
Because a translation system used over an Internet connection must initiate a translation process in real time, high speed processing is required, and complicated procedures such as deep semantic analysis are difficult to perform. Generally, therefore, such a system is equipped with a dictionary to reduce the number of unknown words, and for document scanning, more or less ambiguous translations are prepared that are at least prevented from straying too far from the point. To avoid complicating the process and to increase the accuracy of a translation, the data structure of such a dictionary tends to be relatively simple, and word translations tend to be registered not only as individual word units (e.g., a single word dictionary), but also as compound word units (e.g., a compound word dictionary). Since the simple data structure has a poor word selection function, when a translation is registered for a compound word unit, selecting that translation frequently results in a better translation.
Further, sentences are generally translated individually, in isolation from one another. As a result, a specific word that is used repeatedly at a plurality of locations in the same text may be given a plurality of translations. For one location, a translation may be selected from an entry in a single word dictionary, and for another location a translation may be selected from an entry in a compound word dictionary.
To resolve this problem, according to a machine translation method disclosed in Japanese Unexamined Patent Publication No. Hei 3-135666, in a translation process information concerning a translation that is obtained as the result of a dictionary search is saved in a main memory, and is re-used for the same word, so that the time spent searching a dictionary located in an auxiliary storage device is saved and so that the translation of the word is consistent. With this method, however, when an incorrect translation is first selected for a word, the incorrect translation is used in all the locations in a document at which that word appears.
In a method for processing a plurality of sentences, which is disclosed in Japanese Unexamined Patent Publication No. Hei 2-228765, for a document consisting of a plurality of sentences, the inherent ambiguity of each sentence is calculated and translation is initiated for the least ambiguous sentence. The results obtained for a polysemous word in a preceding sentence are used for a succeeding sentence in order to increase the accuracy in the selection of an appropriate translation and in order to provide a consistent translation. This method, however, is premised on the assumption that a translation will be output after all the sentences in the document have been processed, and thus it cannot be employed for a process by which sentences are successively translated from the beginning of a document, as when a translation process is initiated in real time while a system is connected to the Internet.
SUMMARY OF THE INVENTION
In view of the foregoing and other problems of the conventional methods and structures, an object of the present invention is to provide a machine translation method and system that together improve the accuracy of the selection of appropriate words, without incurring any deterioration of the processing efficiency.
It is another object of the present invention to provide a translation method and system that, only when a sentence for translation is selected by a user and without requiring the employment of a complicated process, automatically examines the definitions of words to select preferred words and can thus improve the accuracy of a translation.
It is an additional object of the present invention to provide a translation method and system that can translate words in consonance with context, without requiring a complicated process, such as a grammatical description process.
It is a further object of the present invention to provide a system that, for candidate words, accumulates preference information, which is obtained as a discourse dictionary during the translation of a document, and employs that information as a personal dictionary to automatically study the preferences of candidate words.
To achieve the above objects, during the translation of a document by using a compound word dictionary, elemental word information of an applied compound word is registered in a discourse dictionary, and to translate a word in the document that is not defined in the compound word dictionary, a plurality of dictionaries, including the discourse dictionary, are employed. More specifically, the following methods are provided.
When a compound word dictionary is employed to translate a document, the elemental word information of an employed compound word is registered in a discourse dictionary.
Elemental words, their candidate words, and preferences for these candidate words are described as the elemental word information for a compound word to be described in the discourse dictionary.
A candidate word selection method is provided whereby, to determine a translation for elemental words of a compound word to be described in a discourse dictionary, a candidate translation obtained from a single word dictionary for the elemental word is compared with a candidate translation for the compound word, and the candidate word that has the most nearly identical character string portion is selected. Further provided is a registration adequacy determination method whereby registration of a compound word in a discourse dictionary is canceled when the ratio of the identical character string portion in the candidate word does not exceed a threshold value.
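The candidate selection and registration adequacy check described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that the "identical character string portion" is the longest contiguous substring shared by the two translations, and that the ratio is that length divided by the length of the single word candidate. The function names, the default threshold of 0.5, and the English-to-Japanese sample data are all hypothetical.

```python
from difflib import SequenceMatcher


def shared_length(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b (assumed measure)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def select_candidate(single_word_candidates, compound_translation, threshold=0.5):
    """Pick the single word candidate that best matches the compound translation.

    Returns (candidate, ratio), or None when the best ratio does not exceed
    the threshold, in which case registration would be canceled.
    """
    best, best_ratio = None, 0.0
    for candidate in single_word_candidates:
        if not candidate:
            continue
        ratio = shared_length(candidate, compound_translation) / len(candidate)
        if ratio > best_ratio:
            best, best_ratio = candidate, ratio
    if best is None or best_ratio <= threshold:
        return None
    return best, best_ratio


# Hypothetical data: elemental word "system" inside the compound "operating system".
result = select_candidate(["体系", "システム", "制度"], "オペレーティングシステム")
```

Here the candidate "システム" appears in full inside the compound translation, so it is selected with a ratio of 1.0, while a list containing only unrelated candidates yields None and nothing would be registered.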
A numerical value, which is obtained by multiplying the word length of a compound word by the ratio of the identical character string portion of the candidate word to that of the compound word, is employed as a preference for the candidate word for the elemental word. When the same candidate word has been registered with the same headword, a new preference value is added to a preference value that has already been provided.
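A short sketch of the preference calculation and its accumulation is given below. The text does not state how word length is measured, so the example assumes it is the number of elemental words in the compound word, and it takes the identical-portion ratio as an input rather than fixing how it is computed. The nested dict layout (headword, then candidate word, then preference) and the function names are hypothetical.

```python
def preference(compound_word_length: int, identical_ratio: float) -> float:
    """Preference = word length of the compound word x identical-portion ratio."""
    return compound_word_length * identical_ratio


def register(discourse: dict, headword: str, candidate: str, value: float) -> None:
    """Register a candidate word; add to the existing preference if already present."""
    candidates = discourse.setdefault(headword, {})
    candidates[candidate] = candidates.get(candidate, 0.0) + value


# Hypothetical usage: the same candidate is seen in two different compound words.
discourse = {}
register(discourse, "system", "システム", preference(2, 1.0))  # adds 2.0
register(discourse, "system", "システム", preference(3, 0.5))  # adds 1.5
register(discourse, "system", "体系", preference(2, 0.5))      # adds 1.0
```

Because preferences accumulate, a candidate word that is confirmed by several compound words in the same document gradually outweighs one that was seen only once.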
When a specific sentence is to be translated, a discourse dictionary is referred to for a word for which a compound word dictionary cannot be employed, and if a headword exists, the registered candidate word to which the highest preference is given is selected.
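The lookup step can be illustrated with a few lines of code. The sketch assumes the discourse dictionary is held as a mapping from headword to candidate words and their preferences; that layout and the function name are hypothetical.

```python
def lookup(discourse: dict, headword: str):
    """Return the candidate word with the highest preference, or None if no headword exists."""
    candidates = discourse.get(headword)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical discourse dictionary built earlier in the same document.
discourse = {"system": {"システム": 3.5, "体系": 1.0}}
chosen = lookup(discourse, "system")
missing = lookup(discourse, "network")
```

When no headword exists, the function returns None so that the caller can fall back to another dictionary.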
A discourse dictionary including units of translated sentences, and a plurality of discourse dictionaries, which have been prepared for various sentences translated by a specific person, are merged to create an automatic learning personal dictionary.
A method for adjusting the learning function of a personal dictionary also is provided, whereby the preferences in an updated discourse dictionary are first selected to merge a plurality of discourse dictionaries.
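One possible reading of the merge and its learning adjustment is sketched below. The text says only that preferences in an updated discourse dictionary are given priority; scaling down the older preferences by a decay factor before adding the newer ones is an assumption made for this example, as are the parameter name decay, its default of 0.5, and the dict layout.

```python
def merge_into_personal(personal: dict, discourse: dict, decay: float = 0.5) -> dict:
    """Merge a newer discourse dictionary into a personal dictionary.

    Older preferences are multiplied by decay (assumed mechanism) so that
    the most recent discourse dictionary carries more weight.
    """
    merged = {
        headword: {cand: pref * decay for cand, pref in candidates.items()}
        for headword, candidates in personal.items()
    }
    for headword, candidates in discourse.items():
        target = merged.setdefault(headword, {})
        for cand, pref in candidates.items():
            target[cand] = target.get(cand, 0.0) + pref
    return merged


# Hypothetical data: an older preference for one candidate, a newer one for another.
personal = {"system": {"体系": 4.0}}
latest = {"system": {"システム": 3.0}}
merged = merge_into_personal(personal, latest)
```

With these values the older candidate drops from 4.0 to 2.0 while the newer one enters at 3.0, so the most recent document decides the preferred word. A decay of 1.0 would reduce the merge to plain addition.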
A dictionary employment method also is provided whereby, when a compound dictionary is not employed to translate a specific document, a discourse dictionary is referred to first, and then an automatic learning personal dictionary is referred to.
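The order of reference can be sketched as a simple chain over the two dictionaries. Both are assumed to map a headword to candidate words and their preferences; the function name and the sample entries are hypothetical.

```python
def translate_word(word: str, discourse: dict, personal: dict):
    """Consult the discourse dictionary first, then the personal dictionary."""
    for dictionary in (discourse, personal):
        candidates = dictionary.get(word)
        if candidates:
            # Within a dictionary, the candidate with the highest preference wins.
            return max(candidates, key=candidates.get)
    return None  # the caller would fall back to another dictionary


# Hypothetical entries: the two dictionaries disagree about "system".
discourse = {"system": {"システム": 2.0}}
personal = {"system": {"体系": 5.0}, "window": {"ウィンドウ": 1.0}}
first = translate_word("system", discourse, personal)
second = translate_word("window", discourse, personal)
```

The discourse dictionary takes precedence even when the personal dictionary holds a larger preference value, which keeps the translation consistent with the document currently being processed.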
Further, a method is provided whereby, to translate a specific docum

Profile ID: LFUS-PAI-O-2437900
