Word phrase translation using a phrase index

Data processing: speech signal processing – linguistics – language – Linguistics – Translation machine

Reexamination Certificate

Details

C704S007000

active

06473729

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention generally relates to translating expressions from one natural language into another natural language, and in particular to assisting a translator in finding the right translation for any phrase.
2. Description of the Related Art
Any translator is evaluated according to two criteria: translation speed and translation quality. One difficulty affecting both of these criteria is the appearance of a word or group of words which makes the translator hesitate. Finding the suitable translation may lead to a time-consuming manual search, with no guarantee of the result.
To date, several techniques have been developed for assisting a translator. One of these techniques involves the use of contextual dictionary look-up. Contextual dictionaries provide the translation of a word according to its context. This technique is strongly limited in its coverage: by looking up a contextual dictionary, the translator is provided with only a small number of proposed translations.
Further, multi-lingual terminology databases exist which are based on translations of pre-accepted terms. This technique is strongly restricted to the prestored set of terms, and the translator is not assisted in translating expressions which are not part of the set of pre-accepted terms.
A further technique is based on the use of translation memory which stores already translated sentences. When a sentence has to be translated, the system queries the database and automatically proposes a translation. However, this system requires matching complete sentences, even if the matching can be fuzzy, so that this technique is again strongly restricted in its applicability.
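The sentence-level restriction of translation memory can be illustrated with a minimal sketch. It assumes a toy in-memory store (the hypothetical `MEMORY` dictionary and `tm_lookup` function are not from the patent) and uses the character similarity ratio from Python's standard `difflib` module as a stand-in for whatever fuzzy matching a real translation memory applies.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: stored source sentence -> stored translation.
MEMORY = {
    "Open the paper tray.": "Ouvrez le bac à papier.",
    "Replace the print head if it is damaged.":
        "Remplacez la tête d'impression si elle est endommagée.",
}

def tm_lookup(sentence, threshold=0.8):
    """Return the translation of the most similar stored sentence,
    or None if no stored sentence reaches the similarity threshold."""
    best_score, best_translation = 0.0, None
    for stored, translation in MEMORY.items():
        score = SequenceMatcher(None, sentence.lower(), stored.lower()).ratio()
        if score > best_score:
            best_score, best_translation = score, translation
    return best_translation if best_score >= threshold else None
```

Under these assumptions a near-complete sentence such as "Open the paper tray" finds a match, whereas the bare phrase "paper tray" falls below the threshold and yields nothing, which is the kind of gap the related-art discussion points to.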
Another translation technique has been proposed by M. Nagao, “A Framework of a Mechanical Translation between Japanese and English by Analogy Principle”, Artificial and Human Intelligence (A. Elithorn and R. Banerji, eds.), Elsevier Science Publishers, 1984, pp. 173-180. This technique involves aligning and linguistically parsing sentences for machine translation. The parse trees from each pair of sentences are also aligned. One drawback of this technique is that such machine translation systems require performing an overall parse of the translated sentences. Another drawback is that subtrees need to be aligned, resulting in considerable computational overhead.
SUMMARY OF THE INVENTION
The present invention has been made in consideration of the above situation and has as its primary object to assist a translator in achieving an improved quality of the resulting document.
It is another object of the present invention to contribute to a controlled translation that avoids expensive manual searches for unknown expressions, thereby providing functionality in addition to that of using translation memory and terminology databases.
It is still another object of the present invention to provide the translator with an easy-to-use, efficient and reliable tool which is capable of promptly replying to the translator's request for assistance.
A further object of the present invention is to be compatible with existing technology and software tools.
These and other objects of the present invention will become apparent hereinafter.
To achieve these objects, according to a first aspect, the invention provides a method for translating a word phrase from a first natural language to a second natural language. The word phrase is a group of two or more associated words. The method comprises the steps of inputting a text written in the first language; extracting a word phrase from said text; and querying a database for the extracted word phrase using a phrase index of said database. The phrase index indexes text fragments by word phrases. The text fragments represent a primary grammatical unit including at least one clause. The database contains pairs of text fragments, with each pair including a text fragment in the first language and a corresponding text fragment in the second language. A translation of said extracted phrase is then obtained based on one of the pairs of text fragments revealed during the step of querying the database.
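The steps of the first aspect can be sketched roughly as follows. This is an illustrative toy, not the patented implementation: the sample sentence pairs, the `PAIRS` and `PHRASE_INDEX` structures, and the functions `extract_phrases` and `query` are all hypothetical, and phrase extraction is reduced to matching against phrases already present in the index.

```python
# Hypothetical database of text fragment pairs: (first language, second language).
PAIRS = [
    ("The print head must be cleaned weekly.",
     "La tête d'impression doit être nettoyée chaque semaine."),
    ("Replace the print head if it is damaged.",
     "Remplacez la tête d'impression si elle est endommagée."),
    ("Open the paper tray.", "Ouvrez le bac à papier."),
]

# Hypothetical phrase index: word phrase -> positions of the pairs
# whose first-language fragment contains that phrase.
PHRASE_INDEX = {
    "print head": [0, 1],
    "paper tray": [2],
}

def extract_phrases(text, known_phrases):
    """Naive stand-in for a phrase extractor: keep the known phrases found in text."""
    lowered = text.lower()
    return [phrase for phrase in known_phrases if phrase in lowered]

def query(phrase):
    """Use the phrase index to retrieve the fragment pairs holding the phrase."""
    return [PAIRS[position] for position in PHRASE_INDEX.get(phrase, [])]
```

For an input text such as "Check the print head before use.", `extract_phrases` would yield the phrase "print head", and `query("print head")` would return the two sentence pairs from which a translation of the phrase can be read off.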
According to a second aspect of the present invention, there is provided a computer-readable storage medium storing instructions for translating a word phrase from a first natural language to a second natural language by performing the steps according to the first aspect.
According to a third aspect of the present invention, there is provided a system for translating an input text from a natural source language to a natural target language. The system comprises storage means for storing a database containing a plurality of pairs of text fragments. The text fragments represent a primary grammatical unit including at least one clause. Each pair includes a text fragment in the source language and a corresponding text fragment in the target language. Each text fragment contains at least one word phrase. The word phrase is a group of two or more associated words. The system further comprises a phrase extractor for extracting a word phrase from a text fragment of said input text, and database retrieval means for retrieving, from said database, pairs of text fragments that contain the extracted word phrase, using a phrase index of said database. The phrase index indexes text fragments by word phrases. The system furthermore comprises user interface means for allowing a user to select one of said retrieved pairs of text fragments to obtain a translation of the extracted word phrase.
According to a fourth aspect, the invention provides a method for generating a text fragment database for use in translating a word phrase from a first natural language into a second natural language. The word phrase is a group of two or more associated words. The method comprises the steps of inputting a first document containing a text written in the first language; inputting a second document containing said text written in the second language; aligning corresponding text fragments of the first and second documents; extracting word phrases from the text fragments of the first document; and generating index information on the extracted word phrases and the aligned text fragments holding the word phrases. The text fragments represent a primary grammatical unit including at least one clause.
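A minimal sketch of the database generation steps might look like the following. It rests on strong simplifying assumptions that are not taken from the patent: sentences are split on end punctuation, the two documents are assumed to align one-to-one in order, and every pair of adjacent words stands in for a linguistically extracted word phrase. All function names are hypothetical.

```python
import re
from collections import defaultdict

def split_sentences(text):
    """Split a document into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

def align(first_doc, second_doc):
    """Assumption: sentence n of the first document corresponds to sentence n of the second."""
    first, second = split_sentences(first_doc), split_sentences(second_doc)
    if len(first) != len(second):
        raise ValueError("documents do not align one-to-one")
    return list(zip(first, second))

def extract_word_phrases(fragment):
    """Stand-in extractor: every pair of adjacent words, lowercased."""
    words = re.findall(r"[a-z]+", fragment.lower())
    return {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}

def build_index(pairs):
    """Map each extracted phrase to the positions of the aligned pairs holding it."""
    index = defaultdict(list)
    for position, (first_fragment, _) in enumerate(pairs):
        for phrase in sorted(extract_word_phrases(first_fragment)):
            index[phrase].append(position)
    return dict(index)
```

A real system would replace the adjacent-word extractor with a linguistic one and would need an alignment method that tolerates merged or split sentences.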
According to another aspect of the present invention, in the methods and systems according to the first to fourth aspects, the word phrases preferably are noun phrases. Alternatively, the word phrases may also be verb phrases. In another alternative, the word phrases may be predicates involving at least one verb and one noun or adjective used as a noun.
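As a rough illustration of extracting noun phrases, the sketch below assumes the text has already been part-of-speech tagged by some other component (the tag names `ADJ` and `NOUN` and the function `noun_phrases` are assumptions, not from the patent). It collects maximal runs of adjectives and nouns that end in a noun and span at least two words, matching the requirement that a word phrase is a group of two or more associated words.

```python
def noun_phrases(tagged):
    """tagged: list of (word, tag) tuples. Returns runs of ADJ/NOUN words
    that end in a NOUN and contain at least two words."""
    phrases = []
    run = []
    # A sentinel with a non-matching tag flushes the final run.
    for word, tag in tagged + [("", "END")]:
        if tag in ("ADJ", "NOUN"):
            run.append((word, tag))
            continue
        # The run has ended: drop trailing adjectives so it ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if len(run) >= 2:
            phrases.append(" ".join(w for w, _ in run))
        run = []
    return phrases
```

Verb phrases or predicates could be handled with analogous patterns over the tag sequence.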
According to still another aspect of the present invention, the primary grammatical units are sentences.
It is still another aspect of the present invention that, once pairs of text fragments have been retrieved from the database, these retrieved pairs of text fragments are presented to the translator. Alternatively, the translator is provided with proposed translations of the extracted word phrase, based on the retrieved pairs of text fragments. In either case, the translator approves a translation, and the approved translation is then used as translation of the extracted word phrase.
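The presentation and approval step could be sketched as below, under the assumption that retrieved pairs are shown as a numbered list and the translator answers with a number. The functions `format_proposals` and `approve` and the `approved` dictionary are hypothetical helpers for illustration only.

```python
def format_proposals(pairs):
    """Render retrieved fragment pairs as numbered lines for display."""
    return [f"{n}. {src} | {tgt}" for n, (src, tgt) in enumerate(pairs, start=1)]

def approve(phrase, proposed_translations, choice, approved):
    """Record the translation the translator picked (1-based choice)
    as the approved translation of the extracted phrase."""
    if not 1 <= choice <= len(proposed_translations):
        raise ValueError("choice out of range")
    approved[phrase] = proposed_translations[choice - 1]
    return approved[phrase]
```

Keeping approved translations in a dictionary is one simple way to reuse the same rendering when the phrase occurs again in the text.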
According to still another aspect of the invention, in the systems and methods according to the above aspects, the step of querying the database for the extracted phrase includes the step of querying the database for sub-phrases, i.e. for all word phrases partly matching the extracted phrase.
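One possible reading of sub-phrase querying is sketched here: every contiguous sub-sequence of at least two words of the extracted phrase is looked up in the phrase index, longest first. This interpretation of "partly matching", as well as the names `sub_phrases` and `query_with_sub_phrases`, is an assumption made for illustration.

```python
def sub_phrases(phrase, min_words=2):
    """All contiguous word sub-sequences of the phrase with at least
    min_words words, ordered from longest to shortest."""
    words = phrase.split()
    n = len(words)
    result = []
    for length in range(n, min_words - 1, -1):
        for start in range(n - length + 1):
            result.append(" ".join(words[start:start + length]))
    return result

def query_with_sub_phrases(phrase, index):
    """Look up the phrase and each of its sub-phrases in the phrase index."""
    return {candidate: index[candidate]
            for candidate in sub_phrases(phrase) if candidate in index}
```

With an index that only knows "print head", a query for "laser print head" would still surface the fragments indexed under "print head".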
Finally, the present invention according to any of the above aspects may involve the step of obtaining a translation by querying a terminology base in addition to the phrase-indexed text fragment database.
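Combining the two sources could be as simple as the following sketch, which assumes the terminology base is a plain mapping from term to translation and the phrase index maps phrases to positions in a list of fragment pairs. The function name `translate_phrase` and the shape of its result are hypothetical.

```python
def translate_phrase(phrase, term_base, phrase_index, pairs):
    """Query the terminology base and the phrase-indexed fragment
    database for the same phrase and return both results."""
    key = phrase.lower()
    return {
        "term_translation": term_base.get(key),  # None if the term is not pre-accepted
        "example_pairs": [pairs[i] for i in phrase_index.get(key, [])],
    }
```

The translator then sees a pre-accepted term, when one exists, alongside real fragments that show the phrase in context.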
By using the approach of the present invention, the database is phrase-indexed. Extracted word phrases directly index whole text fragments. In preferred embodiments, the noun phrases are used to index a sentence database. The extracted noun phrases directly
