Reexamination Certificate
2007-03-13
2007-03-13
Dorvil, Richemond (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Translation machine
C704S009000
Reexamination Certificate
active
10173252
ABSTRACT:
A parallel bilingual training corpus is parsed into its content words. Word association scores are computed for each pair of content words consisting of a word of language L1 that occurs in a sentence aligned in the bilingual corpus to a sentence of language L2 in which the other word occurs. A pair of words is considered “linked” in a pair of aligned sentences if one of the words is the most highly associated, of all the words in its sentence, with the other word. The occurrence of compounds is hypothesized in the training data by identifying maximal, connected sets of linked words in each pair of aligned sentences in the processed and scored training data. Whenever one of these maximal, connected sets contains more than one word in either or both of the languages, the subset of the words in that language is hypothesized as a compound.
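The linking and compound-hypothesis steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`find_links`, `hypothesize_compounds`), the toy sentence pair, and the score values are all hypothetical. It also assumes that association scores are already available as a dictionary keyed by (L1 word, L2 word), that pairs missing from the dictionary score zero, that a zero score never produces a link, and that both sentences are non-empty with no repeated words. The abstract does not say how the scores themselves are computed, so that step is left out.

```python
from collections import defaultdict


def find_links(src, tgt, score):
    """Link (s, t) when one word is the other's best-scoring partner.

    `score` maps (src_word, tgt_word) to an association score; missing
    pairs count as 0.0 and never produce a link (an assumption here).
    """
    links = set()
    for s in src:  # best L2 partner for each L1 word
        best = max(tgt, key=lambda t: score.get((s, t), 0.0))
        if score.get((s, best), 0.0) > 0:
            links.add((s, best))
    for t in tgt:  # best L1 partner for each L2 word
        best = max(src, key=lambda s: score.get((s, t), 0.0))
        if score.get((best, t), 0.0) > 0:
            links.add((best, t))
    return links


def hypothesize_compounds(src, tgt, score):
    """Return (side, words) for every connected set of linked words that
    holds more than one word on the same side of the sentence pair."""
    graph = defaultdict(set)
    for s, t in find_links(src, tgt, score):
        graph[("src", s)].add(("tgt", t))
        graph[("tgt", t)].add(("src", s))

    seen, compounds = set(), []
    for start in graph:
        if start in seen:
            continue
        # Depth-first search collects one maximal connected set.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        # Keep sentence order when reporting the words on each side.
        src_side = tuple(w for w in src if ("src", w) in component)
        tgt_side = tuple(w for w in tgt if ("tgt", w) in component)
        if len(src_side) > 1:
            compounds.append(("src", src_side))
        if len(tgt_side) > 1:
            compounds.append(("tgt", tgt_side))
    return compounds


# Toy aligned sentence pair with made-up scores (illustrative only).
src_sentence = ["ice", "cream", "melts"]
tgt_sentence = ["glace", "fond"]
toy_scores = {
    ("ice", "glace"): 5.0,
    ("cream", "glace"): 4.0,
    ("melts", "fond"): 6.0,
    ("ice", "fond"): 0.5,
}
result = hypothesize_compounds(src_sentence, tgt_sentence, toy_scores)
print(result)  # [('src', ('ice', 'cream'))]
```

In this toy pair, "ice" and "cream" are both linked to "glace", so the three words form one connected set. Because that set has two words on the L1 side, "ice cream" is hypothesized as a compound. "melts" and "fond" form a one-to-one set and yield nothing.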
REFERENCES:
patent: 5477451 (1995-12-01), Brown et al.
patent: 5510981 (1996-04-01), Berger et al.
patent: 5907821 (1999-05-01), Kaji et al.
patent: 6236958 (2001-05-01), Lange et al.
patent: 6885985 (2005-04-01), Hull
patent: 2002/0198701 (2002-12-01), Moore
patent: 2003/0023422 (2003-01-01), Menezes et al.
patent: 2003/0061023 (2003-03-01), Menezes et al.
patent: 2004/0098247 (2004-05-01), Moore
patent: 2004/0172235 (2004-09-01), Pinkham
I. Dan Melamed, “Automatic Construction of Clean Broad-Coverage Translation Lexicons,” 2nd Conference of the Association for Machine Translation in the Americas (AMTA '96), 10 pages (1996).
I. Dan Melamed, “Automatic Discovery of Non-Compositional Compounds in Parallel Data,” 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP '97), 12 pages (1997).
K. Yamamoto et al., “A Comparative Study on Translation Units for Bilingual Lexicon Extraction,” In Proceedings of the Workshop on Data-Driven Machine Translation, 39th Annual Meeting of the Association for Computational Linguistics, pp. 87-94 (2001).
F. Smadja et al., “Translating Collocations for Bilingual Lexicons: A Statistical Approach,” Computational Linguistics, 22(1): 1-38 (1996).
J. Kupiec, “An Algorithm for Finding Noun Phrase Correspondences in Bilingual Corpora,” In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pp. 17-22 (1993).
Y. Al-Onaizan and K. Knight. 2002. Named entity translation: extended abstract. In advance papers of Human Language Technology 2002, San Diego, CA. pp. 111-115.
N. Chinchor. 1997. MUC-7 named entity task definition. In Proceedings of the 7th Message Understanding Conference.
I. Dagan and K. Church. 1997. Termight: coordinating humans and machines in bilingual terminology acquisition. Machine Translation, 12:89-107.
T. Dunning. 1993. Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1):61-74.
J. Kupiec. 1993. An algorithm for finding noun phrase correspondences in bilingual corpora. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, pp. 17-22.
I.D. Melamed. 2000. Models of Translational Equivalence. Computational Linguistics, 26(2): 221-249.
R. C. Moore. 2001. Towards a simple and accurate statistical approach to learning translation relationships among words. In Proceedings of the Workshop on Data-Driven Machine Translation, 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, pp. 79-86.
S. Richardson, W.B. Dolan, M. Corston-Oliver and A. Menezes. 2001. Overcoming the customization bottleneck using example-based MT. In Proceedings of the Workshop on Data-Driven Machine Translation, 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, pp. 9-16.
D. Wu. 1995. Grammarless extraction of phrasal translation examples from parallel texts. In Proceedings of TMI-95, Sixth International Conference on Theoretical and Methodological Issues in Machine Translation, Leuven, Belgium, vol. 2, pp. 354-372.
I. D. Melamed. 1995. Automatic evaluation and uniform filter cascades for inducing N-Best translation lexicons. In Proceedings of the Third Workshop on Very Large Corpora, pp. 184-198, Cambridge, MA.
A. Kumano and H. Hirakawa. 1994. Building an MT dictionary from parallel texts based on linguistic and statistical information. In Proceedings of the 15th International Conference on Computational Linguistics, pp. 76-81, Kyoto, Japan.
W. Gale and K. Church. 1991. Identifying word correspondences in parallel texts. In Proceedings Speech and Natural Language Workshop. pp. 152-157, Asilomar, CA. DARPA.
P. Fung. 1995. A pattern matching method for finding noun and proper noun translations from noisy parallel corpora. In Proceedings of the 33rd Annual Meeting, pp. 236-243, Boston, MA. Association for Computational Linguistics.
D. Wu and X. Xia. 1994. Learning an English-Chinese lexicon from a parallel corpus. In Proceedings of the 1st Conference of the Association for Machine Translation in the Americas, pp. 206-213, Columbia, MD.
Lars Ahrenberg et al., “A Simple Hybrid Aligner for Generating Lexical Correspondences in Parallel Texts,” Proceedings of COLING '98/ACL '98.
Dorvil Richemond
Saint-Cyr Leonard
Westman Champlin & Kelly
Statistical method and apparatus for learning translation...