Reexamination Certificate
2011-06-21
2011-06-21
Dorvil, Richemond (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S001000, C704S002000, C704S006000, C704S008000, C704S235000
active
07966173
ABSTRACT:
A system and method for restoration of diacritics includes making classification decisions regarding an utterance in accordance with an aggregate of a plurality of information sources in a diacritization model for diacritic restoration. A best diacritic representation is determined for graphemes in the utterance based upon a best match with the diacritization model. A diacritically restored representation of the utterance is output.
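As a rough illustration of the idea in the abstract, the sketch below scores each candidate diacritic for a grapheme by summing weighted features drawn from several information sources, then keeps the highest-scoring candidate. Everything in it is an assumption made for illustration: the candidate set, the feature names, the hand-set weights, the greedy left-to-right decoding, and the Latin placeholder graphemes. It is not the patented model, whose details are not given in this record.

```python
# Hypothetical candidate labels; "" means "no diacritic on this grapheme".
CANDIDATES = ["", "a", "i", "u"]


def extract_features(graphemes, i, prev_label):
    """Collect features from several illustrative information sources."""
    cur = graphemes[i]
    prev_g = graphemes[i - 1] if i > 0 else "<s>"
    next_g = graphemes[i + 1] if i + 1 < len(graphemes) else "</s>"
    return [
        f"cur={cur}",                # identity of the current grapheme
        f"prev={prev_g}",            # left character context
        f"next={next_g}",            # right character context
        f"prev_label={prev_label}",  # diacritic chosen for the previous grapheme
    ]


def restore(graphemes, weights):
    """Greedily pick the best-scoring diacritic for each grapheme.

    `weights` maps (feature, candidate) pairs to floats; missing pairs count as 0.
    Ties are broken in favour of the earliest entry in CANDIDATES.
    """
    result = []
    prev_label = "<s>"
    for i, g in enumerate(graphemes):
        feats = extract_features(graphemes, i, prev_label)
        scores = {
            c: sum(weights.get((f, c), 0.0) for f in feats) for c in CANDIDATES
        }
        best = max(CANDIDATES, key=lambda c: scores[c])
        result.append((g, best))
        prev_label = best
    return result


# Hand-set toy weights, standing in for parameters a trained model would supply.
TOY_WEIGHTS = {
    ("cur=k", "a"): 1.0,
    ("cur=t", "a"): 0.5,
    ("prev_label=a", "a"): -1.0,
    ("cur=b", "u"): 2.0,
}

restored = restore(list("ktb"), TOY_WEIGHTS)
```

In a real system the weights would be learned from diacritized training text rather than written by hand, and the decoder would typically search over whole label sequences instead of committing greedily.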
REFERENCES:
patent: 6092034 (2000-07-01), McCarley et al.
patent: 6108627 (2000-08-01), Sabourin
patent: 6304841 (2001-10-01), Berger et al.
patent: 6411932 (2002-06-01), Molnar et al.
patent: 6442524 (2002-08-01), Ecker et al.
patent: 7136816 (2006-11-01), Strom
patent: 7590533 (2009-09-01), Hwang
patent: 7698125 (2010-04-01), Graehl et al.
patent: 2003/0023423 (2003-01-01), Yamada et al.
patent: 2005/0015237 (2005-01-01), Debili
patent: 2005/0154580 (2005-07-01), Horowitz et al.
patent: 2005/0192807 (2005-09-01), Emam et al.
patent: 2005/0234701 (2005-10-01), Graehl et al.
patent: 2006/0069545 (2006-03-01), Wu et al.
patent: 2006/0107200 (2006-05-01), Ching
patent: 2006/0129380 (2006-06-01), El-Shishiny
patent: 2006/0184351 (2006-08-01), Corston-Oliver et al.
Ananthakrishnan, Sankaranarayanan, Shrikanth S. Narayanan and Srinivas Bangalore (2005) Automatic Diacritization of Arabic Transcripts for Automatic Speech Recognition. In Proceedings of International Conference on Natural Language Processing, Kanpur, India.
Otakar Smrž and Petr Zemánek. 2002. Sherds from an Arabic Treebanking Mosaic. Prague Bulletin of Mathematical Linguistics, (78):63-76.
Mohamed Maamouri, Ann Bies, and Tim Buckwalter. 2004. The Penn Arabic Treebank: Building a large-scale annotated Arabic corpus. In NEMLAR Conference on Arabic Language Resources and Tools, Cairo, Egypt.
A. Elgammal and M. Ismail. A graph-based segmentation and feature extraction framework for Arabic text recognition. In Proc. of 6th Int. Conference on Document Analysis and Recognition, ICDAR 2001, pp. 622-626, Seattle, USA, Sep. 10-13, 2001.
Dimitra Vergyri and Katrin Kirchhoff. 2004. Automatic diacritization of Arabic for acoustic modeling in speech recognition. In COLING 2004 Workshop on Computational Approaches to Arabic Script-based Languages, Geneva, Switzerland.
Ouersighni R. 2001. A major offshoot of the DIINAR-MBC project: AraParse, a morphosyntactic analyzer for unvowelled Arabic texts. In Proceedings of the Arabic NLP Workshop at ACL/EACL.
Gal, Ya'akov, “An HMM Approach to Vowel Restoration in Arabic and Hebrew”, Proceedings of the Workshop on Computational Approaches to Semitic Languages, ACL, Jul. 2002, Philadelphia.
R. Mihalcea. 2002. Diacritics restoration: Learning from letters versus learning from words. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics CICLing 2002, LNCS 2276, pp. 339-348, 2002.
Grace Ngai, Dekai Wu, Marine Carpuat, Chi-Shing Wang, and Chi-Yung Wang. Semantic role labeling with boosting, SVMs, maximum entropy, SNOW, and decision lists. In Proceedings of Senseval-3, Third International Workshop on Evaluating Word Sense Disambiguation Systems, Barcelona, Jul. 2004. SIGLEX, Association for Computational Linguistics.
Tufis, D., and Chitu, A. Automatic diacritics insertion in Romanian texts. In Proceedings of the International Conference on Computational Lexicography Complex'99 (Pecs, Hungary, Jun. 1999).
David Chiang, Mona Diab, Nizar Habash, Owen Rambow, and Safiullah Shareef. 2006. Parsing Arabic Dialects. In Proceedings of EACL-Jan. 18, 2006.
Kirchhoff et al. “Novel Speech Recognition Models for Arabic” Johns-Hopkins University Summer Research Workshop 2002.
Mihalcea. “Diacritics Restoration: Learning from Letters versus Learning from Words” 2002 pp. 339-348.
Gupta et al. “Maximum Entropy Classification Applied to Speech” 2000.
Fung et al. “A Maximum-Entropy Chinese Parser Augmented by Transformation-Based Learning” Sep. 2004.
Maison et al. “Pronunciation Modeling for Names of Foreign Origin” 2003.
Berger. “The Improved Iterative Scaling Algorithm: A Gentle Introduction” 1997.
Lau et al. “Adaptive Language Modeling Using the Maximum Entropy Principle” 1993.
Chen. “Conditional and Joint Models for Grapheme-to-Phoneme Conversion” 2003.
Mihalcea et al. “Letter Level Learning for Language Independent Diacritics Restoration” 2002.
Pan et al. “Estimation of the joint probability of the multisensory signals” 2001.
Berger et al. “A Maximum Entropy Approach to Natural Language Processing” 1996.
Charniak. “A Maximum-Entropy-Inspired Parser” 2000.
R. Nelken et al. “Arabic Diacritization Using Weighted Finite-State Transducers”, Proc. ACL-05 Workshop on Computational Approaches to Semitic Languages, Ann Arbor, Michigan, 2005; pp. 79-86.
Ya'Akov Gal, “An HMM Approach to Vowel Restoration in Arabic and Hebrew”, Proc. ACL-02 Workshop on Computational Approaches to Semitic Languages, 2002; 7 pages.
El-Imam, “Phonetization of Arabic: rules and algorithms”, Computer Speech and Language, vol. 18, 2003 pp. 339-373.
Katrin Kirchhoff et al. “Cross-Dialectal Data Sharing for Acoustic Modeling in Arabic Speech Recognition”, SRI International, Menlo Park, CA; Jan. 6, 2005; pp. 1-23.
Simard, M., “Automatic Insertion of Accents in French Text,” Third Conference on Empirical Methods in Natural Language Processing, 1998, pp. 1-9.
Emam Ossama S.
Sarikaya Ruhi
Zitouni Imed
Borsetti Greg A
Dorvil Richemond
Nuance Communications Inc.
Wolf Greenfield & Sacks P.C.