Speech processing system

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate


Details

C704S243000

active

06801891

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an apparatus and method for decoding one or more sequences of sub-word units output by a speech recognition system into one or more representative words.
2. Description of the Related Art
Speech recognition systems are becoming increasingly popular as more processing power becomes available to perform the recognition operation. Most speech recognition systems can be classified as either small vocabulary systems or large vocabulary systems. In small vocabulary systems, the speech recognition engine usually compares the input speech to be recognized with acoustic patterns representative of the words known to the system. In large vocabulary systems, it is not practical to store a word model for each word known to the system. Instead, the reference patterns usually represent the phonemes of a given language. The input speech is compared with the phoneme patterns to generate one or more sequences of phonemes representative of the input speech. A word decoder is then used to identify possible words corresponding to the sequence or sequences of phonemes. Typically, the phoneme sequences are decoded into word sequences by comparing them, using a lexicon, with Hidden Markov Models representative of the words.
The present invention aims to provide an alternative technique for decoding the phoneme sequences output by the recognition engine into one or more corresponding words.
SUMMARY OF THE INVENTION
According to one aspect, the present invention provides an apparatus for identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the apparatus comprising means for receiving the recognized sequence of sub-word units representative of the one or more words to be recognized and for receiving a plurality of dictionary sub-word sequences each representative of one or more known words; means for comparing sub-word units of the recognized sequence with sub-word units of each dictionary sequence to provide a set of comparison results; and means for identifying the one or more words using the set of comparison results.
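The three steps in the summary above (receive the recognized and dictionary sequences, compare their sub-word units, identify words from the comparison results) can be illustrated with a minimal sketch. This is not the comparison method of the patent, which the text above does not specify. It assumes a plain Levenshtein edit distance over phoneme labels as the comparison measure, and the names `edit_distance`, `identify_words`, and the toy dictionary are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of sub-word units.

    Assumption: every insertion, deletion, and substitution costs 1.
    """
    prev = list(range(len(b) + 1))
    for i, unit_a in enumerate(a, start=1):
        curr = [i]
        for j, unit_b in enumerate(b, start=1):
            cost = 0 if unit_a == unit_b else 1
            curr.append(min(prev[j] + 1,           # delete unit_a
                            curr[j - 1] + 1,       # insert unit_b
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def identify_words(recognized: Sequence[str],
                   dictionary: Dict[str, List[str]]) -> List[Tuple[int, str]]:
    """Compare the recognized sequence with each dictionary sequence.

    Returns the set of comparison results as (distance, word) pairs,
    sorted so that the closest dictionary entry comes first.
    """
    results = [(edit_distance(recognized, units), word)
               for word, units in dictionary.items()]
    return sorted(results)


# Hypothetical dictionary mapping known words to phoneme sequences.
toy_dictionary = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}

# A recognized sequence containing one misrecognized vowel.
ranking = identify_words(["k", "ah", "t"], toy_dictionary)
best_distance, best_word = ranking[0]
```

Returning the full sorted list rather than a single word keeps the comparison results separate from the identification step, so a caller could apply a threshold or return several candidate words.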


