Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition
Reexamination Certificate
1998-11-03
2001-08-14
{haeck over (S)}mits, T{overscore (a)}livaldis Ivars (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S255000
Reexamination Certificate
active
06275801
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to speech recognition, and, more particularly, to a method and system for improving the execution speed of an acoustic fast match.
2. Discussion of the Prior Art
In a speech recognition system using hidden Markov models, several thousand computations are required to compare each segment of acoustics against a pre-stored word model. Carrying out these computations for all words in the stored vocabulary is prohibitive if the goal is large vocabulary real-time speech recognition on a modest amount of hardware. In such a system, a matching algorithm is required that can rapidly identify a small set of candidate words from the whole acoustic vocabulary for further evaluation in a particular time region of the decoded utterance. In speech recognition systems based on an asynchronous stack search, the acoustic fast match provides the desired rapid identification capability. The acoustic fast match represents one of the three major functional components of a speech recognition system, the other two being the detailed match and the language model.
Conventional approaches to the implementation of the acoustic fast match can be divided into two major groups, the synchronous search and the asynchronous search. Synchronous searches suffer from several disadvantages. First, all the active word models have to be stored in memory, and thus memory requirements can be prohibitive in large vocabulary systems. Second, the estimation of word beginning probabilities requires the search to be performed in the backward direction, which significantly limits the use of this method in real time applications. For a discussion of this type of approach, see Austin, S., Schwartz, et. al, “The Forward-Backward Search Algorithm”, ICASSP91, Toronto, Canada, pp. 697-700 (1991).
In the asynchronous search, for a given time region of a speakers utterance, a search is performed by computing the total acoustic score for each word in the acoustic vocabulary, one word at a time. Each word in the acoustic vocabulary is represented by its phonetic sequence (i.e. deal=“d”−“eh”−“1”). To obtain the acoustic score of a particular word from the vocabulary, the acoustic scores of all of the individual phones that collectively define that word are computed and then combined into a single wordscore. To reduce the amount of computation, the phonetic sequences that define each word in the acoustic vocabulary are organized into a tree structure.
In addition to constructing an acoustic vocabulary tree structure, further computational savings may be realized by performing a pruning algorithm. Pruning operates by recognizing that when the tree is traversed to compute word-scores, the candidate words of interest will generate the highest word scores. More particularly, in an asynchronous search, a search algorithm traverses the tree structure from a root node along a nodal path where each node in the path represents a constituent phone of the word to be scored, if the computation of a partial word score at a particular node results in a value that falls below either some absolute threshold or is low when compared to other nodes, it is apparent at that time that all words derived from this node will be low, and as a consequence, the whole subtree can be ignored. This process is called pruning.
Despite the advantages achieved by utilizing a acoustic vocabulary tree structure along with a pruning algorithm, when the acoustic vocabulary becomes very large (e.g. more than 60,000 words) the time spent in the fast match algorithm can become very significant. Generally, the efficency of the algorithm is reduced in proportion to the increased vocabulary size since the fast match complexity is directly proportional to the number of words in the acoustic vocabulary. It is therefore desirable to devise an improved fast match algorithm that eliminates or significantly reduces the effects of increased vocabulary size on the algorithm's efficiency.
SUMMARY OF THE INVENTION
The problems stated above and the related problems of the prior art are solved with the method and system according to the present invention. In a speech recognition system, a method is provided that eliminates significantly reduces the effect of increased vocabulary size on the execution speed of the fast match algorithm. In particular, the existing asynchronous tree search based fast match algorithm is enhanced by an improved pruning algorithm.
In one aspect of the invention, a method for eliminating the effect of increased vocabulary size on the speed of the fast match comprises the steps of computing an a-priori probability of occurrence for each word from an acoustic vocabulary; deriving a penalty score for each word from said acoustic vocabulary based on each words a-priori probability of occurrence in an input text; analyzing said input text to: compute a path score for each word from said input text; and combine the computed path score with the derived penalty score to form a combined score and testing the combined score against a threshold to determine top ranking candidate words to be later processed by the detailed match.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative thereof, which is to be read in connection with the accompanying drawings.
REFERENCES:
patent: 4718094 (1988-01-01), Bahl et al.
patent: 5729656 (1998-03-01), Nahamoo et al.
patent: 6058363 (2000-05-01), Ramalingam
Bahl et al., “A Fast Approximate Acoustic Match for Large Vocabulary Speech Recognition,” IEEE Transactions on Speech and Audio Processing, vol. 1, No. 1, Jan. 1993.
Novak Miroslav
Picheny Michael
F. Chau & Associates LLP
International Business Machines - Corporation
{haeck over (S)}mits T{overscore (a)}livaldis Ivars
LandOfFree
Non-leaf node penalty score assignment system and method for... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Non-leaf node penalty score assignment system and method for..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Non-leaf node penalty score assignment system and method for... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2461930