Method and apparatus for adapting the language model's size in a


Patent


: 1997-09-25
: 1999-05-04
: Hudspeth, David R.
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: 704257, 704248, 704238, G10L 900
: Patent
: active
: 058999736
: DESCRIPTION:

BRIEF SUMMARY
TECHNICAL FIELD

The present invention concerns speech recognition systems implemented in a digital computer, or speech recognition devices such as dictaphones or translation devices for telephone installations. In particular, the invention is directed to a mechanism for decreasing the size of the statistical language model in such speech recognition systems in order to reduce the resources, such as storage, needed to run them. The language model's size can also be adapted to the system environment conditions or to user-specific speech properties.

BACKGROUND OF THE INVENTION

In speech recognition systems based on a statistical language model approach rather than a knowledge-based one, for example the English speech recognition system TANGORA developed by F. Jelinek et al. at the IBM Thomas J. Watson Research Center in Yorktown Heights, USA, and published in Proceedings of IEEE 73(1985)11, pp. 1616-24, under the title "The development of an experimental discrete dictation recognizer", the recognition process can be subdivided into several steps. The tasks of these steps, depicted in FIG. 1 (from the article by K. Wothke, U. Bandara, J. Kempf, E. Keppel, K. Mohr, G. Walch (IBM Scientific Center Heidelberg), entitled "The SPRING Speech Recognition System for German", in Proceedings of Eurospeech 89, Paris 26.-28.IX.1989), are: processing of the speech signal by a signal processor; acoustic matching to find the words likely to produce the observed label sequence; and evaluation of word sequences in the language by means of a statistical language model.
The whole system can be implemented either on a digital computer, for example a personal computer (PC), or on a portable dictaphone or a telephone device. The speech signal is amplified and digitized, and the digitized data are then read into a buffer memory contained, for example, in the signal processor. From the resulting frequency spectrum a vector of a number of elements is taken, and the spectrum is adjusted to account for an ear model.
Each vector is compared with a number of (say 200) speaker-dependent prototype vectors. The identification number of the most similar prototype vector, called an acoustic label, is taken and sent to the subsequent processing stages. The speaker-dependent prototype vectors are generated from language-specific prototype vectors during a training phase in which the system is given a speech sample.
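The labeling step described above amounts to a nearest-prototype lookup. The following is a minimal sketch only: the text does not state which similarity measure is used, so Euclidean distance is an assumption, and the function name `acoustic_label` and the tiny two-dimensional prototypes are hypothetical.

```python
import math

def acoustic_label(vector, prototypes):
    """Return the index (acoustic label) of the prototype closest to `vector`.

    Euclidean distance is an assumed similarity measure; the source only
    says "most similar".
    """
    best_label, best_dist = -1, math.inf
    for label, proto in enumerate(prototypes):
        dist = math.dist(vector, proto)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical prototypes; a real system would hold about 200 of them.
prototypes = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
frames = [(0.1, -0.2), (3.5, 0.4), (0.9, 1.2)]

# One label per spectral vector forms the observed label sequence.
labels = [acoustic_label(f, prototypes) for f in frames]
```

The resulting list of labels is what the later matching stages would consume in place of the raw spectral vectors.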
The fast acoustic match determines, for every word of a reference vocabulary, the probability with which it would have produced the sequence of acoustic labels observed from the speech signal. The probability of a word is calculated until either the end of the word is reached or the probability drops below a pre-specified level. As reference units for determining this probability, the fast match uses a so-called phonetic transcription for each word in the reference vocabulary, including relevant pronunciation variants, and a hidden Markov model for each allophone used in the phonetic transcription. The phonetic transcriptions are generated by use of a set of phoneticization rules (l.c.).
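The early-termination rule of the fast match can be illustrated as follows. This is a simplified sketch under stated assumptions: each word is represented by a hypothetical list of per-position label distributions rather than by a real hidden Markov model, and the names `fast_match_score`, `floor_log_prob`, and the toy vocabulary are not taken from the source.

```python
import math

def fast_match_score(label_probs, labels, floor_log_prob):
    """Accumulate the log-probability of `labels` position by position.

    `label_probs[i]` maps an acoustic label to its probability at position i
    of the word (an assumed, strictly left-to-right stand-in for the word's
    Markov model). Returns None as soon as the running score drops below
    `floor_log_prob`, so unlikely words are abandoned early.
    """
    score = 0.0
    for dist, label in zip(label_probs, labels):
        p = dist.get(label, 1e-6)  # small floor for unseen labels (assumption)
        score += math.log(p)
        if score < floor_log_prob:
            return None  # pruned before the end of the word is reached
    return score

# Hypothetical two-word vocabulary with two positions per word.
vocabulary = {
    "yes": [{0: 0.9, 1: 0.1}, {2: 0.8, 0: 0.2}],
    "no": [{1: 0.9, 0: 0.1}, {1: 0.7, 2: 0.3}],
}
observed = [0, 2]

candidates = {}
for word, model in vocabulary.items():
    s = fast_match_score(model, observed, floor_log_prob=math.log(0.05))
    if s is not None:
        candidates[word] = s
```

With these toy numbers only "yes" survives; "no" is dropped at the second label, when its running probability falls below the chosen level of 0.05.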
The hidden Markov model of an allophone describes the probability with which a substring of the sequence of acoustic labels corresponds to the allophone. The Markov models are language specific and the output and transition probabilities are trained to individual speakers. The Markov model of the phonetic transcription of a word is the chain of the Markov models of its allophones.
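Chaining allophone models into a word model, and scoring a label sequence against the chain, can be sketched with the standard forward algorithm. The model structure here is an assumption made for brevity: each state is a hypothetical pair of a self-loop probability and a label emission table, states are strictly left-to-right, and the source does not specify the topology of the actual models.

```python
def chain(allophone_models):
    """Concatenate per-allophone state lists into one word-level model."""
    return [state for model in allophone_models for state in model]

def forward_probability(states, labels):
    """Probability that the left-to-right model emits `labels` and ends in
    its last state. Each state is (self_loop_prob, {label: emission_prob});
    the probability of advancing to the next state is 1 - self_loop_prob.
    """
    if not labels:
        return 0.0
    n = len(states)
    alpha = [0.0] * n
    alpha[0] = states[0][1].get(labels[0], 0.0)  # must start in the first state
    for label in labels[1:]:
        new = [0.0] * n
        for j, (stay, emit) in enumerate(states):
            reach = alpha[j] * stay  # remained in state j
            if j > 0:
                reach += alpha[j - 1] * (1.0 - states[j - 1][0])  # advanced from j-1
            new[j] = reach * emit.get(label, 0.0)
        alpha = new
    return alpha[-1]

# Two hypothetical single-state allophone models.
allophone_a = [(0.5, {0: 1.0})]
allophone_b = [(0.5, {1: 1.0})]
word_model = chain([allophone_a, allophone_b])

p = forward_probability(word_model, [0, 0, 1])
```

Because the word model is just the concatenated state list, adding a pronunciation variant would only require chaining a different sequence of allophone models.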
The statistical language model is one of the most essential parts of a speech recognizer. It is complementary to the acoustic model in that it supplies additional language-based information to the system in order to resolve the uncertainty associated with the word hypothesis proposed by the acoustic side. In practice, the acoustic side proposes a set of possible word candidates with the probabilities being attached to each candidate. The language model, on the other hand, predicts the possible candidates with corresponding probabilities. The system applies maximum likelihood techniques to find the most probable candidate out of these two sets of
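How the acoustic and language-model candidate sets might be combined can be sketched as below. The text does not spell out the combination rule, so multiplying the two probabilities (adding their logarithms) and taking the maximum is an assumption, as are the function name `best_candidate` and the example words and numbers.

```python
import math

def best_candidate(acoustic, language):
    """Pick the word that maximizes the product of its acoustic and
    language-model probabilities (computed as a sum of logs).

    Only words proposed by both sides are considered (an assumption).
    """
    shared = acoustic.keys() & language.keys()
    return max(shared, key=lambda w: math.log(acoustic[w]) + math.log(language[w]))

# Hypothetical candidate sets for one word position.
acoustic = {"write": 0.40, "right": 0.45, "rite": 0.15}
language = {"write": 0.30, "right": 0.20, "rite": 0.01}

choice = best_candidate(acoustic, language)
```

In this toy case the acoustic side slightly prefers "right", but the language model tips the combined score toward "write", which is the kind of ambiguity resolution the paragraph describes.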

REFERENCES:
patent: 5072452 (1991-12-01), Brown et al.
patent: 5127043 (1992-06-01), Hunt et al.
patent: 5444617 (1995-08-01), Merialdo
patent: 5680511 (1997-10-01), Baker et al.
patent: 5710866 (1998-01-01), Alleva et al.

Affiliated with

Bandara Upali (Inventor)
Kunzmann Siegfried (Inventor)
Lewis Burn L. (Inventor)
Mohr Karlheinz (Inventor)

Also associated with

Hudspeth David R. (Examiner)
International Business Machines - Corporation (Corporate Assignee)
Tassinari, Jr. Robert P. (Representative)
Wieland Susan (Examiner)
