Method of adapting linguistic speech models

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S243000, C704S244000, C704S251000

active

06499011

ABSTRACT:

FIELD OF THE INVENTION
The invention relates to a method of adapting linguistic speech models in automatic speech recognition systems by means of speech recognition results obtained during operation of the systems, in which, during the adaptation, a list of N-best recognition result alternatives with N>1 for a speech utterance to be recognized is evaluated.
Such on-line adaptation of speech models is particularly required in dialogue systems using automatic speech recognition. Dialogue systems of this kind make possible, for example, speech-controlled database queries. Examples are train timetable information systems, telephone information systems, airport information systems and information systems for bank clients.
Speech recognition is performed by means of stochastic models. Both acoustic models based on hidden Markov models (HMMs) and linguistic speech models, which represent probability values for the occurrence of speech elements of a semantic and syntactic nature, are used. In dialogue systems in particular, there is often not enough training material available to train the linguistic speech models before the system is put into operation. For this reason, it is desirable to provide an on-line adaptation in dialogue systems, in which the speech recognition results obtained during operation are used to further improve the linguistic speech model or to adapt it to the relevant fields of application. Such an adaptation is termed unsupervised because only the speech recognition result found is available to the speech recognizer, rather than reliable information about the speech utterance actually provided.
DESCRIPTION OF PRIOR ART
It is known from S. Homma et al, “Improved Estimation of Supervision in Unsupervised Speaker Adaptation”, ICASSP 1997, pp. 1023-1026, that, in unsupervised on-line adaptation of linguistic speech models, the best recognition result alternative from a list of N-best recognition result alternatives defined for a speech utterance, i.e. the one having the greatest probability, is used for the adaptation only when the difference between this probability and the probability of the second-best recognition result alternative exceeds a predetermined threshold value.
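The threshold rule described for the prior art can be illustrated with a minimal sketch. The function name `select_for_adaptation`, the representation of the N-best list as (hypothesis, probability) pairs, and the threshold values are assumptions made for illustration; they are not taken from the cited paper.

```python
from typing import Optional, Sequence, Tuple


def select_for_adaptation(
    nbest: Sequence[Tuple[str, float]], threshold: float
) -> Optional[str]:
    """Return the best hypothesis only if it leads the runner-up by more
    than `threshold`; otherwise return None (no adaptation).

    `nbest` holds (hypothesis, probability) pairs in any order.
    """
    if len(nbest) < 2:
        return None  # the margin is undefined with fewer than two alternatives
    ranked = sorted(nbest, key=lambda pair: pair[1], reverse=True)
    (best_text, best_p), (_, second_p) = ranked[0], ranked[1]
    return best_text if best_p - second_p > threshold else None
```

Under this rule, an utterance whose two best alternatives have similar probabilities contributes nothing to the adaptation, and all alternatives other than the best are always discarded.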
SUMMARY OF THE INVENTION
It is an object of the invention to improve the on-line adaptation of the linguistic speech models.
This object is achieved in that a combination of a plurality of recognition result alternatives of the list is included in the adaptation.
This has the advantage that a compensation is created for those cases where the element of the list of N-best recognition result alternatives evaluated as the best recognition result alternative does not correspond to the speech utterance actually provided. The actual utterance will, as a rule, be represented by at least one other recognition result alternative of the list. By combining a plurality of recognition result alternatives of the list in accordance with the invention, an error-reducing compensation is achieved in such cases, which ultimately leads to an improved on-line adaptation of the linguistic speech model.
Particularly when recognizing sequences of single speech elements which are combined into a speech utterance, the invention benefits from the fact that single speech elements of the actual speech utterance may not be represented in the best recognition result alternative but are, with great probability, represented in at least one of the other elements of the list of N-best recognition result alternatives. In on-line adaptation, such parts of recognition result alternatives are not ignored but taken into account with a given weight. Furthermore, in cases where the best recognition result alternative of the list contains speech elements that were not part of the actual speech utterance, it is very probable that such speech elements are not represented in the other list elements. Here, too, taking further list elements into account compensates for an error which would occur if only the best list element were taken into account.
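The compensation effect can be made concrete with a small sketch. This section does not specify how the weighted alternatives enter the linguistic speech model, so the use of fractional unigram counts here is an assumption, as are the function name `accumulate_weighted_counts`, the example sentences and the example weights.

```python
from typing import Dict, List, Sequence, Tuple


def accumulate_weighted_counts(
    nbest: Sequence[Tuple[List[str], float]],
    counts: Dict[str, float],
) -> Dict[str, float]:
    """Add each speech element of every alternative to `counts`,
    scaled by the weight given for that alternative (assumed scheme)."""
    for words, weight in nbest:
        for word in words:
            counts[word] = counts.get(word, 0.0) + weight
    return counts


# Hypothetical N-best list; the speaker is assumed to have said
# "to Hamburg tomorrow", but the best alternative contains an error.
nbest_example = [
    (["to", "Homburg", "tomorrow"], 0.5),
    (["to", "Hamburg", "tomorrow"], 0.3),
    (["to", "Hamburg", "today"], 0.2),
]
example_counts = accumulate_weighted_counts(nbest_example, {})
```

In this example the element "Hamburg" still receives about half a count although it is missing from the best alternative, while the wrongly recognized "Homburg" receives only about half a count instead of a full one.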
The inventive idea is realized in that, in the combination of recognition result alternatives of the list, probability values assigned to these alternatives are weighted with a given numerical value, and an adaptation weight for a recognition result alternative used for the adaptation is formed in that the weighted probability value assigned to this recognition result alternative is related to the sum of the weighted probability values assigned to the other recognition result alternatives of the list. This implementation, which can easily be realized and leads to satisfactory adaptation results, is made more concrete in that the adaptation weights are defined in accordance with the formula
$$\omega_i = \frac{l_i^{\lambda}}{\sum_{j=1}^{N} l_j^{\lambda}}$$
in which $\omega_i$ is the adaptation weight relating to the $i$-th element of the list of N-best recognition result alternatives, and $l_i$ is the probability value of the $i$-th element of the list of N-best recognition result alternatives. The weight $\lambda$ can be determined heuristically for each case. When the probability values $l_i$ are present in logarithmic form, this formulation has the advantage that the exponentiation with the weight $\lambda$ becomes a multiplication by this weight.
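The adaptation weights, i.e. each probability value raised to the power λ and divided by the sum of all N such values, can be computed as in the following minimal sketch. The function names, the use of natural logarithms and the subtraction of the maximum for numerical stability are assumptions added for illustration; the text only states that exponentiation becomes multiplication in the logarithmic domain.

```python
import math
from typing import List, Sequence


def adaptation_weights(probs: Sequence[float], lam: float) -> List[float]:
    """Weights omega_i = l_i**lam / sum_j l_j**lam from linear probabilities."""
    if not probs:
        raise ValueError("the N-best list must not be empty")
    powered = [p ** lam for p in probs]
    total = sum(powered)
    return [x / total for x in powered]


def adaptation_weights_from_logs(
    log_probs: Sequence[float], lam: float
) -> List[float]:
    """Same weights from natural-log probabilities: raising to the power
    lam becomes a multiplication of each log value by lam."""
    if not log_probs:
        raise ValueError("the N-best list must not be empty")
    scaled = [lam * lp for lp in log_probs]
    peak = max(scaled)  # shifting by the maximum leaves the ratios unchanged
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With λ = 1 the weights equal the normalized probability values; with λ = 0 all alternatives receive the same weight 1/N; larger values of λ concentrate the weight on the best alternative.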
The invention also relates to a speech recognition system wherein a linguistic speech model used for speech recognition is adapted in accordance with any one of the methods described above.


REFERENCES:
patent: 5241619 (1993-08-01), Schwartz et al.
patent: 5606644 (1997-02-01), Chou et al.
patent: 5677990 (1997-10-01), Junqua
patent: 5712957 (1998-01-01), Waibel et al.
patent: 5737489 (1998-04-01), Chou et al.
patent: 5835890 (1998-11-01), Matsui et al.
patent: 5983179 (1999-11-01), Gould
patent: 6076057 (2000-06-01), Narayanan et al.
patent: 6185528 (2001-02-01), Fissore et al.
“Improved Estimation of Supervision in Unsupervised Speaker Adaptation”, Shigeru Homma et al, ICASSP, pp. 1023-1026.
“A Spoken Language Inquiry System for Automatic Train Timetable Information”, by Harald Aust et al, Philips J.R. 49, 1995, pp. 399-418.
“Elements of Information Theory”, Thomas M. Cover et al, Entropy, Relative Entropy and Mutual Information, p. 18.
