Reexamination Certificate
1999-10-12
2002-05-28
Dorvil, Richemond (Department: 2748)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Recognition
C704S273000, C379S913000, C379S902000, C379S907000
active
06397182
ABSTRACT:
FIELD OF THE INVENTION
The invention relates to a method and a system for generating a speech recognition dictionary based on greeting recordings in a voice messaging system. The invention finds practical applications in telephone systems, such as Private Branch Exchange (PBX) systems, also called “Key systems”, that have a voice messaging capability and also speech recognition functions, such as the ability to connect a caller to a subscriber of the telephone system (called party) by recognizing the name of the subscriber uttered by the calling party.
BACKGROUND OF THE INVENTION
Modern telephony brings to consumers a broad range of enhanced functions above the basic telephone service, such as the ability to establish a communication link between two remote locations in a network. Specific examples of such enhanced call-related functions include speech recognition and voice messaging, among many others. An example of speech recognition services that are available today is the ability of a telephone system, such as a PBX system, to effect a connection when the caller utters the name of the subscriber he/she wishes to call. The telephone system uses a speech recognition unit which processes the signal derived from the spoken utterance and tries to match this utterance to vocabulary items in a speech recognition dictionary. The vocabulary items in the speech recognition dictionary are representations of the names of the subscribers serviced by the telephone system. When the speech recognition unit finds the best match to the spoken utterance, the connection with the subscriber associated with the chosen vocabulary item is effected either immediately or after completion of a confirmation dialogue with the caller.
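The matching step described above can be illustrated with a minimal sketch. It assumes the utterance has already been decoded into a sequence of sub-word units, and it swaps in a plain edit distance between unit sequences for the acoustic scoring a real speech recognition unit would perform. The function names, the phoneme labels, and the `confirm_threshold` parameter are hypothetical and are not taken from the patent.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sub-word units."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def best_match(decoded, dictionary):
    """Return (name, cost) of the vocabulary item closest to the utterance."""
    best_name, best_cost = None, None
    for name, transcriptions in dictionary.items():
        for transcription in transcriptions:
            cost = edit_distance(decoded, transcription)
            if best_cost is None or cost < best_cost:
                best_name, best_cost = name, cost
    return best_name, best_cost


def route_call(decoded, dictionary, confirm_threshold=1):
    """Connect immediately on a close match, otherwise ask for confirmation.

    confirm_threshold is an assumed tuning parameter, not part of the source.
    """
    name, cost = best_match(decoded, dictionary)
    if name is None:
        return None, "reject"
    action = "connect" if cost < confirm_threshold else "confirm"
    return name, action


# Toy dictionary with illustrative transcriptions.
dictionary = {
    "Smith": [["S", "M", "IH", "TH"]],
    "Jones": [["JH", "OW", "N", "Z"]],
}
print(route_call(["S", "M", "IH", "TH"], dictionary))
print(route_call(["S", "M", "IY", "TH"], dictionary))
```

An exact match is routed directly, while a near match falls back to a confirmation dialogue with the caller.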
During the commissioning phase of the telephone system, the speech recognition dictionary is built. Typically, a text-to-transcription unit processes orthographic representations of vocabulary items associated to respective subscriber names. For each vocabulary item, the text-to-transcription unit outputs at least one transcription indicative of the pronunciation of the vocabulary item. Each transcription is comprised of a plurality of sub-word units, each sub-word unit being associated to a respective speech model. Typically, a speaker independent model set trained on the basis of a plurality of speakers is used.
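As a rough illustration of this commissioning step, the sketch below maps an orthographic name to a sequence of sub-word units using a tiny, made-up letter-to-sound table. The rule tables, phoneme labels, and function names are assumptions for illustration only; an actual text-to-transcription unit would use a much richer rule set or a pronunciation lexicon.

```python
# Hypothetical letter-to-sound rules (illustrative only).
DIGRAPHS = {"ch": "CH", "sh": "SH", "th": "TH", "ph": "F"}
LETTERS = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F",
    "g": "G", "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L",
    "m": "M", "n": "N", "o": "AA", "p": "P", "q": "K", "r": "R",
    "s": "S", "t": "T", "u": "AH", "v": "V", "w": "W", "x": "K S",
    "y": "Y", "z": "Z",
}


def text_to_transcription(name):
    """Convert an orthographic name into a list of sub-word units."""
    text = name.lower()
    units = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            units.append(DIGRAPHS[pair])
            i += 2
        elif text[i] in LETTERS:
            units.extend(LETTERS[text[i]].split())
            i += 1
        else:
            i += 1  # skip spaces, hyphens and other unmapped characters
    return units


def build_dictionary(names):
    """Associate each subscriber name with at least one transcription."""
    return {name: [text_to_transcription(name)] for name in names}


print(build_dictionary(["Smith", "Nguyen"]))
```

With rules of this kind, a name such as "Nguyen" receives a letter-by-letter transcription that may differ considerably from how the subscriber actually pronounces it.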
A deficiency of the above-described method is that variations in pronunciations of the subscriber names are not usually provided by the text-to-transcription unit. This problem is particularly noticeable when a subscriber's name is in a language of origin different than that supported by the text-to-transcription unit. In such situations, the pronunciation derived by the text-to-transcription unit may not properly describe the actual pronunciation of the subscriber name. Consequently, the recognition performance for such names is poor.
Against this background it is clearly apparent that there exists a need in the industry to provide an improved method and a system to generate a speech recognition dictionary, particularly for use in the context of telephone systems that offer speech recognition services to users.
SUMMARY OF THE INVENTION
The invention provides a system and a method for generating a speech recognition dictionary by making use of the audio greetings recorded by telephone system subscribers. The audio greetings are played before allowing callers to leave messages in a voice mailbox of subscribers. An individual greeting is audio information that contains the name of the subscriber. This audio information can be processed to generate a transcription indicative of a pronunciation of a vocabulary item in a speech recognition dictionary representative of the subscriber name.
In a specific example of implementation, the individual greeting is an identification message consisting essentially of a signal representative of the name of the subscriber.
Advantageously, using an individual greeting to generate a transcription associated to a vocabulary item allows the speech recognition dictionary to capture the pronunciation of the subscriber name as the subscriber would pronounce it.
In a specific example of implementation, the telephone system is a PBX system including a speech recognition unit capable of effecting a connection when a caller utters the name of the called party (subscriber). The speech recognition process is effected based on the speech recognition dictionary containing the vocabulary items representative of the subscriber names, which have been generated from the individual greetings. As a variant, the vocabulary items are further associated to alternative pronunciations of the vocabulary items derived on a basis of the orthographic representation of the subscriber name as well as text to phoneme rules.
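The variant described above can be sketched as a dictionary whose vocabulary items hold a greeting-derived transcription first, followed by rule-derived alternatives. This is a minimal sketch under stated assumptions: the transcriptions are supplied as ready-made inputs (deriving one from greeting audio is not shown), and the class name, function name, and the choice to attach alternatives only to subscribers who have a greeting are illustrative, not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyItem:
    """A subscriber name and its candidate transcriptions."""
    name: str
    transcriptions: list = field(default_factory=list)

    def add(self, transcription):
        candidate = tuple(transcription)
        if candidate not in self.transcriptions:  # avoid duplicates
            self.transcriptions.append(candidate)


def build_dictionary(greeting_transcriptions, rule_transcriptions):
    """Merge greeting-derived and rule-derived pronunciations.

    greeting_transcriptions: name -> one transcription from the greeting
    rule_transcriptions:     name -> list of text-to-phoneme alternatives
    """
    dictionary = {}
    for name, transcription in greeting_transcriptions.items():
        dictionary.setdefault(name, VocabularyItem(name)).add(transcription)
    for name, alternatives in rule_transcriptions.items():
        if name in dictionary:  # assumption: greeting must be available
            for alternative in alternatives:
                dictionary[name].add(alternative)
    return dictionary


greetings = {"Nguyen": ["W", "IH", "N"]}
rules = {
    "Nguyen": [["N", "G", "AH", "Y", "EH", "N"], ["W", "IH", "N"]],
    "Smith": [["S", "M", "IH", "TH"]],
}
result = build_dictionary(greetings, rules)
print(result["Nguyen"].transcriptions)
```

Keeping the greeting-derived transcription first preserves the subscriber's own pronunciation as the primary entry, while the rule-based alternatives cover callers who pronounce the name from its spelling.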
The present invention allows the generation of a speech recognition dictionary when the individual greetings are available.
The invention also extends to a telephone system with voice messaging capability that can generate a speech recognition dictionary from the audio greetings of the subscribers.
REFERENCES:
patent: 5822405 (1998-10-01), Astarabadi
patent: 5892814 (1999-04-01), Brisebois
patent: 5894504 (1999-04-01), Alfred et al.
patent: 5991723 (1999-11-01), Duffin
Elvira J M Et Al: “Name dialing using final user defined vocabularies in mobile (GSM and TACS) and fixed telephone networks” Proceedings of the 1998 IEEE International Conference On Acoustics, Speech and Signal Processing, ICASSP '98 (CAT. No. 98CH36181), Seattle, WA, USA, 12-1, pp. 849-852 vol. 2, XP002164537 1998, New York, NY, USA, IEEE, USA ISBN: 0-7803-4428-6.
Ramabhadran B Et Al: “Acoustics-Only Based Automatic Phonetic Baseform Generation” Seattle, WA, May 12-15, 1998, New York, NY: IEEE, US, vol. Conf. 23, May 12, 1998, pp. 309-312, XP000854577 ISBN: 0-7803-4429-4.
Deshmukh N Et Al: “Automated generation of N-best pronunciations of proper nouns” 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings (CAT. NO.96CH35903), Atlanta, GA,USA,7-10M, pp. 283-286, vol. 1, XP002164538 1996, New York, NY, USA, IEEE, USA ISBN: 0-7803-3192-3.
Cucchiarelli A Et Al: “A statistical technique for bootstrapping available resources for proper nouns classification” Proceedings 1999 International Conference On Information Intelligence and Systems (CAT. No.PR00446), Bethesda, MD, USA, Oct. 31-Nov. 3, 1999, pp. 429-435, XP002164539 1999, Los Alamitos, CA, USA, IEEE Comput. Soc, USA ISBN: 0-7695-0446-9.
Search report for European application No. 00650132.4-2218-.
Cruickshank Brian
Forgues Pierre M.
Lin Lin
Dorvil Richemond
Nolan Daniel A.
Nortel Networks Limited