Reexamination Certificate
1999-09-24
2001-11-27
Korzuch, William (Department: 2741)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S270000, C704S256000
active
06324513
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to a spoken dialog system capable of performing an automated answering operation by voice by recognizing the speech of a speaker. More specifically, the present invention is directed to a voice automated answering apparatus capable of providing voice (speech) services, such as information provision and reservation handling, to users over the telephone.
2. Description of the Related Art
As man-to-machine interface techniques are introduced into information systems, the need for voice (speech) interactive techniques has steadily increased. Voice interactive techniques make it possible to realize automated answering systems in which the user and the system interact by voice. As an application of this man-to-machine interface technique, for instance, telephone voice automated answering apparatuses are known which can provide various information required by users and carry out various sorts of services on behalf of operators. As these telephone voice automated answering apparatuses become popular, 24-hour services become available, business efficiency is increased, and manpower may be reduced.
Generally speaking, such a spoken dialog system is typically composed of a voice recognizing unit for recognizing the speech of a user, a dialog managing unit for managing the interactive operation executed between the user and the spoken dialog system, and a voice synthesizing unit for delivering the answer made by the spoken dialog system in the form of voice. The vocabulary to be recognized by the voice recognizing unit is set as a recognized dictionary.
In this system, the speech (voice) recognition precision of the voice recognizing unit is closely related to the scale of the vocabulary to be recognized. The larger the vocabulary becomes, the more difficult recognition becomes. As a consequence, when all of the words which the user may be predicted to utter are set as the recognized dictionary, a large number of word recognition errors will occur, and the number of operations needed to confirm recognized results with the user increases, so that the interactive operation is carried out with very low efficiency. Furthermore, the interactive operation between the user and the spoken dialog system may break down and cannot be continued, so that the user's goal for the dialog cannot be achieved.
As a consequence, in the conventional spoken dialog system, in order to maintain high recognition precision while executing an interactive operation, the recognized vocabulary is changed based on the context of the interactive operation, and the recognized dictionaries are replaced accordingly. In this way, the system is prepared for the recognition of the next speech made by the user.
The methods for changing the recognized vocabulary used under a certain interactive condition into another vocabulary which may be uttered in the next speech by the user may be mainly classified into the following two methods, in accordance with how the interactive operation proceeds between the user and the spoken dialog system.
As one interactive method, there is a system initiative type interactive operation, in which the interactive operation proceeds in such a manner that the system mainly inquires of the user and the user answers the inquiry. In this case, the system may determine the flow of the interactive operation, and the recognized vocabulary for the next speech by the user is basically set for each interactive condition when the interactive procedure is designed.
As another interactive method, there is a user initiative type interactive operation, in which the interactive operation proceeds in such a manner that the user mainly inquires of the system and the system answers the inquiry. In this case, since free inquiries are made by the user, it is practically difficult to determine the flow of the interactive operation at the system designing stage. The recognized vocabulary for the next speech by the user may basically be predicted dynamically from the context of the interactive operation, which corresponds to histories such as the content of the user's inquiries and the system's answers.
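The design-time assignment of a vocabulary to each interactive condition can be illustrated with a minimal sketch. The state names, the word lists, and the functions `active_vocabulary` and `run_dialog` are hypothetical and do not come from the cited prior art; the sketch only assumes that each dialog state carries a fixed vocabulary and a fixed successor state.

```python
# Hypothetical dialog states and vocabularies for a meeting-room reservation
# dialog. None of these names or words are taken from the patent text.
DIALOG_FLOW = {
    "ask_name": {"vocabulary": {"sato", "suzuki", "tanaka"}, "next": "ask_room"},
    "ask_room": {"vocabulary": {"corner a", "corner b"}, "next": "confirm"},
    "confirm": {"vocabulary": {"yes", "no"}, "next": None},
}


def active_vocabulary(state):
    """Return the recognition vocabulary fixed at design time for a state."""
    return DIALOG_FLOW[state]["vocabulary"]


def run_dialog(utterances, start="ask_name"):
    """Walk the fixed flow, accepting only utterances in the active vocabulary."""
    state, accepted = start, []
    for utterance in utterances:
        if state is None:
            break  # the designed flow has ended
        if utterance in active_vocabulary(state):
            accepted.append((state, utterance))
            state = DIALOG_FLOW[state]["next"]
        # An out-of-vocabulary utterance leaves the state unchanged,
        # which stands in for the system repeating its inquiry.
    return accepted
```

Because only a small vocabulary is active in each state, an utterance such as "yes" is simply not a recognition candidate while the system is asking for a room name.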
The prior art (hereinafter referred to as the “first prior art”) related to the method for changing the recognized vocabulary in the above-explained system initiative type interactive operation is described in, for example, Japanese Laid-open Patent Application No. Hei-9-114493, entitled “Interactive Control Apparatus,” as shown in FIG. 21.
In FIG. 21, reference numeral 11 indicates a topic determining unit for determining a topic, and reference numeral 12 shows a recognized word predicting unit. This recognized word predicting unit 12 is equipped with a recognized word dictionary 121, a resemble word table 122, a word focusing table 123, a resemble word retrieving unit 124, and a focused word retrieving unit 125. The resemble word retrieving unit 124 retrieves resemble words contained in the recognized word dictionary 121 by referring to the resemble word table 122. The focused word retrieving unit 125 checks, with reference to the word focusing table 123, whether or not a recognized word contained in the recognized word dictionary 121 has been erroneously recognized, so as to retrieve a recognized word which has no history of erroneous recognition. Also, reference numeral 13 indicates a voice output sentence producing unit, reference numeral 14 shows a recognition control unit, and reference numeral 15 denotes a correct/incorrect judging unit.
In accordance with the first prior art shown in FIG. 21, a technique for smoothing the interactive operation is disclosed, using a meeting room reservation service as an interactive example. That is, the total number of times the user must repeat a speech when an erroneous recognition occurs is reduced, and the interactive operation can therefore be carried out smoothly. As a concrete interactive operation, the system makes an inquiry to the user such as “KAIGI SHITSU MEI O DOZO (please let me know the name of the meeting room).” Then, the user speaks “KONA “A” DESU (It is corner “A”).” When, as a result of speech recognition, the system erroneously recognizes “KONA “B” DESU (it is corner “B”),” the system confirms with the user, asking “KONA “B” DESU KA? (corner “B” is correct?),” and the user answers “IIE (No).”
In such a context of the interactive operation, the system does not urge the user to reenter the voice with “MO ICHIDO OSSHATTE KUDASAI (please say it again).” Instead, the system stores in advance, in the resemble word table 122, the resemble words which may be mistakenly recognized as the word “KONA “B” (corner “B”).”
Then, for example, in such a case that the word “KONA “A” (corner “A”)” is present, as a resemble word of “KONA “B” (corner “B”),” among the lower-ranked candidates of the recognized result, the system confirms, asking “KONA “A” DESU KA? (corner “A” is correct?).” As a result, the total number of times the user must repeat a speech when an erroneous recognition occurs can be reduced, and the next recognized word candidate can be quickly specified.
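The confirmation behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the table contents, the name `RESEMBLE_WORDS`, and the function `next_confirmation` are hypothetical, and the recognizer is assumed to return its candidates as a list ordered from best to worst.

```python
# Hypothetical resemble word table: each word maps to the words that are
# easily confused with it. The entries are illustrative only.
RESEMBLE_WORDS = {
    "corner b": {"corner a", "corner d"},
    "corner a": {"corner b"},
}


def next_confirmation(n_best, rejected, resemble_words=RESEMBLE_WORDS):
    """Pick the next word to confirm after the user rejects earlier candidates.

    n_best: recognition candidates ordered best-first (assumed input format).
    rejected: words the user has already answered "no" to.
    Returns None when no resemble word remains among the candidates, in which
    case a system would have to ask the user to speak again.
    """
    if not rejected:
        return n_best[0] if n_best else None
    # Collect every word that resembles a word the user has rejected.
    confusable = set()
    for word in rejected:
        confusable |= resemble_words.get(word, set())
    # Take the highest-ranked remaining candidate that is a resemble word.
    for candidate in n_best:
        if candidate not in rejected and candidate in confusable:
            return candidate
    return None
```

With the candidates `["corner b", "room 3", "corner a"]`, the first confirmation is "corner b"; once the user rejects it, the sketch skips "room 3" and proposes "corner a", because only "corner a" is listed as a resemble word of "corner b".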
The recognized vocabulary is changed by the following two methods.
First, when the system is designed, the topics which are the items required for reserving a meeting room, such as the name of the reserving person, the date, the time at which use of the meeting room starts, the time at which use of the meeting room ends, and the name of the meeting room, are determined as a preset vocabulary. The system is also equipped with the recognized word dictionary 121, which stores a plurality of recognized words for each topic, so as to select the recognized words corresponding to the topic determined by the topic determining unit 11.
Furthermore, in the case that the system makes an erroneous recognition, a history
Ishikawa Yasushi
Nagai Akito
Watanabe Keisuke
Abebe Daniel
Korzuch William
Mitsubishi Denki Kabushiki Kaisha