System and method for developing interactive speech...

Data processing: speech signal processing – linguistics – language – Application

Reexamination Certificate


Details

C704S272000


active

06173266

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to a system and method for developing computer-executed interactive speech applications.
BACKGROUND
Computer-based interactive speech applications are designed to provide automated interactive communication, typically for use in telephone systems to answer incoming calls. Such applications can be designed to perform various tasks of ranging complexity including, for example, gathering information from callers, providing information to callers, and connecting callers with appropriate people within the telephone system. However, using past approaches, developing these applications has been difficult.
FIG. 1 shows a call flow of an illustrative interactive speech application 100 for use by a Company A to direct an incoming call. Application 100 is executed by a voice processing unit or PBX in a telephone system. The call flow is activated when the system receives an incoming call, and begins by outputting a greeting, “Welcome to Company A” (110).
The application then lists available options to the caller (120). In this example, the application outputs an audible speech signal to the caller by, for example, playing a prerecorded prompt or using a speech generator such as a text-to-speech converter: “If you know the name of the person you wish to speak to, please say the first name followed by the last name now. If you would like to speak to an operator, please say ‘Operator’ now.”
The application then waits for a response from the caller (130) and processes the response when received (140). If the caller says, for example, “Mike Smith,” the application must be able to recognize what the caller said and determine whether there is a Mike Smith to whom it can transfer the call. Robust systems should recognize common variations and permutations of names. For example, the application of FIG. 1 may identify members of a list of employees of Company A by their full names, for example, “Michael Smith.” However, the application should also recognize that a caller asking for “Mike Smith” (assuming there is only one listed employee that could match that name) should also be connected to the employee listed as “Michael Smith.”
Assuming the application finds such a person, the application outputs a confirming prompt: “Do you mean ‘Michael Smith’?” (150). The application once again waits to receive a response from the caller (160) and, when one is received (170), takes appropriate action (180). In this example, if the caller responded “Yes,” the application might say “Thank you. Please hold while I transfer your call to Michael Smith,” before taking the appropriate steps to transfer the call.
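The call flow described above can be sketched as a short routine. This is a hypothetical illustration, not the patent's implementation: `play_prompt`, `listen`, and `transfer_to` stand in for platform-specific telephony calls, and the one-entry `DIRECTORY` is an assumed employee list.

```python
# Assumed directory mapping recognized names to extensions (illustrative only).
DIRECTORY = {"michael smith": "x1234"}

def handle_call(play_prompt, listen, transfer_to):
    """Run the FIG. 1 call flow once; step numbers refer to FIG. 1."""
    play_prompt("Welcome to Company A")                          # step 110
    play_prompt("If you know the name of the person you wish to "
                "speak to, please say the first name followed by "
                "the last name now. If you would like to speak to "
                "an operator, please say 'Operator' now.")        # step 120
    response = listen()                                           # steps 130, 140
    name = response.strip().lower()
    if name == "operator":
        transfer_to("operator")
        return "operator"
    if name in DIRECTORY:                                         # recognition succeeded
        play_prompt(f"Do you mean '{name.title()}'?")             # step 150
        if listen().strip().lower() == "yes":                     # steps 160, 170
            play_prompt("Thank you. Please hold while I transfer "
                        f"your call to {name.title()}.")          # step 180
            transfer_to(DIRECTORY[name])
            return name
    return None  # unrecognized or rejected; a real flow would reprompt
```

A test harness can drive the flow by passing list-backed callbacks for the three hooks.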
FIG. 2 shows some of the steps that are performed for each interactive step of the interactive application of FIG. 1. Specifically, applying the process of FIG. 2 to the first interaction of the application described in FIG. 1, the interactive speech application outputs the prompt of step 120 of FIG. 1 (210). The application then waits for the caller's response (220, 130). This step should be implemented not only to process a received response, as shown in the example of FIG. 1 (140), but also to handle a lack of response. For example, if no response is received within a predetermined time, the application can be implemented to “time out” (230) and reprompt the caller (step 215) with an appropriate prompt such as “I'm sorry, I didn't hear your response. Please repeat your answer now,” and return to waiting for the caller's response (220, 130).
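The time-out-and-reprompt loop of FIG. 2 can be sketched as follows. This is an illustrative assumption, not the patent's code: `wait_for_speech` is an assumed platform call that returns the caller's utterance, or `None` when the predetermined wait time elapses.

```python
def listen_with_reprompt(play_prompt, wait_for_speech, max_attempts=3):
    """Wait for a caller response, reprompting on time-out (FIG. 2, steps 215-240)."""
    for _ in range(max_attempts):
        utterance = wait_for_speech()          # steps 220, 130
        if utterance is not None:              # step 240: response detected
            return utterance
        # step 230: timed out; step 215: reprompt, then wait again
        play_prompt("I'm sorry, I didn't hear your response. "
                    "Please repeat your answer now.")
    return None  # give up after repeated time-outs
```

The `max_attempts` cap is an added assumption; a deployed flow might instead fall back to an operator after repeated silence.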
When the application detects a response from the caller (240), step 140 of FIG. 1 attempts to recognize the caller's speech, which typically involves recording the waveform of the caller's speech, determining a phonetic representation for the speech waveform, and matching the phonetic representation with an entry in a database of recognized vocabulary. If the application cannot determine any hypothesis for a possible match (250), it reprompts the caller (215) and returns to waiting for the caller's response (220). Generally, the reprompt is varied at different points in the call flow of the application. For example, in contrast to the reprompt when no response is received during the time-out interval, the reprompt when a caller's response is received but not matched with a recognized response may be “I'm sorry, I didn't understand your response. Please repeat the name of the person to whom you wish to speak, or say ‘Operator.’”
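The name-variant matching described earlier (a caller asking for “Mike Smith” reaching the employee listed as “Michael Smith”) can be sketched with a small normalization table. Both the nickname table and the employee list here are illustrative assumptions; a real system would match at the phonetic level against its recognized-vocabulary database.

```python
# Assumed nickname-to-full-name table (illustrative only).
NICKNAMES = {"mike": "michael", "bob": "robert", "liz": "elizabeth"}
EMPLOYEES = ["Michael Smith", "Robert Jones"]

def match_name(utterance):
    """Return the unique matching employee, or None if unknown or ambiguous."""
    words = [NICKNAMES.get(w, w) for w in utterance.lower().split()]
    canonical = " ".join(words)
    hits = [e for e in EMPLOYEES if e.lower() == canonical]
    return hits[0] if len(hits) == 1 else None
```

Returning `None` on an ambiguous match mirrors the caveat above that the shortcut only applies when exactly one listed employee could match.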
If the application comes up with one or more hypotheses of what the caller said (260, 270), it determines a confidence parameter for each hypothesis, reflecting the likelihood that it is correct. FIG. 2 shows that the interpretation step (280) may be applied to both low-confidence and high-confidence hypotheses. For example, if the confidence level falls within a range determined to be “high” (step 260), an application may be implemented to perform the appropriate action (290, 180) without going through the confirmation process (150, 160, 170). Alternatively, an application can be implemented to use the confirmation process for both low- and high-confidence hypotheses. For example, the application of FIG. 1 identifies the best hypothesis to the caller and asks whether it is correct.
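The confidence-based branching of FIG. 2 (steps 260 through 290) can be sketched as a small dispatcher. The 0.9 threshold is an assumption for illustration; the patent only says the confidence must fall within a range determined to be “high.”

```python
HIGH_CONFIDENCE = 0.9  # assumed threshold for the "high" confidence range

def dispatch(hypotheses, confirm, act):
    """Act on recognition hypotheses: list of (name, confidence), best first.

    confirm(name) asks the caller "Do you mean ...?" and returns True/False;
    act(name) performs the appropriate action (e.g. transfer the call).
    """
    if not hypotheses:
        return None                      # step 250: no match; caller is reprompted
    best, confidence = hypotheses[0]
    if confidence >= HIGH_CONFIDENCE:
        act(best)                        # steps 290, 180: skip confirmation
        return best
    if confirm(best):                    # steps 150-170: confirm with the caller
        act(best)
        return best
    return None                          # hypothesis rejected; reprompt (step 215)
```

Passing a `confirm` that always returns `True` models the alternative design in which every hypothesis, high confidence or not, goes through confirmation.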
If the application interprets the hypothesis to be incorrect (for example, if the caller responds “No” to the confirmation prompt of step 150), the application rejects the hypothesis and reprompts the caller to repeat his or her response (step 215). If the application interprets the hypothesis to be correct (for example, if the caller responds affirmatively to the verification prompt), the application accepts the hypothesis and takes appropriate action (290), which in the example of FIG. 1 would be to output the prompt of step 180 and transfer the caller to Michael Smith.
As exemplified by application 100 of FIGS. 1 and 2, interactive speech applications are complex. Implementing an interactive speech application such as that described with reference to FIGS. 1 and 2 using past application development tools requires a developer to design the entire call flow of the application, including defining vocabularies to be recognized by the application in response to each prompt of the application. In some cases, vocabulary implementation can require the use of an additional application such as a database application. In past approaches, it has been time-consuming and complicated for the developer to ensure compatibility between the interactive speech application and any external applications and data it accesses.
Furthermore, the developer must design the call flow to account for different types of responses for the same prompt in an application. In general, past approaches require that the developer define a language model of the language to be recognized, typically including grammar rules to generally define the language and to more specifically define the intended call flow of the interactive conversation to be carried on with callers. Such definition is tedious.
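The per-prompt vocabulary and grammar definitions described above can be illustrated with a minimal sketch. Real development tools use formal grammar formats (e.g. BNF-style rules covering word order and optional words); this dictionary of accepted phrases per prompt is a deliberate simplification to show the kind of definition the developer must write by hand for every prompt.

```python
# Hypothetical per-prompt vocabularies (illustrative only).
GRAMMARS = {
    "main_menu": {"operator", "michael smith", "mike smith", "robert jones"},
    "confirmation": {"yes", "no"},
}

def in_grammar(prompt_name, utterance):
    """Check whether an utterance is in the vocabulary defined for a prompt."""
    return utterance.lower().strip() in GRAMMARS[prompt_name]
```

Note that even this toy version must separately enumerate “mike smith” and “michael smith”: each variation a caller might use is another entry the developer must anticipate, which is the tedium the passage describes.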
Because of the inevitable ambiguities and errors in understanding speech, an application developer also needs to provide error recovery capabilities, including error handling and error prevention, to gracefully handle speech ambiguities and errors without frustrating callers. This requires the application developer not only to provide as reliable a speech recognition system as possible, but also to design alternative methods for successfully eliciting and processing the desired information from callers. Such alternative methods may include designing helpful prompts to address specific situations and implementing different methods for a caller to respond, such as allowing callers to spell their responses or input their responses using the keypad of a touch-tone phone. In past approaches, an application developer is required to manually prepare error handling, error prevention, and any alternative methods used.
