Signalling and controlling the status of an automatic speech...

Filed: 1999-05-17
Issued: 2002-08-13
Examiner: Banks-Harold, Marsha D. (Department: 2654)
Classification: Data processing: speech signal processing, linguistics, language; Speech signal processing; Application
Classification code: C704S251000
Type: Reexamination Certificate
Status: active
Patent number: 06434527

ABSTRACT:

FIELD OF THE INVENTION
This invention relates generally to conversational dialog between a computer or other processor-based device and a user, and more particularly to such dialog without requiring push-to-talk functionality.
BACKGROUND OF THE INVENTION
Speech recognition applications have become increasingly popular with computer users. Speech recognition allows a user to talk into a microphone connected to the computer, with the computer translating the speech into text or commands that it can understand. There are several different types of uses for such speech recognition. In one type, speech recognition is used as an input mechanism for the user to input text into a program, such as a word processing program, in lieu of or in conjunction with a keyboard. In another type, speech recognition is used as a mechanism to convey commands to a program, for example to save a file, instead of selecting a save command from a menu using a mouse.
In yet another type of use for speech recognition, speech recognition is used in conjunction with an on-screen agent or automated assistant. For example, the agent may ask the user whether he or she wishes to schedule an appointment in a calendar based on an electronic mail the user is reading—e.g., using a text-to-speech application to render audible the question through a speaker, or by displaying text near the agent such that it appears that the agent is talking to the user. Speech recognition can then be used to indicate the user's acceptance or declination of the agent's offer.
In these and other types of uses for speech recognition, an issue arises as to when to turn on the speech recognition engine, that is, when the computer should listen to the microphone for user speech. This is in part because speech recognition is a processor-intensive application; keeping speech recognition turned on all the time may slow down other applications being run on the computer. In addition, keeping speech recognition turned on all the time may not be desirable, in that the user may accidentally say something into the microphone that was not meant for the computer.
One solution to this problem is generally referred to as “push-to-talk.” In push-to-talk systems, a user presses a button on an input device such as a mouse, or presses a key or a key combination on the keyboard, to indicate to the computer that the user is ready to speak into the microphone and that the computer should listen to the speech. The user may optionally then be required to push another button to stop the computer from listening, or the computer may determine when to stop listening based on no more speech being spoken by the user.
Push-to-talk systems are disadvantageous, however. A goal in speech recognition systems is to provide a more natural manner by which a user communicates with a computer. Requiring a user to push a button prior to speaking to the computer cuts against this goal, since doing so is unnatural. Furthermore, in applications where a dialog is to be maintained with the computer (for example, where an agent asks a question, the user answers, and the agent asks another question, and so on), requiring the user to push a button is inconvenient and unintuitive, in addition to being unnatural.
Other prior art systems include those that give the user an explicit, unnatural message to indicate that the system is listening. For example, in the context of automated phone applications, a user may hear a recorded voice say “Press 1 now for choice A.” While this may improve on push-to-talk systems, it nevertheless is unnatural. That is, in everyday conversation between people, such explicit messages to indicate that one party is ready to listen to the other are rarely heard.
For these and other reasons, there is a need for the present invention.
SUMMARY OF THE INVENTION
The invention relates to conversational dialog with a computer or other processor-based device without requiring push-to-talk functionality. In one embodiment, a computer-implemented method first determines that a user desires to engage in a dialog. Next, based thereon the method turns on a speech recognition functionality for a period of time referred to as a listening horizon. Upon the listening horizon expiring, the method turns off the speech recognition functionality.
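The three steps of this embodiment (decide that a dialog is wanted, turn recognition on for a listening horizon, turn it off on expiry) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the class name `ListeningHorizonGate`, its methods, and the injectable `clock` parameter are all assumptions introduced here, and no real speech recognition engine is involved.

```python
import time
from typing import Callable, Optional


class ListeningHorizonGate:
    """Hypothetical gate that keeps recognition on only during a listening horizon."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the expiry logic can be exercised without waiting.
        self._clock = clock
        self._deadline: Optional[float] = None  # None means recognition is off

    def open(self, horizon_seconds: float) -> None:
        """Call once the system has determined that the user desires a dialog."""
        if horizon_seconds <= 0:
            raise ValueError("horizon_seconds must be positive")
        self._deadline = self._clock() + horizon_seconds

    def is_listening(self) -> bool:
        """Return True while the horizon is running; switch off once it expires."""
        if self._deadline is None:
            return False
        if self._clock() >= self._deadline:
            self._deadline = None  # horizon expired: recognition is turned off
            return False
        return True
```

A caller would check `is_listening()` before passing microphone audio to whatever recognizer it uses, so that speech outside the horizon is ignored without the user ever pressing a button.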
In specific embodiments, determining that a user desires to engage in a dialog includes performing a probabilistic cost-benefit analysis to determine whether engaging in a dialog is the highest expected utility action of the user. This may include, for example, initially inferring a probability that the user desires an automated service with agent assistance. Thus, in one embodiment, the length of the listening horizon can be determined as a function of at least the inferred probability that the user desires automated service, as well as a function of the acute listening history of previous dialogs.
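As a rough illustration of such a cost-benefit analysis, the sketch below picks the action with the highest expected utility given an inferred probability that the user wants the service, and then derives a horizon length from that probability and from recent response delays. Everything specific here is an assumption made for illustration: the action names, the utility values, the linear horizon formula, and the use of past response delays as a stand-in for listening history are not taken from the patent.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical utilities: action -> (utility if user wants service, utility if not).
UTILITIES: Dict[str, Tuple[float, float]] = {
    "do_nothing": (0.0, 1.0),
    "engage_in_dialog": (0.8, 0.4),
    "act_without_asking": (1.0, -1.0),
}


def expected_utility(action: str, p_wants: float) -> float:
    """Probability-weighted utility of one action over the two user states."""
    u_wants, u_not = UTILITIES[action]
    return p_wants * u_wants + (1.0 - p_wants) * u_not


def best_action(p_wants: float) -> str:
    """Return the action with the highest expected utility."""
    if not 0.0 <= p_wants <= 1.0:
        raise ValueError("p_wants must be a probability in [0, 1]")
    return max(UTILITIES, key=lambda action: expected_utility(action, p_wants))


def listening_horizon(
    p_wants: float,
    recent_response_delays: Sequence[float],
    min_seconds: float = 2.0,
    max_seconds: float = 10.0,
) -> float:
    """Assumed rule: longer horizon for higher probability and slower past replies."""
    horizon = min_seconds + (max_seconds - min_seconds) * p_wants
    if recent_response_delays:
        # Do not close sooner than the slowest recent reply took to arrive.
        horizon = max(horizon, max(recent_response_delays))
    return min(horizon, max_seconds)
```

With these illustrative utilities, a low probability favors doing nothing, a middling one favors opening a dialog, and a very high one favors acting directly; the dialog case is the one in which a listening horizon would be opened.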
Embodiments of the invention provide for advantages not found within the prior art. Primarily, the invention does not require push-to-talk functionality for the user to engage in a dialog with the computer, including a natural dialog about a failure to understand. This means that the dialog is more natural to the user, and also more convenient and intuitive. Thus, in one embodiment, an agent may be displayed on the screen, ask the user a question using a text-to-speech mechanism, and then wait for the duration of the listening horizon for an appropriate response from the user. The user only has to talk after the agent asks the question, and does not have to undertake an unnatural action such as pushing a button on an input device or a key on the keyboard prior to answering the query.
The invention includes computer-implemented methods, machine-readable media, computerized systems, and computers of varying scopes. Other aspects, embodiments and advantages of the invention, beyond those described here, will become apparent by reading the detailed description and with reference to the drawings.


Affiliated with

Inventor: Horvitz, Eric
Law Firm: Amin & Turocy LLP
Examiner: Banks-Harold, Marsha D.
Corporate Assignee: Microsoft Corporation
Examiner: Storm, Donald L.
