Reexamination Certificate
2000-06-19
2003-11-25
Knepper, David D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
active
06654722
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to the field of voice recognition and more particularly to a speech application for use in a Voice over IP protocol network.
2. Description of the Related Art
LAN telephony, which means “the integration of telephony and data services provided by packet-switched data networks,” is the technology that takes person-to-person communication to a new high level and associated costs to a lower level. LAN telephony enables a more flexible and cost-efficient use of many applications, for example, automated call distribution, interactive voice response and voice logging. This is in contrast to the relatively limited integration offered by the current voice/data integration paradigm, computer-telephony integration, in which voice traffic is kept separate from data traffic and carried over circuit-switched links. The old paradigm for integrating data and voice has been to use the circuit-switched telephony fabric for data communications. The obvious drawbacks of that approach, namely the relatively low bandwidth available to data traffic, the inefficiency of circuit-switched data communications due to the “bursty” nature of data traffic, and the limited voice/data integration possibilities, have led to present topologies in which IP data servers are bundled with proprietary PBXs or voice circuit switches in order to provide a loose integration between circuit-switched and packet-switched networks, with voice still carried by the circuit-switched network.
One of the most common uses of LAN telephony is in the enterprise Internet/Intranet environment, referred to as IP telephony. The Voice over IP (“VoIP”) protocol is the protocol by which voice traffic can be transmitted across IP networks. In a VoIP network, analog speech signals received from an analog speech audio source, for example a PSTN or a microphone, are digitized, compressed and translated into IP packets for transmission over an IP network. Several well-known protocols implement the VoIP protocol specification, including H.323, Session Initiation Protocol (“SIP”) and Media Gateway Control Protocol (“MGCP”).
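The digitize, compress and packetize steps described above can be illustrated with a minimal sketch. It is not taken from the invention: the continuous mu-law companding formula, the 160-sample packet size, the 2-byte sequence-number header and the names `mu_law_encode` and `packetize` are all assumptions chosen for illustration, and the encoding is a simplified stand-in rather than a bit-exact G.711 codec.

```python
import math
import struct

MU = 255  # mu-law companding parameter (assumed; simplified, not bit-exact G.711)


def mu_law_encode(sample: float) -> int:
    """Compress one sample in [-1.0, 1.0] to an 8-bit code in 0..255."""
    sample = max(-1.0, min(1.0, sample))  # clamp out-of-range input
    magnitude = math.log1p(MU * abs(sample)) / math.log1p(MU)
    signed = math.copysign(magnitude, sample)
    return int(round((signed + 1.0) / 2.0 * 255))


def packetize(samples, samples_per_packet=160):
    """Split samples into packets: 2-byte sequence number + compressed payload."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), samples_per_packet)):
        chunk = samples[start:start + samples_per_packet]
        payload = bytes(mu_law_encode(s) for s in chunk)
        packets.append(struct.pack("!H", seq) + payload)
    return packets


# 50 ms of a 440 Hz tone sampled at 8 kHz stands in for the analog source.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(400)]
packets = packetize(tone)
```

In a real deployment the payload would be carried by a transport chosen by the VoIP protocol in use; the hypothetical header here only shows where ordering information would live.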
A common application for IP telephony is the integration of voice mail (“v-mail”) and electronic mail (“e-mail”). Another application can include voice logging by financial or emergency-response organizations. Additionally, automated call distribution (“ACD”) can be facilitated whereby an ACD server performs value-based queuing of incoming telephone calls. Finally, interactive voice response systems can incorporate IP telephony in which responses are preprogrammed in a server as a workflow component. Still, speech recognition and speech synthesis applications (“speech applications”) have lagged in the use of IP telephony.
In particular, speech applications operate on real-time audio signals which cannot tolerate latencies associated with traditional data communications. As such, where speech applications have been incorporated in an IP telephony topology, the speech applications have been closely integrated with the IP telephony server in order to preclude a negative impact from network-based latencies. Accordingly, the design and development of such IP telephony enabled speech applications have been closely linked to the proprietary nature of the IP telephony server.
The tight linkage between the speech application and the IP telephony server substantially limits both the design and the extensibility of the speech application. Specifically, in the present paradigm the speech application design must incorporate functionality directly related to the chosen protocol for transporting packetized voice data to a speech recognition system and from a speech synthesis system. The development of a superior voice transport protocol, by nature of the tight linkage between the IP telephony server and the speech application, can compel the redesign of the speech application. Accordingly, there exists a need for a VoIP-based speech system in which the design and implementation of the speech application remains separate from the design and implementation of the IP telephony system.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a VoIP-based speech system in which the design and implementation of the speech application remains separate from the design and implementation of the IP telephony system. It is a further object of the present invention to provide a VoIP-enabled speech server which can receive audio input from the IP telephony system over a VoIP network. It is yet another object of the present invention to provide a method for coupling a speech application to a telephony gateway server in a VoIP network. Finally, it is an object of the present invention to provide each of the VoIP-based speech system, the VoIP-enabled speech server and the method for coupling the speech application to the telephony gateway server using standards-based interfaces to the VoIP network, the telephony gateway server and the speech application.
These and other objects of the present invention are accomplished in a VoIP-based speech system including: a VoIP telephony gateway server; at least one speech server, each speech server containing a VoIP-enabled speech application; a VoIP-compliant call control interface between the VoIP telephony gateway server and the speech server; and, a VoIP communications path between the VoIP telephony gateway server and the speech application in the at least one speech server. In the VoIP-based speech system, the VoIP telephony gateway server and the speech application can establish the VoIP communications path through the VoIP-compliant call control interface.
In operation, the VoIP telephony gateway server can receive audio signals from a telephony interface, digitize the audio signals into digitized audio data, compress the digitized audio data into VoIP-compliant packets, and transmit the VoIP-compliant packets to the speech application in the at least one speech server through the VoIP communications path using the VoIP protocol. Correspondingly, the speech application can receive the VoIP-compliant packets, reconstruct the digitized audio data from the VoIP-compliant packets, and speech-to-text convert the digitized audio data. In addition, the speech application can synthesize text into digitized audio data, encapsulate the digitized audio data in VoIP-compliant packets and transmit the VoIP-compliant packets through the VoIP communications path to the VoIP telephony gateway server. Subsequently, the VoIP telephony gateway server can receive the VoIP-compliant packets, reconstruct the digitized audio data from the VoIP-compliant packets, and transmit the digitized audio data through the telephony interface.
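The encapsulate and reconstruct steps performed at each end of the VoIP communications path might be sketched as follows. This is an illustrative assumption rather than the packet format of any particular VoIP protocol: the 4-byte header (sequence number and payload length) and the names `to_packets` and `from_packets` are hypothetical.

```python
import struct

# Hypothetical header: 16-bit sequence number, 16-bit payload length.
HEADER = struct.Struct("!HH")


def to_packets(audio: bytes, payload_size: int = 160) -> list:
    """Encapsulate digitized audio data in numbered packets."""
    packets = []
    for seq, start in enumerate(range(0, len(audio), payload_size)):
        payload = audio[start:start + payload_size]
        packets.append(HEADER.pack(seq, len(payload)) + payload)
    return packets


def from_packets(packets) -> bytes:
    """Reconstruct the audio data, tolerating out-of-order arrival."""
    parsed = []
    for packet in packets:
        seq, length = HEADER.unpack_from(packet)
        parsed.append((seq, packet[HEADER.size:HEADER.size + length]))
    parsed.sort(key=lambda item: item[0])  # restore the sender's order
    return b"".join(payload for _, payload in parsed)


audio = bytes(range(256)) * 2                # 512 bytes of stand-in audio data
packets = to_packets(audio)                  # sender side (gateway or speech application)
restored = from_packets(reversed(packets))   # receiver side, packets arriving reversed
```

The same pair of functions serves both directions described above, since the gateway server and the speech application each act as sender and receiver.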
In one aspect of the present invention, the VoIP telephony server can include a telephony interface and a VoIP Gatekeeper. The VoIP Gatekeeper can receive a voice call through the telephony interface, and responsively, the VoIP Gatekeeper can choose a speech server from among the speech servers. Once a speech server has been chosen, the VoIP Gatekeeper can alert the VoIP-enabled speech application in the chosen speech server that the voice call has been received.
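The gatekeeper behavior described above can be sketched as below. The text does not state how a speech server is chosen, so the least-loaded selection policy here is purely an assumption, as are the class and method names (`SpeechServer`, `Gatekeeper`, `on_incoming_call`, `alert`).

```python
from dataclasses import dataclass, field


@dataclass
class SpeechServer:
    """Stand-in for a speech server hosting a VoIP-enabled speech application."""
    name: str
    active_calls: int = 0
    alerts: list = field(default_factory=list)

    def alert(self, call_id: str) -> None:
        """Record that the gatekeeper announced an incoming voice call."""
        self.alerts.append(call_id)
        self.active_calls += 1


class Gatekeeper:
    """Chooses a speech server for each call (assumed policy: fewest active calls)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def on_incoming_call(self, call_id: str) -> SpeechServer:
        chosen = min(self.servers, key=lambda s: s.active_calls)
        chosen.alert(call_id)
        return chosen


servers = [SpeechServer("speech-a", active_calls=2), SpeechServer("speech-b")]
gatekeeper = Gatekeeper(servers)
first = gatekeeper.on_incoming_call("call-1")
second = gatekeeper.on_incoming_call("call-2")
```

Any other policy (round-robin, capability matching) could replace the `min` call without changing the alerting step.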
In another aspect of the present invention, the speech server can include a speech recognition engine; a text-to-speech engine; a call control interface for establishing a voice call connection through the VoIP telephony gateway server; and, an audio data path. Notably, the audio data path can stream audio data through the established voice call connection to the speech recognition engine. Similarly, the audio data path can stream audio data through the established voice call connection from the text-to-speech engine.
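One way to picture the four speech server components named above is the following sketch. The engines are passed in as plain callables and replaced by trivial stubs, and the call control step is reduced to a flag; none of these names or simplifications come from the invention itself.

```python
from typing import Callable, Iterable, Iterator


class SpeechServer:
    """Sketch: recognition engine, text-to-speech engine, call control, audio path."""

    def __init__(self, recognize: Callable[[bytes], str],
                 synthesize: Callable[[str], bytes]):
        self.recognize = recognize    # speech recognition engine (stubbed below)
        self.synthesize = synthesize  # text-to-speech engine (stubbed below)
        self.connected = False

    def establish_call(self) -> None:
        """Call control interface: mark the voice call connection as established."""
        self.connected = True

    def stream_in(self, frames: Iterable[bytes]) -> str:
        """Audio data path, inbound: stream audio to the recognition engine."""
        if not self.connected:
            raise RuntimeError("no established voice call connection")
        return self.recognize(b"".join(frames))

    def stream_out(self, text: str, frame_size: int = 4) -> Iterator[bytes]:
        """Audio data path, outbound: stream audio from the text-to-speech engine."""
        if not self.connected:
            raise RuntimeError("no established voice call connection")
        audio = self.synthesize(text)
        for start in range(0, len(audio), frame_size):
            yield audio[start:start + frame_size]


# Stub engines: ASCII bytes stand in for audio data.
server = SpeechServer(recognize=lambda audio: audio.decode("ascii"),
                      synthesize=lambda text: text.encode("ascii"))
server.establish_call()
text = server.stream_in([b"hel", b"lo"])
frames = list(server.stream_out("goodbye"))
```

Keeping the engines behind callables mirrors the stated goal of separating the speech application from the telephony side: either engine can be swapped without touching the audio data path.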
In yet another aspect of the present invention, the speech application can be a speech browser. The speech browser can retrieve Web content responsive to voice commands received through the VoIP communications path. Also, the speech browser can speech synthesize the retrieved Web content into audio data. Finally, the speech browser can transmit the audio data through the
Aldous Anne M.
Celi, Jr. Joseph
Gavagni Brett
Leontiades Kyriakos
Lucas Bruce D.
Akerman & Senterfitt
Knepper David D.