Real time audio transmission system supporting asynchronous...

Filed: 2000-08-28
Issued: 2003-09-02
Examiner: To, Doris H. (Department: 2655)
Classification: Data processing: speech signal processing, linguistics, language; Speech signal processing; Synthesis
U.S. class: C704S201000, C704S503000
Type: Reexamination Certificate
Status: active
Patent number: 06615173
ABSTRACT:

CROSS REFERENCE TO RELATED APPLICATIONS
(Not Applicable)
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
(Not Applicable)
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates to the field of speech enabled computing and more particularly to a system and method for real time transmission of speech audio asynchronously received from a text-to-speech engine in a computer communications network.
2. Description of the Related Art
Text-to-speech (TTS) engines are well-known in the art. Typically, a TTS engine can be used to convert computer recognizable text to audio which can be transmitted to an external audio device for ultimate audible presentation to a listener. Specifically, TTS technology permits users to audibly play back documents and provides applications with the ability to read information to the user. Whether running on a desktop computer, a telephony network, over the Internet, or in an automobile, the increased functionality of TTS-enabled applications can provide users with information access anytime, anywhere with almost any device.
In the telephony environment, TTS technology can convert text to speech, reducing the need for prerecorded interactive voice response (IVR) messages and providing users with the ability to access textual information over a telephone. The advent of Voice over IP (VoIP) technology has facilitated the development of TTS-enabled applications over networks. This network convergence has opened the door to novel TTS applications, for example voice browsing of Web sites over the Internet.
In order to transmit audio data over a computer communications network, a media transport protocol typically is employed. Presently, the Real Time Transport Protocol (RTP) is a preferred protocol for transporting real time media over a computer communications network. RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP is described in detail in Schulzrinne, Casner, Frederick and Jacobson, RFC 1889, RTP: A Transport Protocol for Real-Time Applications, published by the Internet Engineering Task Force (IETF) in January 1996 and incorporated herein by reference.
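As an illustration of the transport format referenced above, the following Python sketch packs the 12-byte fixed RTP header laid out in RFC 1889. The function name pack_rtp_header and the sample field values are hypothetical and are not taken from the patent; payload type 0 (PCMU) is assumed from the companion RTP audio/video profile.

```python
import struct

RTP_VERSION = 2  # version field defined by RFC 1889

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no padding, extension, or CSRCs)."""
    byte0 = RTP_VERSION << 6                      # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",                                 # network byte order
        byte0,
        byte1,
        seq & 0xFFFF,                             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,                   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,                        # 32-bit source identifier
    )

# Hypothetical values: second 20 ms packet of an 8 kHz PCMU stream.
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234ABCD)
```

A sender would increment the sequence number by one per packet and advance the timestamp by the number of samples carried in each packet.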
Notwithstanding, the output of a TTS engine is not ideal for real time transmission using RTP. For example, while a VoIP telephony gateway can require speech audio to arrive in the telephony gateway in a synchronized fashion in a specific format according to an underlying media protocol, the output of a TTS engine can take the form of chunks of speech audio that asynchronously can be provided at random time intervals by the TTS engine. Moreover, the chunks of speech audio can have a varying size. Finally, the format of data received from a TTS engine can vary from application to application. Accordingly, what is needed is a system and method for real time transmission of speech audio asynchronously received from a TTS engine in a computer communications network.
SUMMARY OF THE INVENTION
The present invention is a system and method for real time transmission of speech audio asynchronously received from a text-to-speech (TTS) engine in a computer communications network. A system for real time transmission of speech audio asynchronously received from a TTS engine in a computer communications network can include a TTS engine for producing speech audio for transmission in the computer communications network; and, a real time speech audio producer for receiving the speech audio and for producing formatted audio packets for transmission over the network according to a transmission interval.
Notably, the transmission interval can be fixed or variable and can be determined according to a packetization delay parameter. In addition, the real time speech audio producer can implement a thread for execution in a multi-threaded application. Finally, the system can further include a telephony gateway server communicatively linked to the real time speech audio producer. As such, the telephony gateway server can receive the produced formatted audio packets transmitted according to the transmission interval.
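To make the relationship between a packetization delay parameter and the transmission interval concrete, here is a minimal Python sketch. It assumes the packetization delay is expressed in milliseconds and that audio is sampled at 8 kHz, as is common in telephony; neither assumption, nor the function names, comes from the patent.

```python
def samples_per_packet(packetization_delay_ms, sample_rate_hz=8000):
    """Number of audio samples carried by one packet (assumed 8 kHz default)."""
    return sample_rate_hz * packetization_delay_ms // 1000

def transmission_interval_seconds(packetization_delay_ms):
    """Time between successive packet transmissions, in seconds."""
    return packetization_delay_ms / 1000.0

# Hypothetical setting: a 20 ms packetization delay.
frame_samples = samples_per_packet(20)          # 160 samples per packet
interval_s = transmission_interval_seconds(20)  # one packet every 0.02 s
```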
In a representative embodiment of the present invention, the real time speech audio producer can include a TTS audio receiver for receiving the produced speech audio from the TTS engine; an audio data compressor for compressing the received speech audio into an audio buffer; a speech audio packet formatter for formatting speech audio in the audio buffer into formatted audio packets suitable for transmission over the network; and, a transmission queue for queuing the formatted audio packets for transmission over the network. The real time speech audio producer can also include a silence detector for detecting transmission intervals in which no speech audio data from the TTS engine is available for transmission across the network; and, a silence packet generator for producing formatted silence packets in lieu of the uniformly formatted audio packets responsive to detecting the intervals in which no speech audio data from the TTS engine is available for transmission across the network.
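The components described above could be arranged roughly as in the following Python sketch. It is only one possible reading: the class name SpeechAudioProducer, the 160-byte frame size, and the 0xFF silence byte (the mu-law encoding of a zero sample) are assumptions, and the compressor and network packet formatter are omitted so the buffering and silence logic stay visible. A partial frame simply waits in the buffer for more audio; flushing at end of stream is not shown.

```python
from collections import deque

FRAME_BYTES = 160        # assumed: 20 ms of 8 kHz, 8-bit audio
SILENCE_BYTE = b"\xff"   # assumed: mu-law encoding of a zero sample

class SpeechAudioProducer:
    """Hypothetical sketch of a real time speech audio producer."""

    def __init__(self, frame_bytes=FRAME_BYTES):
        self.frame_bytes = frame_bytes
        self.audio_buffer = bytearray()     # audio awaiting packetization
        self.transmission_queue = deque()   # packets awaiting transmission

    def receive(self, chunk):
        """TTS audio receiver: accept a chunk of any size at any time."""
        self.audio_buffer.extend(chunk)

    def tick(self):
        """Run once per transmission interval; queue exactly one packet."""
        if len(self.audio_buffer) >= self.frame_bytes:
            payload = bytes(self.audio_buffer[:self.frame_bytes])
            del self.audio_buffer[:self.frame_bytes]
            kind = "audio"
        else:
            # Silence detector: no full frame is available this interval,
            # so the silence packet generator supplies the payload instead.
            payload = SILENCE_BYTE * self.frame_bytes
            kind = "silence"
        self.transmission_queue.append((kind, payload))

producer = SpeechAudioProducer()
producer.receive(b"\x01" * 250)   # one irregularly sized TTS chunk
for _ in range(3):
    producer.tick()
kinds = [kind for kind, _ in producer.transmission_queue]
```

In this run the first interval carries audio and the next two carry silence, because only 90 bytes remain in the buffer after the first frame is taken.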
A method for real time transmission of speech audio received from a TTS engine in a computer communications network can include receiving speech audio from the TTS engine; formatting the received speech audio into formatted audio packets suitable for transmission to an audio output device over the computer communications network; and, transmitting the formatted audio packets to the audio output device over the computer communications network according to a transmission interval. The method can further include detecting transmission intervals in which no speech audio data from the TTS engine is available for transmission across the network; and, formatting silence packets and transmitting the silence packets in lieu of the audio packets responsive to detecting the transmission intervals in which no speech audio data from the TTS engine is available for transmission across the network.
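The transmitting step of the method can be sketched as a paced loop that sends one packet per transmission interval and falls back to silence when no audio is queued. The names make_packet_source and transmit_paced are hypothetical, the send callable stands in for whatever network transport is used, and the deadline-based pacing is an illustrative choice rather than something the patent specifies.

```python
import time
from collections import deque

def make_packet_source(frames, frame_bytes, silence_byte=b"\xff"):
    """Return a callable giving the next queued frame, or silence if none."""
    def next_packet():
        if frames:
            return frames.popleft()
        return silence_byte * frame_bytes   # silence in lieu of audio
    return next_packet

def transmit_paced(next_packet, send, interval_s, count,
                   clock=time.monotonic, sleep=time.sleep):
    """Send `count` packets, one per interval, against absolute deadlines."""
    deadline = clock()
    for _ in range(count):
        send(next_packet())
        deadline += interval_s
        remaining = deadline - clock()
        if remaining > 0:                   # skip sleeping if running late
            sleep(remaining)

sent = []
source = make_packet_source(deque([b"\x01" * 160]), frame_bytes=160)
transmit_paced(source, sent.append, interval_s=0.02, count=3)
```

Tracking an absolute deadline, instead of sleeping a fixed amount after each send, keeps small scheduling delays from accumulating over a long stream.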
In a representative embodiment of the method of the invention, the method can also include compressing the speech audio into an audio buffer from which the audio packets can be formatted in the formatting step. In another representative embodiment, the method can further include queuing the formatted audio packets for transmission to the audio output device over the computer communications network according to the fixed transmission interval. In yet another representative embodiment, the method can further include queuing the formatted audio packets and the formatted silence packets for transmission to the audio output device over the computer communications network according to the transmission interval.
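The patent does not name a compression scheme. Purely as an illustration, the following Python sketch compresses 16-bit linear PCM into 8-bit G.711 mu-law, a codec widely used in telephony, and appends the result to an audio buffer. The choice of mu-law, the little-endian input format, and the function names are all assumptions.

```python
import struct

BIAS = 0x84    # standard mu-law bias
CLIP = 32635   # largest magnitude representable after adding the bias

def linear_to_ulaw(sample):
    """Encode one signed 16-bit PCM sample as an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent = 7
    mask = 0x4000
    # Find the position of the highest set bit to select the segment.
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

def compress_pcm16(pcm_bytes):
    """Compress little-endian 16-bit PCM bytes to mu-law bytes."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    return bytes(linear_to_ulaw(s) for s in samples)

# Compress a received chunk into the audio buffer used by the formatter.
audio_buffer = bytearray()
audio_buffer.extend(compress_pcm16(struct.pack("<3h", 0, 32767, -32768)))
```

Each 16-bit sample becomes one byte, so a 20 ms frame of 8 kHz audio shrinks from 320 bytes to 160 bytes before packet formatting.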
Notably, the step of transmitting the formatted audio packets to the audio output device over the computer communications network according to a transmission interval can include transmitting the formatted audio packets to a telephony gateway server over the computer communications network according to a transmission interval. Moreover, the method can also include determining the transmission interval according to a packetization delay parameter.
Advantageously, the method can be implemented in a multi-threaded application as a producer in a producer/consumer model for providing digitized speech audio over the network. In that instance, the method can include implementing the formatting and transmitting steps in a thread for execution in the multi-threaded application. Additionally, the method can include implementing the formatting audio packets step, the transmitting the audio packets step, the detecting step, and the formatting and transmitting the silence packets step in a thread for execution in a multi-threaded application. Finally, the method can include implementing the compressing step in the thread and the queuing step in the thread.
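A minimal Python sketch of the producer/consumer arrangement follows, assuming the standard threading and queue modules. The producer thread performs the buffering, silence substitution, and queuing steps once per interval, while a consumer thread stands in for the network sender. The function names, the 160-byte frame size, the None end-of-stream sentinel, and the omission of real-time pacing are all simplifications made for illustration.

```python
import queue
import threading

FRAME_BYTES = 160                  # assumed frame size
SILENCE = b"\xff" * FRAME_BYTES    # assumed mu-law silence payload
END = None                         # hypothetical end-of-stream sentinel

def producer(tts_chunks, out_queue, intervals):
    """Producer thread: one packet queued per interval (pacing omitted)."""
    buffer = bytearray()
    for _ in range(intervals):
        try:
            buffer.extend(tts_chunks.get_nowait())   # asynchronous TTS output
        except queue.Empty:
            pass                                     # nothing arrived this interval
        if len(buffer) >= FRAME_BYTES:
            out_queue.put(bytes(buffer[:FRAME_BYTES]))
            del buffer[:FRAME_BYTES]
        else:
            out_queue.put(SILENCE)
    out_queue.put(END)

def consumer(out_queue, received):
    """Consumer thread: stands in for the component that sends packets."""
    while True:
        packet = out_queue.get()
        if packet is END:
            break
        received.append(packet)

def run_demo(chunks, intervals):
    tts_chunks = queue.Queue()
    for chunk in chunks:
        tts_chunks.put(chunk)
    out_queue = queue.Queue()
    received = []
    threads = [
        threading.Thread(target=producer, args=(tts_chunks, out_queue, intervals)),
        threading.Thread(target=consumer, args=(out_queue, received)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received

packets = run_demo([b"\x01" * 200, b"\x02" * 120], intervals=4)
```

Here two irregular chunks totalling 320 bytes yield two audio packets followed by two silence packets, since no further audio arrives in the last two intervals.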

REFERENCES:
patent: 4782485 (1988-11-01), Gollub
patent: 5018136 (1991-05-01), Gollub
patent: 5404522 (1995-04-01), Carmon et al.
patent: 5526353 (1996-06-01), Henley

Affiliated with

Celi, Jr. Joseph

Inventor

Also associated with

Nolan Daniel

Examiner

Senterfitt Akerman

Attorney

To Doris H.

Examiner
