Multiplex communications – Pathfinding or routing – Combined circuit switching and packet switching
Reexamination Certificate
1998-10-26
2001-10-02
Chin, Wellington (Department: 2664)
C370S384000
active
06298055
ABSTRACT:
FIELD OF THE INVENTION
This invention pertains generally to packet telephony methods and systems, and more particularly to packet telephony methods and systems that receive in-band signaling and employ low bit-rate encoders.
BACKGROUND OF THE INVENTION
Packet telephony involves the transmission of audio signals in discrete blocks, or packets, of digital data.
FIG. 1 depicts a typical prior art packet telephony communication path 18. Packet telephony transmitter 14 converts a digitized audio stream 20, e.g., audio sampled at 8 kHz and quantized at 8 bits/sample, into packets. Transmitter 14 places these packets onto packet network 28, which routes the packets to packet telephony receiver 16. Receiver 16 converts packet data back into a continuous digital audio stream 36 which resembles input audio stream 20. Transmitter 14 and receiver 16 typically employ a codec (a compression/decompression algorithm) to reduce the communication bandwidth required for path 18 on packet network 28.
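As a rough illustration of the figures quoted above, the following Python sketch computes the uncompressed bit rate of an 8 kHz, 8 bits/sample stream and splits such a stream into fixed-duration blocks. The 20 ms block duration and the names `raw_bit_rate` and `packetize` are assumptions made for this example; the text does not specify a packet size.

```python
SAMPLE_RATE_HZ = 8000   # from the text: audio sampled at 8 kHz
BITS_PER_SAMPLE = 8     # from the text: quantized at 8 bits/sample
FRAME_MS = 20           # assumption: the text does not give a packet duration


def raw_bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits_per_sample=BITS_PER_SAMPLE):
    """Bit rate of the uncompressed stream, in bits per second."""
    return sample_rate_hz * bits_per_sample


def packetize(samples, frame_ms=FRAME_MS, sample_rate_hz=SAMPLE_RATE_HZ):
    """Split a bytes object (one byte per sample) into fixed-duration blocks."""
    frame_len = sample_rate_hz * frame_ms // 1000
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


one_second = bytes(SAMPLE_RATE_HZ)   # one second of silence, one byte per sample
packets = packetize(one_second)
```

Under these assumptions the raw stream runs at 64,000 bits per second, and one second of audio yields 50 blocks of 160 bytes each. A codec would reduce the payload carried per block.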
A basic packet voice transmitter 14 includes a voice encoder 22, a packetizer 24, and a transmitter 26. Voice encoder 22 implements the compression half of a codec, compressing audio stream 20 to a lower bit-rate. Packetizer 24 accepts compressed voice data from encoder 22 and formats the data into packets for transmission. Transmitter 26 places voice packets from packetizer 24 onto network 28.
Receiver 16 reverses the process utilized by transmitter 14. Depacketizer 30 accepts packets from network 28. Jitter buffer 32 buffers received data frames and outputs them to voice decoder 34 in an orderly manner. Voice decoder 34 implements the decompression half of the codec employed by voice encoder 22.
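One way to picture the role of a jitter buffer is the minimal Python sketch below. It is not the design of jitter buffer 32: the sequence numbers, the `depth` parameter, and the class and method names are all assumptions introduced for illustration. The buffer holds a few frames so that frames arriving out of order can still be handed to the decoder in sequence.

```python
import heapq


class JitterBuffer:
    """Hypothetical reordering buffer keyed by an assumed sequence number."""

    def __init__(self, depth=2):
        self.depth = depth   # frames kept back to absorb late arrivals
        self._heap = []      # min-heap of (sequence_number, frame)

    def push(self, seq, frame):
        """Store a frame as it arrives from the network."""
        heapq.heappush(self._heap, (seq, frame))

    def pop_ready(self):
        """Release the oldest frames, in order, once more than `depth` are held."""
        out = []
        while len(self._heap) > self.depth:
            out.append(heapq.heappop(self._heap)[1])
        return out

    def flush(self):
        """Release everything that remains, in sequence order."""
        out = []
        while self._heap:
            out.append(heapq.heappop(self._heap)[1])
        return out


buf = JitterBuffer(depth=2)
for seq, frame in [(2, b"b"), (1, b"a"), (4, b"d"), (3, b"c")]:
    buf.push(seq, frame)
in_order = buf.pop_ready() + buf.flush()
```

A larger `depth` tolerates more network jitter at the cost of added playout delay, which is the trade-off a receiver of this kind has to balance.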
Low bit-rate voice codecs used in a packet voice encoder/decoder pair 22, 34 exploit physiological limitations on human hearing ability in order to reduce bit rate. One such human limitation is termed the spectral masking effect, i.e., high energy sound at one frequency masks lower-energy sound at nearby frequencies in the human auditory system. A codec may choose to ignore potentially masked sounds when coding, since a human would be unable to hear them even if they were faithfully reproduced. Low bit-rate codecs typically also model the bandpass filter arrangement of the human auditory system, including the frequency dependence of our auditory perception, in allocating bits to different portions of a signal. In essence, low bit-rate encoding involves many decisions to throw away actual audio information that is undetectable or only marginally detectable by a human.
Because it is optimized for humans, voice encoding can produce undesirable effects if the audio signal being encoded is not meant for human hearing. Computer modem and facsimile audio signals are examples of such signals; both can be badly distorted by voice encoding. Modems and facsimile machines employ in-band signaling, i.e., they utilize the audio channel of a telephony connection to convey data to a non-human receiver. However, modem and facsimile traffic does not “share” a voice line with a human speaker. Packet telephony systems can therefore detect such in-band traffic during call connection and switch it to a higher bandwidth, non-voice encoding channel.
Other types of in-band signals share a voice channel with a human speaker. Most common among these are the DTMF (dual-tone multi-frequency) in-band signals generated by a common 12-button telephone keypad. Voice mail, paging, automated information retrieval, and remote control systems are among the wide variety of automated telephony receivers that rely on DTMF in-band control signals keyed in by a human speaker.
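For readers unfamiliar with DTMF, the Python sketch below shows how a 12-button keypad maps each key to a pair of tones, using the standard DTMF row and column frequencies. The function names, the 0.5 amplitude, and the 40 ms default duration are choices made for this illustration and do not come from the text.

```python
import math

ROW_HZ = (697, 770, 852, 941)     # standard DTMF low-group (row) frequencies
COL_HZ = (1209, 1336, 1477)       # standard high-group (column) frequencies, 12-button keypad
KEYS = ("123", "456", "789", "*0#")


def dtmf_frequencies(key):
    """Return the (row_hz, col_hz) pair for a single keypad character."""
    if len(key) == 1:
        for r, row in enumerate(KEYS):
            c = row.find(key)
            if c >= 0:
                return ROW_HZ[r], COL_HZ[c]
    raise ValueError(f"not a keypad key: {key!r}")


def dtmf_tone(key, duration_ms=40, sample_rate_hz=8000):
    """Synthesize the two-tone signal for a key as a list of floats in [-1, 1]."""
    row_hz, col_hz = dtmf_frequencies(key)
    n_samples = sample_rate_hz * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * row_hz * n / sample_rate_hz)
        + 0.5 * math.sin(2 * math.pi * col_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]
```

Each key is thus the sum of exactly one row tone and one column tone, which is what an automated DTMF receiver looks for in the audio channel.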
Voice encoding/decoding of DTMF signals can render these signals unrecognizable to an automated DTMF receiver. More sophisticated packet telephony systems are capable of detecting DTMF in an input audio data stream in parallel with voice encoding.
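The text does not say how a DTMF detector works internally. One widely used approach, shown here purely as an assumed illustration, is the Goertzel algorithm, which measures signal power at each of the seven keypad frequencies. The `min_share` threshold and all function names are hypothetical, and a production detector would apply further checks (twist, duration, harmonics) that this sketch omits.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYS = ("123", "456", "789", "*0#")


def goertzel_power(samples, freq_hz, sample_rate_hz=8000):
    """Squared magnitude of the signal component near freq_hz (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(samples, sample_rate_hz=8000, min_share=0.25):
    """Return the keypad character present in `samples`, or None.

    Each tone power is normalized so that a clean two-tone signal scores
    roughly 0.5 per tone; min_share is an assumed acceptance threshold.
    """
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return None
    scale = len(samples) * energy / 2.0
    rows = [goertzel_power(samples, f, sample_rate_hz) / scale for f in ROW_HZ]
    cols = [goertzel_power(samples, f, sample_rate_hz) / scale for f in COL_HZ]
    r = max(range(len(rows)), key=rows.__getitem__)
    c = max(range(len(cols)), key=cols.__getitem__)
    if rows[r] < min_share or cols[c] < min_share:
        return None   # one or both tones too weak: not a DTMF signal
    return KEYS[r][c]
```

Because the detector runs on the uncompressed samples, it sees the tones before any voice codec has had a chance to distort them.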
FIG. 2 depicts a parallel voice-encoding/DTMF detector packet telephony transmitter 38. Transmitter 38 operates a DTMF in-band signal detector 40 on uncompressed audio data stream 20, in parallel with voice encoder 22. If speech is present in data stream 20, packetizer 24 will be supplied with a voice-encoded signal from encoder 22. If a DTMF signal appears in data stream 20, the DTMF signal, rather than the voice-encoded signal, is supplied separately to packetizer 24. This system allows DTMF signals to effectively bypass the voice codec, thereby avoiding DTMF signal distortion.
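The selection performed ahead of the packetizer can be sketched in a few lines of Python. This is a simplified illustration, not the design of transmitter 38: the `encode` and `detect_dtmf` callables are hypothetical stand-ins for the voice encoder and DTMF detector, and the tagged tuples stand in for whatever packet format a real system would use.

```python
def parallel_transmit(frames, encode, detect_dtmf):
    """For each frame, hand the packetizer either a DTMF digit or encoded voice.

    encode:      hypothetical stand-in for the voice encoder
    detect_dtmf: hypothetical stand-in for the detector; returns a digit or None
    """
    packets = []
    for frame in frames:
        digit = detect_dtmf(frame)      # runs on the uncompressed frame
        if digit is not None:
            packets.append(("dtmf", digit))          # bypasses the voice codec
        else:
            packets.append(("voice", encode(frame)))
    return packets


# Toy stand-ins so the sketch can be run end to end.
demo = parallel_transmit(
    ["hi", "tone", "yo"],
    encode=lambda f: f.upper(),
    detect_dtmf=lambda f: "7" if f == "tone" else None,
)
```

The sketch decides frame by frame and so ignores signal duration, which is exactly the gap that forces a real transmitter of this kind to delay its voice output.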
SUMMARY OF THE INVENTION
Although a parallel voice-encoding/DTMF detector packet telephony transmitter 38 can avoid DTMF fidelity problems, this capability comes at the price of higher latency. The International Telecommunication Union (ITU) recommends that a valid DTMF signal be at least 40 ms in duration. During the 40 ms duration of a DTMF pulse, if a voice encoder is allowed to ship frames containing voice-compressed DTMF, the receiver may garble the DTMF signal, or identify two signals (a first voice-encoded signal and a second DTMF detector-generated signal). To avoid this problem, voice encoder 22 must delay all speech output by a fixed delay of at least 40 ms to allow DTMF detector 40 to detect valid DTMF samples. This delay allows the transmitter to switch smoothly from voice-encoding to DTMF transmission without causing confusion at the receiving end. Unfortunately, this same delay adds to the call latency perceived by voice callers utilizing a packet voice connection so equipped.
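The cost of the fixed delay can be made concrete with a small Python sketch. The 40 ms minimum comes from the text; the 10 ms frame duration and the names `frames_of_delay` and `FixedDelayLine` are assumptions made for illustration.

```python
import math
from collections import deque


def frames_of_delay(min_delay_ms=40, frame_ms=10):
    """Whole frames needed to cover the minimum delay (frame_ms is assumed)."""
    return math.ceil(min_delay_ms / frame_ms)


class FixedDelayLine:
    """Every frame pushed in comes out exactly n_frames pushes later."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self._queue = deque()

    def push(self, frame):
        """Accept a new frame; return the delayed frame, or None while filling."""
        self._queue.append(frame)
        if len(self._queue) > self.n_frames:
            return self._queue.popleft()
        return None


line = FixedDelayLine(frames_of_delay())
outputs = [line.push(i) for i in range(6)]
```

With these assumed numbers every voice frame is held for four frame periods, whether or not a DTMF signal ever appears on the call.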
Voice callers utilizing the present invention can enjoy reliable DTMF capability over a packet network, without suffering a fixed latency penalty due to DTMF recognition. The present invention avoids adding fixed latency by performing a preliminary DTMF detection, essentially an early detection of potential DTMF signals based on a leading portion of a DTMF pulse. If recent audio data samples are consistent with a leading portion of a DTMF signal, the present invention delays encoded speech transmission while validating the presence or absence of a complete DTMF signal of the appropriate duration. If no potential in-band signal has been detected, voice-encoded frames are not held up. As most DTMF false alarms can be rejected within one frame of voice data, delay of true voice frames will occur relatively rarely, as opposed to the continuous delay found in prior art systems.
If a potential in-band signal is detected, the present invention delays voice frames in a buffer while it resolves the presence of an in-band signal. If an in-band signal of the proper duration is present, delayed voice frames are discarded from the buffer and an in-band signal is transmitted instead. If the potential in-band signal turns out to be a false alarm, delayed voice frames are immediately released from the buffer for transmission. The small amount of packet jitter caused by false alarm delays is easily handled by the receiver, which is designed to handle relatively large jitter present on a packet network. No degradation in speech quality should result from false alarms.
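The hold, discard, or release behavior described above can be sketched as a small state machine in Python. This is an illustrative reading of the description, not the patented implementation: the class name, the `signature` and `encode` callables, the 10 ms frame duration, and the tagged-tuple packet format are all assumptions.

```python
class EarlyDetectionManager:
    """Hypothetical sketch: delay voice frames only while a DTMF candidate is open."""

    def __init__(self, encode, signature, frame_ms=10, min_signal_ms=40):
        self.encode = encode            # stand-in for the frame-based voice encoder
        self.signature = signature      # stand-in detector: returns a digit or None
        self.frame_ms = frame_ms        # assumed frame duration
        self.min_signal_ms = min_signal_ms
        self.held = []                  # frame delay buffer (encoded voice frames)
        self.candidate = None           # digit currently being validated
        self.confirmed = None           # digit already reported, tone still sounding
        self.elapsed_ms = 0

    def process(self, frame):
        """Return the packets that may be transmitted after seeing this frame."""
        digit = self.signature(frame)
        if digit is not None and digit == self.confirmed:
            return []                   # tail of a signal that was already sent
        self.confirmed = None
        out = []
        if digit != self.candidate:
            # Candidate ended before validation: false alarm, release held frames.
            out.extend(("voice", f) for f in self.held)
            self.held.clear()
            self.candidate, self.elapsed_ms = digit, 0
        if digit is None:
            out.append(("voice", self.encode(frame)))   # normal path, no added delay
            return out
        self.held.append(self.encode(frame))            # potential signal: hold frame
        self.elapsed_ms += self.frame_ms
        if self.elapsed_ms >= self.min_signal_ms:
            self.held.clear()                           # valid signal: discard voice
            self.candidate, self.elapsed_ms = None, 0
            self.confirmed = digit
            out.append(("dtmf", digit))
        return out
```

Feeding this manager a short burst that resembles a tone followed by speech releases the held frames together, producing the small amount of jitter the text mentions, while a burst that lasts the full 40 ms yields a single in-band signal packet and no voice packets for that interval.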
In one aspect of the present invention, a packet voice transmitter comprises a frame delay buffer, a frame-based voice encoder, an in-band signal signature detector, and an in-band signal detection manager. Voice data normally follows a first path through the transmitter, one that bypasses the frame delay buffer. The in-band signal detection manager can select a second voice data path that includes the frame delay buffer. The in-band signal detection manager relies on the in-band signal signature detector to notify it of potential in-band signals. The detection manager responds appropriately by controlling the data path and frame delay buffer.
In a further aspect of the invention, a method of transmitting digital audio signals is disclosed. Generally, this method comprises scanning an audio stream for consistency with a leading portion of an in-band signal, and upon detecting such a consistency, digitally delaying transmission of
Chin Wellington
Cisco Technology Inc.
Marger & Johnson & McCollom, P.C.
Pham Brenda