Multiplex communications – Pathfinding or routing – Combined circuit switching and packet switching
Reexamination Certificate
1998-11-30
2004-08-10
Pham, Chi (Department: 2663)
C370S410000
active
06775265
ABSTRACT:
BACKGROUND OF THE INVENTION
This invention relates generally to methods and systems for communication of real-time audio, video, and data signals over a packet-switched data network, and more particularly to a method and system for minimizing delay induced by DTMF processing.
FIG. 1 is a diagram of the general topology of a packet telephony system 12. The packet telephony system 12 includes multiple telephone handsets 14 connected to a packet network 18 through gateways 16. The gateways 16 each include a codec for converting audio signals into audio packets and converting the audio packets back into audio signals.
The handsets 14 are traditional telephones or any other device capable of transmitting and/or receiving DTMF signals. Gateways 16 and the codecs used by the gateways 16 are any one of a wide variety of currently commercially available devices used for connecting the handsets 14 to the packet network 18. For example, the gateways 16 can be Voice Over Internet Protocol (VoIP) telephones or personal computers that include a digital signal processor (DSP) and software for encoding audio signals into audio packets. The gateways 16 operate as a transmitting gateway when encoding audio signals into audio packets and transmitting the audio packets over the packet network 18 to a receiving endpoint. The gateways 16 operate as a receiving gateway when receiving audio packets over the packet network 18 and decoding the audio packets back into audio signals. Since packet telephony gateways 16 and codecs are well known, they are not described in further detail.
A conventional packet telephony gateway transmit path is shown in the transmitting gateway in FIG. 2. The transmitting packet gateway 20 includes a voice encoder 22, a packetizer 24, and a transmitter 26. Voice encoder 22 implements the compression half of a codec. Packetizer 24 accepts compressed voice data from encoder 22 and formats the data into packets for transmission. Transmitter 26 places the audio packets from packetizer 24 onto packet network 18.
A receiving packet gateway 24 is shown in FIG. 3. The receiving gateway 24 reverses the process utilized by transmitter 14. A depacketizer 30 accepts packets from packet network 18. A jitter buffer 32 buffers data frames and outputs them to voice decoder 34 in an orderly manner. A voice decoder 34 implements the decompression half of the codec employed by voice encoder 22 (FIG. 2).
Low bit-rate codecs 22, 34 typically model the bandpass filter arrangement of the human auditory system, including the frequency dependence of auditory perception, in allocating bits to different portions of a signal. In essence, low bit-rate encoding often involves many decisions to discard or ignore actual information not typically represented in human speech.
Because it is optimized for human speech, voice encoding can produce undesirable effects if the audio signal being encoded is not of this form. Computer modem and facsimile audio signals are examples of such signals; both can be badly distorted by voice encoding. Modems and facsimile machines employ in-band signaling, i.e., they utilize the audio channel of a telephony connection to convey data to a non-human receiver. However, modem and facsimile traffic do not “share” a voice line with a human speaker. Packet telephony systems can therefore detect such in-band traffic during call connection and switch it to a higher bandwidth, non-voice encoding channel.
Other types of in-band signals share a voice channel with a human speaker. Most common among these are the DTMF (dual-tone multi-frequency) in-band signals generated by a common 12-button telephone keypad. Voice mail, paging, automated information retrieval, and remote control systems are among the wide variety of automated telephony receivers that rely on DTMF in-band control signals keyed in by a human speaker.
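As an illustration of what "dual-tone" means for the 12-button keypad, the following minimal sketch maps each key to its standard low-group and high-group frequencies and synthesizes the sum of the two tones. The names `DTMF_HZ` and `dtmf_samples`, the 8000 Hz sample rate, and the 0.5 amplitude are assumptions chosen for the example and are not taken from the patent text.

```python
import math

# Standard DTMF frequencies (Hz): one low-group tone per keypad row,
# one high-group tone per keypad column.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")

# Hypothetical lookup table: key -> (row frequency, column frequency).
DTMF_HZ = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def dtmf_samples(key, duration_ms=40, rate=8000, amplitude=0.5):
    """Return the two sinusoids for `key`, summed, as a list of floats.

    The 8000 Hz rate and the amplitude are illustrative assumptions.
    """
    low, high = DTMF_HZ[key]
    n = rate * duration_ms // 1000
    return [
        amplitude * (math.sin(2 * math.pi * low * i / rate)
                     + math.sin(2 * math.pi * high * i / rate))
        for i in range(n)
    ]
```

For example, `DTMF_HZ["5"]` is the pair 770 Hz and 1336 Hz, and `dtmf_samples("5")` yields 320 samples, i.e. 40 ms of tone at the assumed sample rate.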
Because the signal is carried “in-band” as part of the encoded voice stream, DTMF is poorly encoded by the system shown in FIG. 2 if a low bit-rate coder is used. The reconstructed DTMF signals may be unrecognizable to an automated DTMF receiver. One popular low bit-rate coder, G.723.1, is widely recognized to have very poor DTMF fidelity. Other low bit-rate CODECs also have marginal DTMF fidelity upon decode and are therefore unsuitable without modification for many telephony applications, such as Interactive Voice Response (IVR).
In order to avoid these fidelity problems, more sophisticated packet telephony systems are capable of detecting DTMF in the transmitting gateway in parallel with voice encoding. FIG. 4 depicts a parallel voice-encoding/DTMF detector transmitting packet gateway 38. Transmitting gateway 38 operates a DTMF in-band signal detector 40 on an uncompressed audio data stream 20, in parallel with voice encoder 22. If speech is present in the data stream 20, packetizer 24 will be supplied with a voice-encoded signal from encoder 22. If a DTMF signal appears in the data stream, the DTMF signal, rather than the voice-encoded signal, is supplied separately to packetizer 24. This system allows DTMF signals to effectively bypass the voice codec 22, thereby avoiding DTMF signal distortion.
FIG. 4 depicts one of several different schemes where the suppression of the voice is done before packetization.
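The patent text does not say how the DTMF detector 40 decides that a tone is present. One common approach, shown here purely as an assumption, is the Goertzel algorithm, which measures signal power at each of the seven keypad frequencies in a block of uncompressed samples. The function names, the 8000 Hz sample rate, and the `min_share` threshold are hypothetical, and a production detector would add further checks (tone balance, duration, rejection of speech) that this sketch leaves out.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def goertzel_power(samples, freq_hz, rate):
    """Squared spectral magnitude of `samples` at `freq_hz` (Goertzel)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, rate=8000, min_share=0.25):
    """Return the keypad character found in `samples`, or None.

    `min_share` is an assumed threshold: each of the two tones must hold
    at least this share of the power a single pure tone would show.
    """
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return None
    # A pure tone carrying all of the block's energy scores energy * n / 2.
    reference = energy * n / 2.0
    row_p = [goertzel_power(samples, f, rate) for f in ROW_HZ]
    col_p = [goertzel_power(samples, f, rate) for f in COL_HZ]
    r = row_p.index(max(row_p))
    c = col_p.index(max(col_p))
    if row_p[r] < min_share * reference or col_p[c] < min_share * reference:
        return None  # one or both tones missing: treat the block as voice
    return KEYPAD[r][c]
```

In a gateway like the one in FIG. 4, a non-`None` result would be the cue to supply the DTMF signal to the packetizer in place of the voice-encoded frame.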
Although a parallel voice-encoding/DTMF detector packet telephony transmitter 38 can avoid DTMF fidelity problems, this capability comes at the price of higher latency. International Telecommunication Union (ITU) standards specify that a valid DTMF signal be at least 40 milliseconds (ms.) in duration. During the 40 ms. duration of a DTMF pulse, the voice encoder 22 is not allowed to ship frames containing voice-compressed DTMF. Otherwise, the receiver could garble the DTMF signal or identify two signals, the first voice-encoded signal and the second DTMF detector-generated signal.
To avoid this problem, voice encoder 22 delays all speech output by a fixed delay of at least 40 ms. to allow the DTMF detector 40 to detect valid DTMF samples. This delay allows the transmitter to switch smoothly from voice-encoding to DTMF transmission without causing confusion at the receiving packet gateway 24 (FIG. 3). Unfortunately, this same delay adds to the call latency perceived by voice callers utilizing the packet voice connection.
The consequence for end-to-end delay in packet telephony system 12 (FIG. 1) is that all speech must be delayed by a minimum of 40 ms. in the transmitting gateway 38. If this is not done, the receiving gateway would first receive 40 ms. of speech which is actually DTMF, followed after an unpredictable interval by the true DTMF packets. The receiving gateway then plays out one or the other or both, resulting in either garbled DTMF, or possibly a duplicated input such as two “9's” rather than one.
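The cost of the transmit-side hold-back can be made concrete with a small simulation. This is a sketch under stated assumptions, not the gateway's actual design: frames are assumed to be 10 ms long, the class `HoldbackTransmitter` and its fields are hypothetical, and the detector is modelled as simply confirming a digit once 40 ms of tone has been seen.

```python
from collections import deque

HOLD_MS = 40  # minimum valid DTMF duration cited in the text


class HoldbackTransmitter:
    """Hypothetical transmit path that holds every frame for HOLD_MS."""

    def __init__(self):
        self.held = deque()  # (arrival_ms, frame) awaiting release
        self.sent = []       # (send_ms, kind, payload, delay_ms)

    def push(self, t_ms, frame, confirmed_digit=None):
        # Release every held frame that has waited the full hold-back time.
        while self.held and t_ms - self.held[0][0] >= HOLD_MS:
            t_in, old = self.held.popleft()
            self.sent.append((t_ms, "voice", old, t_ms - t_in))
        if confirmed_digit is not None:
            # Frames still held contain the start of the tone: drop them
            # and send a single DTMF indication instead.
            self.held.clear()
            self.sent.append((t_ms, "dtmf", confirmed_digit, 0))
        else:
            self.held.append((t_ms, frame))


tx = HoldbackTransmitter()
for i in range(6):                  # 60 ms of speech in 10 ms frames
    tx.push(i * 10, f"speech{i}")
for t in (60, 70, 80):              # tone present but not yet confirmed
    tx.push(t, "tone")
tx.push(90, "tone", confirmed_digit="9")  # 40 ms of tone observed
```

After this run, `tx.sent` holds the six speech frames, each sent 40 ms after it arrived, followed by one DTMF indication for "9"; no tone frame is ever sent as voice. The receiver is not confused, but every speech frame on the call pays the 40 ms penalty.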
Accordingly, a need remains for accurately detecting and transmitting DTMF without adding additional end-to-end delay to the packet network.
SUMMARY OF THE INVENTION
The invention solves the problem of DTMF delay by shifting the delay and in-band signal processing to the receiving packet gateway. The invention exploits the fact that in any packet telephony system the receiving gateway already has a built-in playout delay in the form of a jitter buffer. The jitter buffer exists to smooth out the unavoidable delay variations in packet arrival introduced by the packet network.
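A minimal sketch of the receiver-side idea follows, assuming the transmitter forwards voice frames without delay and separately sends a DTMF indication once a digit is confirmed. The class `ReceivingGateway`, the 60 ms playout delay, and the `(start_ms, end_ms, digit)` form of the indication are all hypothetical; the patent text shown here does not define a packet format. The only requirement the sketch relies on is that the jitter buffer's playout delay is at least as long as the worst-case detection time.

```python
PLAYOUT_DELAY_MS = 60  # assumed jitter-buffer depth (>= 40 ms detection time)


class ReceivingGateway:
    """Hypothetical receiver that mutes voice frames overlapping DTMF."""

    def __init__(self):
        self.voice = {}       # capture timestamp (ms) -> voice frame
        self.dtmf_spans = []  # (start_ms, end_ms, digit) indications

    def on_voice_packet(self, ts_ms, frame):
        self.voice[ts_ms] = frame

    def on_dtmf_packet(self, start_ms, end_ms, digit):
        self.dtmf_spans.append((start_ms, end_ms, digit))

    def play_out(self, now_ms):
        """Decide what to play for the frame captured PLAYOUT_DELAY_MS ago."""
        ts = now_ms - PLAYOUT_DELAY_MS
        frame = self.voice.pop(ts, None)
        for start, end, digit in self.dtmf_spans:
            if start <= ts < end:
                return ("dtmf", digit)  # mute the voice frame, report digit
        if frame is None:
            return ("silence", None)
        return ("voice", frame)


rx = ReceivingGateway()
for ts in range(0, 100, 10):
    rx.on_voice_packet(ts, f"frame{ts}")  # includes frames carrying tone
rx.on_dtmf_packet(60, 100, "9")           # indication arrives before playout
```

With this input, `rx.play_out(100)` returns the ordinary voice frame captured at 40 ms, while `rx.play_out(120)` returns `("dtmf", "9")` because the frame captured at 60 ms falls inside the indicated span. The delay that hides the detection time is the one the jitter buffer already imposes, so the transmitter adds none of its own.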
The process of discarding or muting audio packets that contain DTMF signaling is shifted to the receiving gateway. The transmitting gateway can then continue to process and transmit voice packets while also detecting DTMF signals. The receiving gateway's jitter buffer holds voice packets for the worst-case DTMF detection period. As the receiving gateway is about to play out a voice packet it checks to see if a packet has arrived indicating DTMF was present. If not, the voice is played out as usual. If DTMF is present, the voice is muted and a DTMF generato
Cisco Technology Inc.
George Keith M.
Marger Johnson & McCollom PC
Pham Chi