Reexamination Certificate
2002-07-03
2004-12-14
McFadden, Susan (Department: 2655)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S275000, C370S468000, C455S063100, C455S296000
active
06832195
ABSTRACT:
The following acronyms are used throughout this description. They are listed in TABLE 1 below for ease of reference.
TABLE 1
ACRONYM      Definition
ACELP        Algebraic Code Excited Linear Prediction
ACS          Active Codec Set
AFS          AMR Full rate Speech codec
AHS          AMR Half rate Speech codec
AMR          Adaptive Multi Rate speech codec
BER          Bit Error Rate
BSS or BTS   Base Station Subsystem/Base Transceiver Station
CDMA         Code Division Multiple Access
C/I          Carrier-to-Interference ratio (used to measure link quality)
CMI          Codec Mode Indication (speech rate used on attached link)
CMC          Codec Mode Command (speech rate commanded to be used by an MS on its uplink)
CMR          Codec Mode Request (speech rate requested by an MS to be used on its receiving link)
CRC          Cyclic Redundancy Check
DFI          Dangerous Frame Indicator
DTX          Discontinuous Transmission
EDGE         Enhanced Data-rates for GSM (or Global) Evolution
EFR          Enhanced Full Rate speech codec for GSM
EVRC         Enhanced Variable-Rate Codec, used in IS-95 CDMA
FACCH        Fast Associated Control Channel
FR           Full Rate speech codec for GSM
GSM          Global System for Mobile communications, common digital cellular standard
HR           Half Rate speech codec for GSM
MS           Mobile station, e.g. a cellular phone
NO_DATA      Frame classification used to indicate no speech-related data received, e.g. during DTX
ONSET        AMR frame used to demark the end of a DTX period, i.e. start active voice
PDC          Personal Digital Cellular, Japanese digital cellular standard
RATSCCH      Robust AMR Traffic Synchronized Control Channel
SID          Silence Description or Descriptor
SID_FIRST    AMR frame type used to demark the beginning of a DTX period
SID_UPDATE   AMR frame used to convey comfort noise characteristics during a DTX period
TDMA         Time Division Multiple Access, common digital cellular standard
TRAU         Transcoding and Rate Adapting Unit
WCDMA        Wideband CDMA
3GPP         3rd Generation Partnership Project, WCDMA standard
BACKGROUND OF THE INVENTION
Digital communications systems, such as digital cellular telephony systems, are often used to transmit voice. Due to the limited bandwidth of these systems, speech is typically encoded to a low bit rate using a speech encoder. Various methods are in use for such speech coding. Within modern digital cellular telephony, most of these methods are based upon Code Excited Linear Prediction (CELP) or some variant thereof. Such speech codecs are standardized and in use for all of the major digital telephony standards including GSM/EDGE, PDC, TDMA, CDMA, and WCDMA.
The present invention is described within the context of GSM. Within this standard, there are currently four standardized speech codecs, three of which are fielded and in common use. The original speech codec is known as the full-rate (FR) codec. This was followed by the half-rate (HR) speech codec, which required only half of the bandwidth of the FR codec, thereby allowing cellular operators to support twice as many users within the same frequency allocation. This was followed by the Enhanced Full Rate (EFR) speech codec, which required the same net bit rate (after channel coding) as the original FR codec but offered much improved speech quality.
The GSM standard recently introduced the AMR speech codec. This speech codec will also be used in forthcoming EDGE and 3GPP cellular systems. A similar ACELP-based adaptable speech codec known as the EVRC has been standardized for IS-95 (narrowband) CDMA.
The present invention relates to the Adaptive Multi-Rate (AMR) speech codec. In broad terms, the invention improves the audio quality perceived within an AMR enabled receiver. More particularly, the invention serves to prevent two specific problems that can occur when an AMR enabled receiver is entering or exiting DTX mode. The first problem is that the link may enter DTX but the receiver may not recognize this state change. The result is that random data may be processed by the speech decoder during the DTX period leading to audible artifacts such as clicks and pops. The second problem is that a link in the DTX state may return to active voice but the AMR enabled receiver may not recognize this. The result is that the receiver is muted despite the active state of the link.
BRIEF SUMMARY OF THE INVENTION
One embodiment of the present invention comprises a method of determining whether a receiver in active (non-DTX) mode should remain in active (non-DTX) mode or switch to inactive (DTX) mode. A received AMR frame in active (non-DTX) mode is subjected to a RATSCCH marker comparison. If the results of the RATSCCH marker comparison exceed a RATSCCH marker threshold, the received AMR frame is processed as a RATSCCH message. Otherwise, the received AMR frame is subjected to a SID_FIRST marker comparison. If the results of the SID_FIRST marker comparison exceed a SID_FIRST threshold, then the received AMR frame is processed as a SID_FIRST frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a SID_UPDATE threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver is switched to DTX mode. Otherwise, the received AMR frame is processed as a voice frame in active (non-DTX) mode.
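The decision cascade described above can be sketched in Python as follows. This is only an illustrative sketch, not the claimed implementation: the function name, the numeric score and threshold parameters, and the `FrameType` enum are hypothetical, and the sketch assumes that the marker comparisons have already been reduced to numeric scores and that a RATSCCH message leaves the receiver mode unchanged.

```python
from enum import Enum


class FrameType(Enum):
    RATSCCH = "RATSCCH"
    SID_FIRST = "SID_FIRST"
    SID_UPDATE = "SID_UPDATE"
    VOICE = "VOICE"


def classify_active_frame(ratscch_score, sid_first_score, sid_update_score,
                          ratscch_thr, sid_first_thr, sid_update_thr):
    """Classify a frame received in active (non-DTX) mode.

    Returns (frame_type, receiver_in_dtx). The comparisons are tried in the
    order given in the text: RATSCCH, then SID_FIRST, then SID_UPDATE.
    """
    if ratscch_score > ratscch_thr:
        # Assumption: a RATSCCH message does not change the receiver mode.
        return FrameType.RATSCCH, False
    if sid_first_score > sid_first_thr:
        return FrameType.SID_FIRST, True    # switch to DTX mode
    if sid_update_score > sid_update_thr:
        return FrameType.SID_UPDATE, True   # switch to DTX mode
    return FrameType.VOICE, False           # remain in active mode
```

Because each comparison is reached only when the earlier ones fail, a frame whose SID_FIRST and SID_UPDATE scores both exceed their thresholds is treated as SID_FIRST.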
The SID_UPDATE threshold is determined by channel decoding the received AMR frame as a voice frame and performing a CRC test on the channel decoded AMR frame. If the CRC test passes, then a badFrameCounter variable is set to zero, otherwise the badFrameCounter is incremented by one; and the SID_UPDATE threshold is set according to the badFrameCounter.
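A minimal Python sketch of this bookkeeping is given below. The counter update follows the text directly. The mapping from badFrameCounter to a threshold is not specified in this summary, so the two-level rule, the numeric values, and the parameter names (`strict`, `relaxed`, `bad_frame_limit`) are purely illustrative assumptions.

```python
def update_bad_frame_counter(bad_frame_counter, crc_passed):
    """Reset the counter on a CRC pass; otherwise increment it by one."""
    return 0 if crc_passed else bad_frame_counter + 1


def sid_update_threshold(bad_frame_counter, strict=0.9, relaxed=0.7,
                         bad_frame_limit=3):
    """Pick a SID_UPDATE threshold from the bad-frame history.

    Assumption (not from the source): after several consecutive CRC
    failures the link is more likely to have entered DTX, so a lower
    threshold is used. All numeric values here are hypothetical.
    """
    if bad_frame_counter >= bad_frame_limit:
        return relaxed
    return strict
```

For example, three consecutive CRC failures raise the counter to 3 and select the relaxed threshold, while a single CRC pass resets the counter and restores the strict one.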
Another embodiment of the present invention comprises a method of determining whether an AMR enabled receiver in inactive (DTX) mode should remain in inactive (DTX) mode or switch to active (non-DTX) mode. A received AMR frame in inactive (DTX) mode is subjected to an ONSET frame comparison. If the results of the ONSET frame comparison exceed a threshold, then the received AMR frame is processed as an ONSET frame and the receiver is switched to active (non-DTX) mode. Otherwise, the received AMR frame is subjected to a SID_UPDATE marker comparison. If the results of the SID_UPDATE marker comparison exceed a threshold, then the received AMR frame is processed as a SID_UPDATE frame and the receiver remains in inactive (DTX) mode. Otherwise, it is determined whether the received AMR frame is a voice frame, and if so, the receiver is switched to active (non-DTX) mode, and if not, the received AMR frame is classified as a NO_DATA frame and the receiver remains in inactive (DTX) mode.
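The corresponding cascade for a receiver in DTX mode might look like the Python sketch below. The function and parameter names are hypothetical, the comparisons are again assumed to yield numeric scores, and the voice test is passed in as a callable (`voice_check`) so that it runs only when the ONSET and SID_UPDATE comparisons have both failed.

```python
def classify_dtx_frame(onset_score, sid_update_score, voice_check,
                       onset_thr, sid_update_thr):
    """Classify a frame received in inactive (DTX) mode.

    voice_check is a zero-argument callable returning True when the frame
    is judged to be a voice frame (a hypothetical hook for the voice test).
    Returns (frame_type, receiver_in_dtx).
    """
    if onset_score > onset_thr:
        return "ONSET", False        # switch to active mode
    if sid_update_score > sid_update_thr:
        return "SID_UPDATE", True    # remain in DTX mode
    if voice_check():
        return "VOICE", False        # switch to active mode
    return "NO_DATA", True           # remain in DTX mode
```

Passing the voice test as a callable keeps the cascade independent of which voice-detection process is used.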
There are several alternative processes for determining whether the received AMR frame is a voice frame. One method comprises channel decoding the received AMR frame as a voice frame and performing a CRC test on the channel decoded AMR frame. If the CRC test fails, then the received AMR frame is classified as a NO_DATA frame. If the CRC test passes, then a goodFrameCount variable is incremented by one. The goodFrameCount variable is compared against a threshold value and if the goodFrameCount variable exceeds the threshold value, then the received AMR frame is classified as a voice frame. Otherwise the received AMR frame is classified as a NO_DATA frame.
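The goodFrameCount method can be sketched in Python as follows. The function name and signature are hypothetical. The text does not state whether goodFrameCount is reset when the CRC test fails, so this sketch leaves the counter unchanged in that case, which is an assumption.

```python
def detect_voice(crc_passed, good_frame_count, threshold):
    """Apply the goodFrameCount test to one channel-decoded frame.

    Returns (classification, updated_good_frame_count).
    """
    if not crc_passed:
        # Assumption: the counter is left unchanged on a CRC failure.
        return "NO_DATA", good_frame_count
    good_frame_count += 1
    if good_frame_count > threshold:
        return "VOICE", good_frame_count
    return "NO_DATA", good_frame_count
```

With a threshold of 1, for instance, the first frame that passes the CRC test is still classified as NO_DATA, and only the second passing frame is classified as voice. This guards against a single random frame passing the CRC by chance.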
Another method comprises determining if the received AMR frame is a SID_FIRST frame, and if so, setting a framesSinceSID variable to zero and considering the received AMR frame as NO_DATA for purposes of speech decoding. Otherwise, it is determined if the received AMR frame is a SID_UPDATE frame, and if so, setting the framesSinceSID variable to zero and considering the received AMR frame as NO_DATA for purposes of speech decoding. If the received AMR frame is neither a SID_FIRST nor a SID_UPDATE frame, then the framesSinceSID variable is incremented by one. Next, it is determined whether the framesSinceSID variable exceeds a threshold, and if not, the received AMR frame is classified as NO_DATA. Otherwise, the received AMR frame is channel decoded as a voice frame and a CRC test is performed on the channel decoded AMR frame. If it passes, the received AMR frame is classified as a voi
McFadden Susan
Moore & Van Allen PLLC
Sony Ericsson Mobile Communications AB
Stephens Gregory A.
System and method for robustly detecting voice and DTX modes