Techniques for quantization of spectral data in transcoding

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S230000

active

06757648

ABSTRACT:

TECHNICAL FIELD
The present invention relates to quantization of spectral data in transcoding. In one embodiment, an audio transcoder phase shifts decompressed PCM audio data before transform coding and requantizing the data. The phase shifting reduces excess requantization error in the requantized data.
BACKGROUND
A computer processes audio or video information as a series of numbers representing samples of the audio or video information. For high quality audio or video, the computer represents a sample of information using a number with many possible values. The more values possible for the sample, the higher the quality because the number can capture more variations in sound or color. Table 1 shows ranges of possible values for several types of audio or video information of different quality levels, along with corresponding bitrate costs.
TABLE 1
Ranges of values and cost per value for different quality audio and video information

Information type and quality    | Number of possible values | Cost
audio sequence, voice quality   | 0-255 per sample          | 8 bits (1 byte)
audio sequence, CD quality      | 0-65,535 per sample       | 16 bits (2 bytes)
video image, black and white    | 0-1 per pixel             | 1 bit
video image, gray scale         | 0-255 per pixel           | 8 bits (1 byte)
video image, “true” color       | 0-16,777,215 per pixel    | 24 bits (3 bytes)

As Table 1 shows, the cost of high quality audio and video information is high bitrate. High quality audio and video information consumes large amounts of computer storage and transmission capacity.
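As a rough illustration of how the per-value costs in Table 1 add up, the following Python sketch multiplies them out. The sampling rates (8,000 samples per second for voice, 44,100 samples per second in two channels for CD audio) and the 640×480 image size are assumptions chosen for illustration and are not given in the text; the function names are hypothetical.

```python
def audio_bitrate(bits_per_sample, samples_per_second, channels=1):
    """Uncompressed audio bitrate in bits per second."""
    return bits_per_sample * samples_per_second * channels


def image_size_bits(bits_per_pixel, width, height):
    """Uncompressed size of one image in bits."""
    return bits_per_pixel * width * height


# Assumed sampling rates and image size; only the bits-per-value costs come from Table 1.
voice = audio_bitrate(8, 8_000)
cd = audio_bitrate(16, 44_100, channels=2)
true_color = image_size_bits(24, 640, 480)

print(f"voice quality audio: {voice:,} bits/s")
print(f"CD quality audio:    {cd:,} bits/s")
print(f"true color image:    {true_color:,} bits ({true_color // 8:,} bytes)")
```

Under these assumptions, uncompressed CD quality audio costs roughly 22 times as much per second as voice quality audio.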
Compression (also called encoding or coding) decreases the cost of storing and transmitting audio and video information by converting the information into a lower bitrate form. Decompression (also called decoding) extracts a reconstructed version of the original information from the compressed form.
Quantization is a conventional compression technique. Quantization maps ranges of input values to single values. For example, with a quantization factor of 3, a sample with a value anywhere between −1.5 and 1.499999 is mapped to 0, a sample with a value anywhere between 1.5 and 4.499999 is mapped to 1, and so on.
To reconstruct the sample, the quantized value is multiplied by the quantization factor. After a value has been quantized, however, the original value cannot be precisely reconstructed. In essence, quantization decreases the quality of the signal in order to decrease the bitrate of the signal. Continuing the example started above, the quantized value 1 reconstructs to 1×3=3; it is impossible to determine where the original value was in the range 1.5 to 4.499999.
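The example above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the quantization factor of 3 is inferred from the reconstruction 1×3=3. Ties are rounded upward so that −1.5 maps to 0 and 1.5 maps to 1, as in the example.

```python
import math

STEP = 3  # quantization factor implied by the example (1 x 3 = 3)


def quantize(value, step=STEP):
    """Map a value to the index of the range it falls in, rounding halves upward."""
    return math.floor(value / step + 0.5)


def reconstruct(index, step=STEP):
    """Multiply the quantized value by the quantization factor."""
    return index * step


for value in (-1.5, 0.0, 1.499999, 1.5, 3.0, 4.499999):
    index = quantize(value)
    print(f"{value:>9} -> quantized {index} -> reconstructed {reconstruct(index)}")
```

Every input between 1.5 and 4.499999 comes back as 3, which is why the original position within the range cannot be recovered.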
Several factors affect quantization. For a continuous, analog signal, a dynamic range sets the boundaries of the quantization. Suppose the range of an analog signal is infinite but most samples are close to zero. The dynamic range of the quantization focuses the quantization on the range most likely to yield real information, for example, around zero. For a signal already in numerical form, the dynamic range is bounded by the lowest and highest possible values.
Within the dynamic range, the number of quantization levels affects how closely the quantized signal tracks the input signal. For example, if a dynamic range has 64 quantization levels, each sample is assigned to one of 64 values. Increasing the number of quantization levels in the same dynamic range increases precision and decreases distortion, but also increases bitrate. Quantization step size Q is a related factor that measures the distance between reconstructed values.
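The trade-off between the number of quantization levels, step size, and bitrate can be illustrated with a short Python sketch. It assumes evenly spaced reconstructed values that span the dynamic range from end to end and a fixed-length code per level; neither assumption comes from the text, and the function names are hypothetical.

```python
import math


def bits_per_sample(levels):
    """Bits needed to give each quantization level its own fixed-length code."""
    return math.ceil(math.log2(levels))


def step_size(low, high, levels):
    """Distance between neighboring reconstructed values (assumes even spacing from low to high)."""
    return (high - low) / (levels - 1)


# Same dynamic range [-1, 1], increasing numbers of levels.
for levels in (64, 128, 256):
    q = step_size(-1.0, 1.0, levels)
    print(f"{levels:>3} levels: {bits_per_sample(levels)} bits/sample, step size Q = {q:.5f}")
```

Each doubling of the number of levels roughly halves Q and adds one bit per sample.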
There are many different kinds of quantization. In uniform, scalar quantization, each single sample in a signal is quantized by the same step size Q to produce a quantized value. For example, a uniform scalar quantizer maps a set of real numbers {u} into an integer set {−M/2, ..., −1, 0, 1, ..., M/2}, where M is the dynamic range of the quantizer and Q is the real number quantization step size. The quantizer produces quantized output according to the following equation:
$$q(u) = \operatorname{round}\left(\frac{\min\left(\max\left(u,\, -Q M/2\right),\, Q M/2\right)}{Q}\right), \qquad (1)$$
where round is a function for rounding to the closest integer, and the min and max functions set a number outside of the dynamic range to a range boundary value. Other quantization formulas follow different conventions.
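Equation (1) can be sketched directly in Python. The text only says "closest integer", so the tie-breaking rule here (halves round upward, matching the earlier −1.5 to 1.499999 example) is an assumption, as is the function name.

```python
import math


def quantize_uniform(u, Q, M):
    """Uniform scalar quantizer following equation (1).

    u: input value, Q: step size, M: dynamic range of the quantizer.
    """
    bound = Q * M / 2
    # min/max pull values outside the dynamic range back to the boundary.
    clamped = min(max(u, -bound), bound)
    # Round to the closest integer; ties go upward (an assumed convention).
    return math.floor(clamped / Q + 0.5)


Q, M = 3, 8
for u in (-100.0, -2.9, 0.4, 7.2, 100.0):
    print(f"u = {u:>6}: q(u) = {quantize_uniform(u, Q, M)}")
```

With Q = 3 and M = 8, inputs beyond ±12 are clamped, so the output always stays within the integer set {−4, ..., 4}.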
The difference between an input value for a sample and its reconstructed value is quantization error. If the input value falls within the dynamic range of the quantizer, quantization error for a sample is no more than Q/2. The larger the quantization step size Q, the greater the potential quantization error. The distortion D is a measure of quantization error for the entire signal, and can be calculated as the square of the differences between the original values and the reconstructed values.
$$D = \left(u - q(u)\,Q\right)^2. \qquad (2)$$
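The following Python sketch computes quantization error and distortion for a short list of samples. Summing the squared differences over all samples is an assumption about how equation (2) extends to "the entire signal"; the quantizer repeats equation (1) with an assumed round-half-up tie rule, and all names are hypothetical.

```python
import math


def quantize_uniform(u, Q, M):
    """Equation (1): clamp to the dynamic range, divide by Q, round to the closest integer."""
    bound = Q * M / 2
    return math.floor(min(max(u, -bound), bound) / Q + 0.5)


def quantization_error(u, Q, M):
    """Difference between an input value and its reconstructed value q(u) * Q."""
    return u - quantize_uniform(u, Q, M) * Q


def distortion(samples, Q, M):
    """Squared quantization error, summed over the samples (the summation is an assumption)."""
    return sum(quantization_error(u, Q, M) ** 2 for u in samples)


samples = [0.4, 1.6, -2.9, 7.0]
Q, M = 3, 8
print("largest error:", max(abs(quantization_error(u, Q, M)) for u in samples))
print("distortion D: ", distortion(samples, Q, M))
```

All four samples lie inside the dynamic range, so each individual error stays at or below Q/2 = 1.5.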
Aside from uniform, scalar quantization, other quantization techniques include non-uniform quantization and vector quantization. Quantization can be non-adaptive or adaptive. For more information about quantization and the factors affecting the results of quantization, see Gibson et al., Digital Compression for Multimedia, “Chapter 4: Quantization,” Morgan Kaufmann Publishers, Inc., pp. 113-138 (1998).
Quantization helps a compressor reduce the bitrate of audio or video information at some cost to quality. The compressor can use various techniques to provide the best possible quality for a given bitrate, as measured by lowest objective or subjective distortion. These techniques include rate control, transform coding, and masking.
With rate control, a compressor adjusts quantization based upon a rate-distortion function that relates distortion (and hence quantization) to bitrate. The compressor dynamically adjusts quantization to utilize available bitrate.
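One simple way to picture rate control is a loop that coarsens the quantization step size until the estimated coded size fits the available bits. The Python sketch below is a toy illustration under stated assumptions, not the rate-distortion method the text refers to: the empirical-entropy size estimate, the growth factor, and all function names are hypothetical.

```python
import math
from collections import Counter


def quantize(samples, step):
    """Uniform quantization of each sample with the given step size."""
    return [math.floor(s / step + 0.5) for s in samples]


def estimated_bits(indices):
    """Empirical entropy of the quantized values times their count (a rough size estimate)."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c * math.log2(c / n) for c in counts.values())


def choose_step(samples, bit_budget, step=1.0, growth=1.25, max_iters=100):
    """Grow the step size until the estimated size fits the bit budget."""
    for _ in range(max_iters):
        if estimated_bits(quantize(samples, step)) <= bit_budget:
            return step
        step *= growth  # coarser quantization: fewer bits, more distortion
    return step


samples = [100 * math.sin(0.37 * i) for i in range(200)]
step = choose_step(samples, bit_budget=400)
print(f"step size {step:.3f} -> about {estimated_bits(quantize(samples, step)):.0f} bits")
```

A tighter bit budget forces a larger step size and therefore more distortion, which is the trade-off a rate-distortion function captures.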
Transform coding techniques convert data into a form that makes it easier to separate perceptually important information from perceptually unimportant information. The less important information can then be quantized heavily, while the more important information is largely preserved, so as to provide the best quality for a given bitrate. Transform coding techniques typically convert data to the frequency (or spectral) domain. For example, a transform coder converts a time series of audio samples into frequency coefficients, or, for video, a transform coder converts pixel data into frequency coefficients. In the frequency domain, low frequency data has greater perceptual importance than high frequency data. Transform coding techniques include discrete cosine transform (“DCT”), modulated lapped transform (“MLT”), Fourier transform, subband coding, and wavelets. In practice, input to transform coding techniques is partitioned into blocks, and each block is transform coded. Blocks may or may not overlap. For more information about transform coding, see Gibson et al., Digital Compression for Multimedia, “Chapter 7: Frequency Domain Coding,” Morgan Kaufmann Publishers, Inc., pp. 227-262 (1998).
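As a minimal sketch of block-based transform coding, the following Python code partitions samples into non-overlapping blocks and applies an orthonormal DCT-II to each. The choice of DCT-II, the block size, and the function names are assumptions for illustration; overlapping transforms such as the MLT are not shown.

```python
import math


def dct_ii(block):
    """Orthonormal DCT-II of one block of samples."""
    n = len(block)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        coeffs.append(scale * total)
    return coeffs


def transform_blocks(samples, block_size):
    """Split samples into non-overlapping blocks and transform each one.

    Trailing samples that do not fill a whole block are dropped in this sketch.
    """
    last_start = len(samples) - block_size
    return [dct_ii(samples[s:s + block_size]) for s in range(0, last_start + 1, block_size)]


# A slowly varying signal: most of the energy lands in the low-frequency coefficients.
samples = [10 + 0.5 * i for i in range(16)]
for coeffs in transform_blocks(samples, block_size=8):
    print([round(c, 3) for c in coeffs])
```

For a smooth input like this one, the first coefficient of each block dominates, which is what makes the higher-frequency coefficients cheap to quantize heavily.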
Masking involves processing spectral data to emphasize perceptually important spectral data, and is typically done prior to quantization. This makes the perceptually important spectral data more robust to the subsequent quantization. Masking itself typically involves selective quantization, applying different levels of quantization to different ranges of spectral data, or can be performed as part of non-uniform or vector quantization.
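Selective quantization of spectral data can be sketched as applying a different step size to each range of coefficients. In the Python sketch below, the band boundaries, step sizes, and function names are all hypothetical; a real coder would derive them from a perceptual model, which is not shown.

```python
import math


def quantize_bands(coeffs, bands):
    """Quantize spectral coefficients with a separate step size per band.

    bands: list of (start, end, step) tuples covering index ranges [start, end).
    """
    indices = [0] * len(coeffs)
    for start, end, step in bands:
        for i in range(start, end):
            indices[i] = math.floor(coeffs[i] / step + 0.5)
    return indices


def reconstruct_bands(indices, bands):
    """Multiply each quantized value by the step size of its band."""
    values = [0.0] * len(indices)
    for start, end, step in bands:
        for i in range(start, end):
            values[i] = indices[i] * step
    return values


coeffs = [10.2, 9.7, 0.9, 0.4]
# Assumed bands: fine steps for the first two coefficients, coarse steps for the rest.
bands = [(0, 2, 0.5), (2, 4, 4.0)]
indices = quantize_bands(coeffs, bands)
print("quantized:    ", indices)
print("reconstructed:", reconstruct_bands(indices, bands))
```

The finely quantized band keeps its coefficients to within 0.25, while the coarsely quantized band is reduced to zeros.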
Compression decreases the bitrate of audio and video information, which reduces storage and transmission costs. Different end users have different storage and transmission capacities, however, as well as different quality requirements. Thus, for example, a Web site operator would like to be able to stream an audio clip previously compressed to 128 kilobits/second (“Kb/s”) to certain end users at 64 Kb/s. A particular end user might then recompress the 64 Kb/s audio clip to 32 Kb/s to save local storage space. In add
