Method of encoding digital data

Data processing: speech signal processing – linguistics – language – Speech signal processing – Psychoacoustic

Reexamination Certificate

active

06370499

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to a method of encoding digital data in which, when musical tones, sounds, etc. are recorded on recording media such as mini-discs, bits are allocated to the spectrum of each frequency band in response to the musical tones, sounds, etc. so as to compress the data volume.
BACKGROUND OF THE INVENTION
One method of highly efficient compressed encoding of digital data such as musical tones and sounds is ATRAC (Adaptive Transform Acoustic Coding), used in mini-discs. In ATRAC, in order to compress the digital data with high efficiency, the data is first broken down into a plurality of frequency bands, then divided into blocks in accordance with time units of variable length, and transformed into spectral signals by MDCT (Modified Discrete Cosine Transform) processing. Each spectral signal is then encoded with the number of quantized bits allocated to it, taking into account aural-psychological characteristics.
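As a rough illustration of the transform step only, the following sketch evaluates the textbook MDCT definition directly. It is not ATRAC's actual implementation: band splitting, variable block lengths, and any windowing are omitted, and the function name `mdct` is chosen here for illustration.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: 2N time samples in, N spectral coefficients out.

    Uses the standard definition
    X[k] = sum_n x[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5)).
    """
    two_n = len(block)
    if two_n == 0 or two_n % 2:
        raise ValueError("block length must be a positive even number")
    n_half = two_n // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]

# Example: an 8-sample block yields 4 spectral coefficients.
coefficients = mdct([0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5])
```

A practical encoder would use a fast algorithm and overlapping windowed blocks; the direct sum is used here only because it mirrors the definition.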
Among the aural-psychological characteristics which can be applied to compressed encoding are loudness-level characteristics and the masking effect. Loudness-level characteristics show that, even at the same sound pressure level, the loudness of a sound sensed by a person changes according to the frequency of the sound. It follows that the minimum limit of audibility, i.e. the smallest loudness which can be heard by a person, also changes according to the frequency. There are two kinds of masking effect: simultaneous masking and elapsed masking. Simultaneous masking effect is a phenomenon in which, when several sounds of different frequency composition occur simultaneously, one sound makes another difficult to hear. Elapsed masking effect is a phenomenon in which masking occurs along the time axis before and after a loud sound.
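The frequency dependence of the minimum limit of audibility can be illustrated with a widely used closed-form approximation of the threshold in quiet (often attributed to Terhardt). This formula is not taken from the present text; it is assumed here purely as an example, and the function name is hypothetical.

```python
import math

def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold in quiet in dB SPL (assumed textbook approximation)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (
        3.64 * f ** -0.8
        - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
        + 1e-3 * f ** 4
    )

# The threshold is high at very low and very high frequencies
# and lowest in the mid range, around 3 to 4 kHz.
for hz in (50, 1000, 3500, 16000):
    print(f"{hz:>6} Hz: {threshold_in_quiet_db(hz):6.1f} dB SPL")
```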
An example of conventional art which makes use of the elapsed masking effect is Japanese Unexamined Patent Publication No. 5-91061/1993. In this conventional art, when a transient signal is included in one of the frequency conversion time units, bits are allocated in accordance with a word length which varies depending on the energy of previous time units and on the amount of masking, thereby preventing a sound quality deterioration called “pre-echo.” Similarly, Japanese Unexamined Patent Publication No. 5-248972/1993 proposes a technique for improving the efficiency of encoding by using elapsed masking in reference to the spectral distribution of previous time units.
Another example of bit allocation using the aural-psychological characteristics is one called the repetition method, in which actual bit allocation suited to input digital data is performed as follows. First, the power S of each frequency band, and the masking threshold M of that power S on the other frequency bands, are found. Next, from the masking threshold M and the power of quantized noise N(n) (when each frequency band is quantized into n bits), the ratio of masking threshold to noise, MNR(n)=M/N(n), is calculated. Then bits are allocated to the frequency band with the smallest ratio MNR(n), that ratio MNR(n) is recalculated, and bits are again allocated to the frequency band which now has the lowest ratio, this step being repeated.
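The repetition method described above can be sketched as a greedy loop. This is a minimal illustration, not the procedure of any particular encoder. It assumes that each added bit lowers the quantization noise power by about 6.02 dB, i.e. N(n) = S * 2^(-2n), that bits are handed out one per band per step, and that each band has a fixed upper limit; the names `mnr_db`, `allocate_bits`, and `max_bits_per_band` are hypothetical.

```python
import math

# Assumed noise model: each extra bit lowers quantization noise by ~6.02 dB.
DB_PER_BIT = 20 * math.log10(2)

def mnr_db(power_db, mask_db, bits):
    """MNR(n) = M / N(n) in dB, with N(n) modelled as S reduced by ~6.02 dB per bit."""
    noise_db = power_db - DB_PER_BIT * bits
    return mask_db - noise_db

def allocate_bits(power_db, mask_db, total_bits, max_bits_per_band=16):
    """Repeatedly give one bit to the band whose MNR is currently lowest."""
    bits = [0] * len(power_db)
    for _ in range(total_bits):
        candidates = [b for b in range(len(bits)) if bits[b] < max_bits_per_band]
        if not candidates:
            break  # every band has reached its assumed upper limit
        worst = min(candidates, key=lambda b: mnr_db(power_db[b], mask_db[b], bits[b]))
        bits[worst] += 1  # MNR of this band is recalculated on the next pass
    return bits

# Three bands with powers S and masking thresholds M given in dB.
allocation = allocate_bits(power_db=[60.0, 40.0, 20.0],
                           mask_db=[30.0, 30.0, 30.0],
                           total_bits=8)
```

In this example the band whose power lies furthest above its masking threshold receives most of the bits, and the band already below its threshold receives none.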
Note that the aural characteristics of persons with typical aural characteristics are the model for the minimum limit of audibility, masking threshold, etc. mentioned above. Accordingly, there are cases where listeners will feel a sense of incongruity due to differences in hearing or preference.
For example, in cases where the spectral composition of the input digital data is comparatively flat, like white noise, bit allocation will be made with the masking threshold at the minimum limit of audibility, so most of the quantized bits will be allocated to the mid- to low-range. Accordingly, depending on the size of the spectral composition, quantized bits may not be allocated to the ultra-low and ultra-high ranges, giving some listeners a sense of incongruity.
Again, when the input digital data is a composite wave composed of a signal with a narrow spectrum band (such as a sine wave signal) and white noise, the frequency bands f1 which include the sine wave signal will have more power, but as for frequency bands f2 which are far from the frequency bands f1, the farther from the frequency bands f1, the greater the drop in power. Accordingly, there will be almost no masking from the sine wave signal at a frequency band f2, and the influence of masking from the power of the frequency band f2 itself is increased. Because of this, there will be no great difference between the ratio of signal to masking threshold (SMR: the ratio of a frequency band's own power S to masking threshold M) at the frequency bands f1 and the same ratio SMR at the frequency bands f2.
In other words, if the power of a signal is S, and the power of quantized noise is N(n) when each frequency band is quantized into n bits, then, based on the relative relationship between the two, the ratio of masking threshold to noise MNR(n)=M/N(n)=(S/N(n))/(S/M) will be approximately the same value at the frequency bands f1 and f2. Accordingly, since the conventional adaptive bit allocation methods perform bit allocation based only on the ratio of masking threshold to noise MNR(n), their drawback is that approximately the same number of bits are allocated to the frequency bands f1 and f2.
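The relation above can be checked numerically. The sketch below assumes the common noise model N(n) = S * 2^(-2n), which is not stated in the text, and uses made-up power values; the function name `mnr` is hypothetical. It shows that two bands with very different powers but the same SMR end up with the same MNR(n).

```python
import math

def mnr(signal_power, mask_power, bits):
    """MNR(n) computed as (S / N(n)) / (S / M), which equals M / N(n)."""
    noise_power = signal_power / 4 ** bits  # assumed model: N(n) = S * 2^(-2n)
    snr = signal_power / noise_power
    smr = signal_power / mask_power
    return snr / smr

# Band f1 (contains the sine wave): high power, SMR = 100 (20 dB).
mnr_f1 = mnr(signal_power=1e6, mask_power=1e4, bits=3)
# Band f2 (white noise only): low power, but the same SMR = 100 (20 dB).
mnr_f2 = mnr(signal_power=1e2, mask_power=1.0, bits=3)

print(math.isclose(mnr_f1, mnr_f2))
```

Because the power S cancels out, an allocator that looks only at MNR(n) cannot tell the strong tonal band from the weak noise band.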
As a result, if there are many frequency bands f2 which are not influenced by the masking from the sine wave signal, the number of bits allocated to the frequency bands f1 which include the sine wave signal becomes relatively smaller, the quantization error of the sine wave signal becomes greater, and sound quality deteriorates.
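The dilution effect can be made concrete with a small calculation. It assumes an idealized case that goes slightly beyond the text: every band has exactly the same SMR, bits are handed out one at a time to the band with the lowest MNR, and ties are resolved in favor of the tonal band. Under those assumptions the bit budget is shared round-robin. The function name is hypothetical.

```python
def bits_for_tonal_band(total_bits, num_noise_bands):
    """Bits reaching the single tonal band f1 when all bands share the budget round-robin."""
    bands = 1 + num_noise_bands
    share, remainder = divmod(total_bits, bands)
    # The tonal band is served first in each round, so it gets one of any leftover bits.
    return share + (1 if remainder > 0 else 0)

# With a fixed budget of 32 bits, more noise bands f2 leave fewer bits for f1.
for noise_bands in (0, 3, 15):
    print(noise_bands, bits_for_tonal_band(32, noise_bands))
```

Fewer bits for the tonal band mean coarser quantization of the sine wave, which is the deterioration described above.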
In regard to this point, the present Applicant has proposed, in Japanese Unexamined Patent Publication 7-202823/1995, a structure which automatically limits the number of bits which may be allocated to frequency bands with low power S. However, a drawback of this conventional art is that, since the maximum number of bits which may be allocated to each frequency band is determined on the basis of its power, when the power of white noise is large, there are cases when no limitation on bit allocation to that frequency band is made.
SUMMARY OF THE INVENTION
One object of the present invention is to provide a method of encoding digital data capable of attaining a sound quality which accords with the listener's hearing.
Another object of the present invention is to provide a method of encoding digital data capable of preventing deterioration of sound quality even of signals with narrow spectrum bands.
In order to realize the first object mentioned above, the first method of encoding digital data of the present invention encodes digital data such as musical tones and sounds by converting it into frequency domains, dividing the converted spectra into a plurality of frequency bands, changing a minimum limit of audibility characteristic so as to set a masking threshold, and allocating quantized bits for each frequency band in accordance with ratios of masking threshold to noise which are found for each frequency band in accordance with power or energy of each frequency band in consideration of aural-psychological characteristics.
The above structure, by enabling change of the minimum limit of audibility characteristic among aural-psychological characteristics, frees aural-psychological characteristics from definition by the characteristics of persons with typical hearing, and makes possible selection of whether or not to allocate bits to spectra with small inaudible domains, or spectra with ultra-low or ultra-high domains. Accordingly, it becomes possible to respond to persons with superior hearing or to individual, subjective preference, and sound quality which accords with listeners' hearing can be attained.
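One way to picture the effect of changing the minimum limit of audibility characteristic is the sketch below. It is only an assumption about how such a change could be modelled: the characteristic is shifted by a uniform offset in dB, and the masking threshold of each band is taken as the larger of the signal-dependent masking level and the shifted minimum limit of audibility. The function names, the offset parameter, and the sample values are all hypothetical.

```python
def effective_mask_db(masking_db, quiet_db, quiet_offset_db=0.0):
    """Per-band masking threshold after shifting the minimum limit of audibility."""
    return [max(m, q + quiet_offset_db) for m, q in zip(masking_db, quiet_db)]

def bands_needing_bits(power_db, masking_db, quiet_db, quiet_offset_db=0.0):
    """True for each band whose power exceeds its effective masking threshold."""
    mask = effective_mask_db(masking_db, quiet_db, quiet_offset_db)
    return [s > m for s, m in zip(power_db, mask)]

power_db = [35.0, 50.0, 10.0]    # ultra-low, mid, ultra-high band (made-up values)
masking_db = [20.0, 30.0, 0.0]   # masking from the signal itself
quiet_db = [40.0, 5.0, 15.0]     # minimum limit of audibility per band

typical = bands_needing_bits(power_db, masking_db, quiet_db)
lowered = bands_needing_bits(power_db, masking_db, quiet_db, quiet_offset_db=-10.0)
```

With the unshifted characteristic only the mid band qualifies for bits; lowering it by 10 dB also admits the ultra-low and ultra-high bands, which corresponds to the selection described above.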
Next, in order to realize the first object mentioned above, the second method of encoding dig
