DATA PROCESSING APPARATUS FOR PROCESSING SOUND DATA, A DATA...

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S201000

active

06772113

ABSTRACT:

TITLE OF THE INVENTION
A data processing apparatus, a data processing method, a program providing medium, and a recording medium
BACKGROUND OF THE INVENTION
The present invention relates to a data processing apparatus and a data processing method for dealing with sound data, a program providing medium for providing a program for dealing with sound data, and a recording medium in which sound data is recorded.
In recent years, owing to developments in high-efficiency encoding techniques, it has become common to compress and encode sound data for storage. A method is therefore needed for efficiently retrieving desired sound data from among a large number of encoded sound data pieces.
FIG. 1 shows a functional structure of a conventional sound data retrieving apparatus. Sound data (hereinafter called encoded sound data) which has been subjected to predetermined compression encoding processing, and a retrieving text database which describes attribute information associated with the encoded sound data (e.g., title, creator's name, creation date, classification of the content, and the like) are recorded in advance in the database 156 of this sound data retrieving apparatus.
The retrieving condition input section 151 receives a retrieving condition inputted by a user. For example, attribute information and a signal characteristic of a sample waveform are inputted as a retrieving condition. The retrieving condition input section 151 supplies the attribute retrieving section 152 with the attribute information (e.g., name of the creator and the like) inputted as a retrieving condition, and supplies the comparative determination section 155 with the signal characteristic (e.g., the waveform amplitude and the like) also inputted as a retrieving condition.
The attribute retrieving section 152 retrieves an item which matches the attribute information inputted through the retrieving condition input section 151 from the retrieving text database recorded in the database 156, and extracts encoded sound data corresponding to the item.
The candidate selection section 153 sequentially outputs the encoded sound data inputted from the attribute retrieving section 152 to the decoding section 154. The decoding section 154 decodes the encoded sound data inputted from the candidate selection section 153 and outputs the data to the comparative determination section 155.
The comparative determination section 155 obtains a level of similarity between the sound data inputted from the decoding section 154 and the signal characteristic of the sample waveform supplied from the retrieving condition input section 151. If the similarity is equal to or greater than a predetermined threshold value, the section 155 outputs the sound data as a retrieving result. To obtain the similarity, for example, correlation factors concerning waveform amplitudes, amplitude average values, power distributions, frequency spectrums, and the like are calculated with respect to the sample waveform and the sound data as a target to be retrieved.
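The comparison step described above can be sketched in Python as follows. This is only an illustration under assumptions: it uses the Pearson correlation of waveform amplitudes as the similarity measure, a threshold of 0.9, and the hypothetical names `correlation` and `retrieve`, none of which are specified in the text.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length amplitude sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant waveform carries no shape to compare
    return cov / math.sqrt(var_a * var_b)

def retrieve(sample, decoded_candidates, threshold=0.9):
    """Return the names of decoded candidates whose similarity to the
    sample waveform is equal to or greater than the threshold."""
    return [name for name, wave in decoded_candidates.items()
            if correlation(sample, wave) >= threshold]

# Hypothetical data: a sample tone and two already-decoded candidates.
sample = [math.sin(2 * math.pi * k / 16) for k in range(64)]
decoded_candidates = {
    "same_tone_quieter": [0.5 * v for v in sample],
    "different_tone": [math.sin(2 * math.pi * k / 5) for k in range(64)],
}
hits = retrieve(sample, decoded_candidates)
```

Note that every candidate must be decoded before it can be compared, which is the cost the conventional apparatus incurs for each retrieval.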
Next, an encoding apparatus which generates the encoded sound data recorded in advance in the database 156 shown in FIG. 1 will be explained. Prior to explanation of the structure of the encoding apparatus, a method of efficiently compressing/encoding sound data will be explained.
Methods of efficiently compressing/encoding sound data can be roughly classified into a band division encoding system and a conversion encoding system. There are also systems which combine the two.
In the band division encoding system, a discrete-time waveform signal (e.g., sound data) is divided into a plurality of frequency bands by a band division filter such as a quadrature mirror filter (QMF) or the like, and optimal encoding is performed on each of the bands. This system is also called a sub-band encoding system. Details of the quadrature mirror filter are described in, for example, P. L. Chu, “Quadrature mirror filter design for an arbitrary number of equal bandwidth channels”, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 203-218, February 1985.
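As a rough illustration of band division, the following Python sketch splits a signal into two half-rate bands and rebuilds it. It assumes the two-tap Haar pair, which is the simplest filter pair that satisfies the quadrature mirror relation; practical coders use longer filters and more bands. The function names and the even-length input are assumptions made for this example.

```python
import math

def qmf_analysis(x):
    """Split an even-length signal into low and high bands, each decimated by 2."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high

def qmf_synthesis(low, high):
    """Rebuild the full-rate signal from the two bands."""
    s = math.sqrt(2.0)
    x = []
    for l, h in zip(low, high):
        x.append((l + h) / s)
        x.append((l - h) / s)
    return x

# A slowly varying tone: most of its energy should land in the low band.
signal = [math.sin(2 * math.pi * n / 32) for n in range(64)]
low, high = qmf_analysis(signal)
rebuilt = qmf_synthesis(low, high)
low_energy = sum(v * v for v in low)
high_energy = sum(v * v for v in high)
```

Because each band can then be quantized with its own precision, a band that carries little energy, like the high band here, can be given fewer bits.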
The conversion encoding system is also called a block encoding system, in which a discrete-time waveform signal is divided into blocks each consisting of a predetermined number of samples, and the signal of each block (called a frame in some cases) is converted into frequency spectrums and is thereafter encoded. Methods for converting the signal into frequency spectrums include, for example, DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform), MDCT (Modified Discrete Cosine Transform), and the like. In the MDCT, the conversion sections of blocks adjacent on the time axis overlap each other, and thus efficient conversion can be achieved with less block distortion. The details are described in, for example, J. P. Princen and A. B. Bradley, “Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation”, IEEE Transactions, ASSP-34, No. 5, October 1986, pp. 1153-1161, and J. P. Princen, A. W. Johnson and A. B. Bradley, “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation” (ICASSP 1987).
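The overlap between adjacent MDCT blocks can be illustrated with a short Python sketch. It assumes a sine window, blocks of 2N samples advanced by N samples at a time, and the hypothetical helper names `sine_window`, `mdct`, `imdct`, and `overlap_add`; it uses direct sums for clarity rather than a fast algorithm.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _basis(n, k, N):
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct(block, window):
    """Convert a windowed block of 2N samples into N spectral coefficients."""
    N = len(block) // 2
    return [sum(window[n] * block[n] * _basis(n, k, N) for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Convert N coefficients back into 2N windowed, time-aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) * sum(coeffs[k] * _basis(n, k, N) for k in range(N))
            for n in range(2 * N)]

def overlap_add(signal, N):
    """Transform overlapping blocks and add the inverse transforms together."""
    window = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        block = signal[start:start + 2 * N]
        for n, value in enumerate(imdct(mdct(block, window), window)):
            out[start + n] += value
    return out

N = 8
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(5 * N)]
rebuilt = overlap_add(signal, N)
# Only samples covered by two blocks are fully reconstructed.
max_error = max(abs(a - b) for a, b in zip(signal[N:-N], rebuilt[N:-N]))
```

Each block yields only N coefficients for 2N input samples, so the overlap adds no extra data; the aliasing in one block is cancelled by its neighbour when the halves are added.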
The signal which is divided into frequency bands in the case of the band division encoding system, or which is converted into frequency spectrums in the case of the conversion encoding system, is quantized and then encoded. In this manner, the bands in which quantization noise occurs can be restricted by making use of an auditory characteristic such as the masking effect. In addition, by normalizing each signal before the quantization, more efficient encoding can be carried out.
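Normalization before quantization can be sketched as follows. This Python example assumes a simple scheme in which each band is scaled by its own peak amplitude and then uniformly quantized; the function names and the choice of peak amplitude as the scale factor are assumptions for illustration.

```python
def quantize_band(samples, bits):
    """Normalize a band by its peak amplitude, then quantize uniformly."""
    scale = max(abs(s) for s in samples) or 1.0  # avoid dividing by zero
    levels = 2 ** (bits - 1) - 1                 # symmetric signed range
    codes = [round(s / scale * levels) for s in samples]
    return scale, codes

def dequantize_band(scale, codes, bits):
    """Undo the quantization and restore the original amplitude range."""
    levels = 2 ** (bits - 1) - 1
    return [c / levels * scale for c in codes]

# A loud band and a quiet band, both quantized with 4 bits.
loud = [0.9, -0.4, 0.2, -0.8]
quiet = [0.009, -0.004, 0.002, -0.008]
bits = 4
loud_scale, loud_codes = quantize_band(loud, bits)
quiet_scale, quiet_codes = quantize_band(quiet, bits)
quiet_rebuilt = dequantize_band(quiet_scale, quiet_codes, bits)
```

With the scale factor sent alongside the codes, the quiet band uses the full range of quantizer levels, so its error is proportional to its own amplitude rather than to that of the loudest band.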
For example, if quantization is carried out in the band division encoding system, the signal should desirably be divided into bandwidths which are called critical bands.
Bit allocation is performed on each signal thus divided by frequency band, and the signal is then encoded. For example, if bit allocation is dynamically carried out based on the absolute value of the amplitude of the signal for each band, the quantization noise spectrum is flattened so that the noise energy is minimized. Note that this method is described in, for example, R. Zelinski and P. Noll, “Adaptive Transform Coding of Speech Signals”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977. However, this method is not the most preferable in auditory terms, since the masking effect is not used.
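The flattening of the quantization noise spectrum can be illustrated with the classic allocation rule that gives each band the average bit count plus half the base-2 logarithm of its energy relative to the geometric mean. The Python sketch below assumes band energies (squared amplitudes) as input and allows fractional and unclamped bit counts, which a practical coder would round and limit; the function names are hypothetical.

```python
import math

def dynamic_allocation(band_energies, total_bits):
    """Allocate bits so that every band ends up with the same noise energy."""
    n = len(band_energies)
    log_energy = [math.log2(e) for e in band_energies]
    mean_log = sum(log_energy) / n
    return [total_bits / n + 0.5 * (le - mean_log) for le in log_energy]

def noise_energy(energy, bits):
    """Relative quantization noise: each extra bit lowers it by a factor of 4."""
    return energy * 2.0 ** (-2.0 * bits)

band_energies = [16.0, 4.0, 1.0, 0.25]
allocation = dynamic_allocation(band_energies, total_bits=16)
noise = [noise_energy(e, b) for e, b in zip(band_energies, allocation)]
```

For these energies the allocation is 5.5, 4.5, 3.5 and 2.5 bits, and the resulting noise energy is the same in every band, which is the flat noise spectrum described above.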
In addition, if fixed bit allocation is carried out such that an excellent S/N ratio is obtained for every band, for example, a masking effect can be obtained auditorily. However, in cases where the characteristic of a sine wave is measured, there is a problem that an excellent characteristic value cannot be obtained since the bit allocation is fixed. Note that this method is described in, for example, M. A. Krasner, “The critical band coder: digital encoding of the perceptual requirements of the auditory system”, MIT (ICASSP 1980).
To solve these problems, a method has been proposed in which all the bits that can be used for bit allocation are divided between dynamic allocation and fixed allocation, and the division ratio is made dependent on the input signal such that, for example, the share of the fixed allocation is greater as the spectral distribution of the input signal is smoother, thus achieving efficient encoding.
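A signal-dependent split of the bit budget might look like the following Python sketch. The text does not say how smoothness is measured or how it maps to the division ratio, so this example assumes the spectral flatness measure (geometric mean over arithmetic mean of band energies) and uses it directly as the fixed share; both choices, and the function names, are illustrative assumptions.

```python
import math

def spectral_flatness(band_energies):
    """Ratio of geometric to arithmetic mean: 1.0 for a flat spectrum,
    approaching 0.0 for a spectrum dominated by a few bands."""
    n = len(band_energies)
    geometric = math.exp(sum(math.log(e) for e in band_energies) / n)
    arithmetic = sum(band_energies) / n
    return geometric / arithmetic

def split_bit_budget(band_energies, total_bits):
    """Return (fixed_bits, dynamic_bits); a smoother spectrum gets more fixed bits."""
    fixed_share = spectral_flatness(band_energies)  # assumed mapping
    fixed_bits = total_bits * fixed_share
    return fixed_bits, total_bits - fixed_bits

smooth_spectrum = [1.0, 1.1, 0.9, 1.0]
peaky_spectrum = [100.0, 0.1, 0.1, 0.1]
smooth_fixed, smooth_dynamic = split_bit_budget(smooth_spectrum, 64)
peaky_fixed, peaky_dynamic = split_bit_budget(peaky_spectrum, 64)
```

With these inputs the smooth spectrum sends nearly the whole budget to fixed allocation, while the peaky spectrum, such as a single sine wave would produce, sends most of it to dynamic allocation.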
Meanwhile, in quantization and encoding of sound signals, quantization errors increase in a waveform that includes a sharp change point of amplitude (hereinafter called an attack), at which the amplitude sharply increases or decreases within a part of the sound waveform. Also, in a signal encoded by the conversion encoding system, quantization errors of spectral coefficients at the attack spread over the entire block in the time domain during inverse spectral conversion (decoding). Due to influences thereof, auditorily harsh noise called a pre-echo is generated immediately before or after a sharp increase po
