Reexamination Certificate
2000-05-19
2003-02-11
Dorvil, Richemond (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S204000, C704S211000
active
06519558
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a signal processing method and apparatus in which a coded signal is decoded and its pitch is shifted, and an information-serving medium for serving a program which implements the signal decoding and pitch shifting.
2. Description of the Related Art
A technique is known for shifting the interval (pitch) of a sound signal by re-sampling the signal, recorded in pulse code-modulated (PCM) form, at intervals different from those at which it was sampled for pulse code modulation (PCM). For example, a sound one octave lower than an original sound signal can be reproduced by re-sampling at twice the original sampling rate within the same unit time, interpolating between the original sample values, and reproducing the resulting samples, which are twice as numerous as the original ones, as if they had been acquired at the original sampling rate. Conversely, a sound one octave higher can be reproduced by re-sampling so that the number of original sound signal samples is halved and reproducing each of the resulting samples at the original sampling rate. However, when a sound having a higher pitch than the original sound is reproduced (namely, when the pitch is raised), so-called aliasing will take place. To avoid this, the signal must be passed through a low-pass filter, for example, before it is re-sampled. In the above examples, some of the re-sampled samples coincide with original samples, but such coincidence is not required. Generally, by re-sampling the sound signal at an arbitrary rate while interpolating between samples, it is possible to shift the interval (namely, to control the pitch).
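The re-sampling with interpolation described above can be sketched as follows. This is a minimal illustration only: the function name `resample_linear`, the `ratio` parameter, and the choice of linear interpolation are assumptions made for the example and are not taken from the specification. No low-pass filter is applied, so ratios above 1 can alias as the text warns.

```python
def resample_linear(samples, ratio):
    """Read `samples` at steps of `ratio` input samples, interpolating linearly.

    ratio = 0.5 yields about twice as many samples (one octave lower when
    played back at the original rate); ratio = 2.0 yields about half as many
    (one octave higher, and prone to aliasing without a prior low-pass filter).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out = []
    last = len(samples) - 1
    n = 0
    while True:
        pos = n * ratio              # fractional read position in the input
        if pos > last:
            break
        i = int(pos)                 # index of the sample at or before pos
        frac = pos - i               # distance past that sample, in [0, 1)
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        n += 1
    return out


lower = resample_linear([0.0, 2.0, 4.0], 0.5)             # twice as dense
higher = resample_linear([0.0, 1.0, 2.0, 3.0, 4.0], 2.0)  # every other sample
```

Playing `lower` or `higher` back at the original sampling rate is what produces the pitch change; the function itself only changes the number of samples.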
On the other hand, highly efficient coding methods have been proposed that compress audio or sound data with little audible degradation in sound quality. An audio signal can be coded with high efficiency in various ways. One is so-called transform coding, a blocked frequency band division method in which an audio signal on a time base is blocked in predetermined time units, the time-base signal in each block is transformed (spectrum-transformed) to a signal on a frequency base, the signal thus acquired is divided into a plurality of frequency bands, and the signal in each band is coded. Another is so-called subband coding (SBC), a non-blocked frequency band division method in which an audio signal on a time base is divided into a plurality of frequency bands without blocking, and the signal in each subband is coded.
Subband coding (SBC) uses a subband filter such as a so-called quadrature mirror filter (QMF). The QMF is known from the publication “Digital Coding of Speech in Subbands” (R. E. Crochiere, Bell Syst. Tech. J., Vol. 55, No. 8, 1976). The QMF is characterized in that no aliasing takes place when two bands having the same bandwidth are recombined later. More specifically, the aliasing that takes place when the signal is halved for the band division and the aliasing that takes place when the half-band signals are recombined cancel each other. Therefore, if the signal of each subband is coded with sufficiently high accuracy, the QMF can eliminate almost perfectly the loss caused by the signal coding.
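The aliasing cancellation described above can be illustrated with a toy two-band split. The sketch below assumes the two-tap Haar pair, chosen only because it is the shortest filter pair satisfying the mirror relation h1[n] = (-1)^n h0[n]; it is not the filter from the cited publication, and the function names are hypothetical. Each band is kept at half the input rate, so each band alone is aliased, yet recombining the two bands restores the input.

```python
import math

S = 1.0 / math.sqrt(2.0)  # tap magnitude of the 2-tap Haar low/high-pass pair


def qmf_analyze(x):
    """Split x into low and high bands, each at half the input rate."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return low, high


def qmf_synthesize(low, high):
    """Recombine the two half-rate bands into a full-rate signal."""
    out = []
    for lo, hi in zip(low, high):
        out.append((lo + hi) * S)  # even-indexed output sample
        out.append((lo - hi) * S)  # odd-indexed output sample
    return out


x = [1.0, -2.0, 3.5, 0.25, -1.0, 4.0]
low, high = qmf_analyze(x)
y = qmf_synthesize(low, high)
max_error = max(abs(a - b) for a, b in zip(x, y))
```

Here `max_error` stays at floating-point rounding level because the bands are left unquantized; in a coder, the reconstruction error would come from quantizing `low` and `high`.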
Also, the publication “Polyphase Quadrature Filters - A New Subband Coding Technique” (Joseph H. Rothweiler, ICASSP 83, Boston) describes polyphase quadrature filters (PQF), which divide a signal into subbands of equal bandwidth. The PQF is characterized in that a signal can be divided into a plurality of equal-width subbands at one time and no aliasing takes place when the signals of the subbands are recombined later. More particularly, the aliasing that takes place between a signal thinned at the rate corresponding to each bandwidth and an adjoining subband, and the aliasing that takes place between adjoining subbands when they are recombined later, cancel each other. Therefore, if the signal of each subband is coded with sufficiently high accuracy, the PQF can eliminate almost perfectly the loss caused by the signal coding.
Further, the spectrum transform can be effected by blocking an input audio signal into predetermined unit times (frames) and transforming each block from a time base to a frequency base by the discrete Fourier transform (DFT), discrete cosine transform (DCT), modified discrete cosine transform (MDCT) or the like. The MDCT is further described in the publication “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation” (J. P. Princen, A. B. Bradley, Univ. of Surrey, Royal Melbourne Inst. of Tech., ICASSP, 1987).
When the DFT or DCT is used for spectrum transform of a waveform signal, M pieces of independent real data can be acquired by transforming the waveform signal in time blocks of M pieces of sample data each (hereinafter referred to as “transform blocks”). Normally, to reduce the connection distortion between transform blocks, M1 pieces of sample data in each transform block are arranged to overlap M1 pieces of sample data in the adjoining transform block. Thus, the DFT or DCT provides M pieces of real data for a mean number (M-M1) of sample data. These M pieces of real data are subsequently quantized and coded.
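The cost of the overlap can be made concrete with a little arithmetic: each block yields M values to code but advances the signal by only (M-M1) new samples. The helper below is a hypothetical illustration, and the block sizes used are example values, not figures from the specification.

```python
def coefficients_per_new_sample(M, M1):
    """Values to be coded per newly consumed input sample.

    M  : samples (and resulting real values) per transform block
    M1 : samples shared with the adjoining block
    """
    if not 0 <= M1 < M:
        raise ValueError("require 0 <= M1 < M")
    return M / (M - M1)


no_overlap = coefficients_per_new_sample(1024, 0)      # 1.0
half_overlap = coefficients_per_new_sample(1024, 512)  # 2.0
```

With no overlap the ratio is 1, and with a half-block overlap it doubles, which is the overhead the MDCT discussed next avoids.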
On the other hand, when the MDCT is used for spectrum transform, M pieces of independent real data can be acquired from 2M pieces of samples, of which the M pieces at each end overlap those of the adjoining transform block on that side. More specifically, when the MDCT is employed for the spectrum transform, M pieces of real data can be acquired from a mean number M of sample data, and the M pieces of real data will subsequently be quantized and coded. In the decoder, the waveform elements acquired by inverse-transforming each block of the MDCT codes are added together while interfering with each other to reconstruct the waveform signal.
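The overlap-and-add reconstruction described above can be sketched as follows. This is a direct, unoptimized sketch under stated assumptions: a sine window satisfying the Princen-Bradley condition, a 2/M scaling on the inverse transform, an even M, and zero padding of M samples at both ends. The function names are hypothetical and this is not the implementation of the specification.

```python
import math


def sine_window(M):
    # w[n]^2 + w[n + M]^2 == 1, as time-domain aliasing cancellation requires
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def mdct(block, M):
    """2M windowed samples -> M real coefficients."""
    return [sum(block[n] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]


def imdct(coeffs, M):
    """M coefficients -> 2M time-aliased samples (assumed 2/M scaling)."""
    return [(2.0 / M) * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                            for k in range(M))
            for n in range(2 * M)]


def mdct_roundtrip(signal, M):
    """Code and decode `signal` with 50%-overlapped blocks of 2M samples."""
    if M < 2 or M % 2 or len(signal) % M:
        raise ValueError("M must be even and divide the signal length")
    w = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M  # every sample gets two blocks
    out = [0.0] * len(padded)
    n_coeffs = 0
    for start in range(0, len(padded) - 2 * M + 1, M):  # hop of M samples
        block = [padded[start + n] * w[n] for n in range(2 * M)]
        coeffs = mdct(block, M)
        n_coeffs += len(coeffs)
        y = imdct(coeffs, M)
        for n in range(2 * M):
            out[start + n] += y[n] * w[n]  # overlap-add cancels the aliasing
    return out[M:M + len(signal)], n_coeffs


signal = [0.3, -1.2, 2.5, 0.0, 4.1, -0.7, 1.9, 3.3, -2.2, 0.6, 1.1, -0.4]
rebuilt, n_coeffs = mdct_roundtrip(signal, 4)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Each block of 2M samples produces only M coefficients, so apart from one extra block for the padding, the number of values to code matches the number of input samples.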
Generally, when a transform block intended for spectrum transform is made longer, the frequency resolution becomes higher and the energy concentrates in particular spectrum signal components. Therefore, by using the MDCT with long transform blocks, in which half of the sample data in each transform block overlaps half of the sample data in the adjoining transform block, so that the number of spectrum signal components acquired does not exceed the number of sample data on the original time base, it is possible to code an audio signal with higher efficiency than when the DFT or DCT is used for the same purpose. Also, by arranging adjoining transform blocks to overlap each other over a sufficiently large length, it is possible to reduce the connection distortion between transform blocks of a waveform signal. However, since long transform blocks require more work area for the transform, the increased block length is an obstacle to a more compact design of the reading means, etc. In particular, longer transform blocks increase manufacturing costs when it is difficult to raise the degree of semiconductor integration.
As mentioned above, quantizing signal components that have been divided into subbands by filtering or spectrum transform makes it possible to control the band in which quantization noise takes place. Therefore, by using the so-called masking effect, coding that is highly efficient in auditory terms can be attained.
The above-mentioned “masking effect” refers to the phenomenon in which a loud sound acoustically masks a quiet one. With this effect, it is possible to acoustically conceal quantization noise behind the original signal sound. Thus, even with the signal sound compre
Dorvil Richemond
Nolan Daniel A.
Sonnenschein Nath & Rosenthal
Audio signal pitch adjustment apparatus and method