Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission
Reexamination Certificate
1999-02-25
2001-04-17
Hudspeth, David (Department: 2641)
active
06219636
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to an audio coding technique, and more particularly to a method of and an apparatus for coding audio pitch information and a program storage device readable by the audio pitch coding apparatus on which the audio pitch coding program is recorded.
2. Description of the Related Art
To code an audio signal efficiently, the pitch is extracted and coded. The pitch arises from the long-term correlation of the audio signal, which results from the periodic vibration of the human vocal cords. Because similar waveforms repeat in the audio signal at a period determined by this pitch, the signal can be coded efficiently by combining pitch-based coding with short-term prediction based on the correlation between neighboring samples. In CELP (Code Excited Linear Prediction), a representative audio coding method, the content of an adaptive code book, which serves as the past driving source of a synthesis filter, is reproduced once, and the pitch is determined so as to minimize the perceptually weighted error power relative to the input signal. Pitch extraction is thus an indispensable element of the technique.
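The long-term correlation described above can be illustrated with a minimal sketch. It is a plain open-loop search over candidate lags using normalized correlation, not the closed-loop, perceptually weighted search that CELP performs; the function name estimate_pitch_lag and the lag range are assumptions made for illustration only.

```python
import math

def estimate_pitch_lag(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] whose delayed copy best matches the signal.

    Illustrative open-loop search only (assumption): a CELP coder instead
    minimizes a perceptually weighted error through the synthesis filter.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        current = signal[lag:]                    # samples x[n]
        delayed = signal[:len(signal) - lag]      # samples x[n - lag]
        cross = sum(c * d for c, d in zip(current, delayed))
        energy = math.sqrt(sum(c * c for c in current) * sum(d * d for d in delayed))
        score = cross / energy if energy > 0 else float("-inf")
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# A sine wave that repeats every 40 samples stands in for a voiced segment.
signal = [math.sin(2 * math.pi * n / 40) for n in range(240)]
print(estimate_pitch_lag(signal, 20, 60))
```

The search range is kept below twice the true period here so that a multiple of the pitch period cannot tie with the period itself.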
In an audio coding method such as CELP, the input speech is divided into a plurality of frames, the coding process is performed frame by frame, and each frame is further divided into a plurality of sub frames. The sub frame is the basic unit for processes such as vector quantization. Pitch extraction calculates one pitch for each sub frame, and the calculated pitches are coded over a range of one or more frames. Although the value of the calculated pitch itself could be coded for every sub frame in a frame, the amount of coded data is reduced by coding the pitch value itself only for the head sub frame of each frame and, for each subsequent sub frame in the frame, coding the difference between its calculated pitch and that of the previous sub frame.
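The head-plus-difference arrangement can be sketched as follows. The function names and the sample pitch values are hypothetical; the sketch only shows why the differences are typically small numbers that need fewer bits than absolute pitch values.

```python
def encode_frame_pitches(pitches):
    """Code the head sub frame's pitch as-is and every later one as a difference."""
    head = pitches[0]
    deltas = [cur - prev for prev, cur in zip(pitches, pitches[1:])]
    return head, deltas

def decode_frame_pitches(head, deltas):
    """Rebuild the per-sub-frame pitches by accumulating the differences."""
    pitches = [head]
    for delta in deltas:
        pitches.append(pitches[-1] + delta)
    return pitches

# Hypothetical pitch lags for the four sub frames of one frame.
head, deltas = encode_frame_pitches([40, 42, 41, 43])
print(head, deltas)
print(decode_frame_pitches(head, deltas))
```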
However, an audio signal can be categorized into three conditions: a voiced sound, in which the input speech is accompanied by vibration of the vocal cords; an unvoiced sound, in which the input speech is present but is not accompanied by vibration of the vocal cords; and silence, in which no input speech exists. The pitch is meaningful only for the voiced portion. Thus, after judging which condition the audio signal falls into, the pitch coding process is not performed for a sub frame, the minimum unit of the process, that is judged to be an unvoiced sound or silence (i.e., anything other than a voiced sound). Accordingly, if the head sub frame of a frame is not judged to be a voiced sound, the reference value for the differences of the subsequent sub frames is undetermined, and the pitch coding process is not performed for the whole frame. In this case, no reproduction signal is output from the adaptive code book in CELP or similar methods.
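The limitation of the conventional scheme can be sketched as follows. The function name and code tuples are hypothetical, and the handling of an unvoiced sub frame after a voiced head (skip it and keep the last coded pitch as the reference) is an assumption, since the text only states that such a sub frame is not pitch-coded.

```python
def conventional_encode(pitches, voiced):
    """Conventional scheme: absolute pitch at the head, differences afterwards.

    Returns None when the head sub frame is not voiced, because no reference
    value exists for the differences and the whole frame goes uncoded.
    """
    if not voiced[0]:
        return None
    codes = [("abs", pitches[0])]
    reference = pitches[0]
    for pitch, is_voiced in zip(pitches[1:], voiced[1:]):
        if is_voiced:
            codes.append(("diff", pitch - reference))
            reference = pitch
        else:
            # Assumption: the sub frame is skipped and the reference is kept.
            codes.append(("skip", None))
    return codes

# Head sub frame unvoiced: three voiced sub frames are lost with it.
print(conventional_encode([0, 50, 52, 51], [False, True, True, True]))
# Head sub frame voiced: the frame is coded as usual.
print(conventional_encode([50, 52, 0, 51], [True, True, False, True]))
```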
Therefore, with the above-mentioned audio coding method, it is difficult both to reduce the amount of coded data and to code the pitch finely and with high fidelity to the input speech. In particular, when a frame is rather long or contains a large number of sub frames, the frame is more likely to include a sub frame that is not judged to be a voiced sound, so the quality of the audio coding is likely to be degraded.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a coding method, a coding apparatus, and a program storage device readable by the coding apparatus on which a coding program is recorded, which can code the pitch of an input speech with high fidelity, without drastically increasing the amount of coded data, even when a frame includes a sub frame that is not judged to be a voiced sound.
The above object of the present invention can be achieved by a pitch coding method of calculating and coding, for each sub frame, a pitch of an input speech that is divided into a plurality of frames, each of which is further divided into a plurality of sub frames. The pitch coding method is provided with: a calculating process of calculating a pitch of each of the sub frames included in one or a plurality of the frames; a judging process of judging whether or not the input speech included in each of the sub frames is a voiced sound accompanying a vibration of a vocal cord; a first coding process of (i) coding, if a head sub frame of the sub frames, which includes a first input speech, is judged to be the voiced sound, the calculated pitch of the head sub frame, and (ii) selecting and coding, if the head sub frame is not judged to be the voiced sound and a subsequent sub frame of the sub frames, which is subsequent to the head sub frame, is judged to be the voiced sound, one of standard pitch values set in advance for the head sub frame; and a second coding process of (i) calculating and coding, if a preceding sub frame of the sub frames, which precedes the subsequent sub frame judged to be the voiced sound, is itself judged to be the voiced sound, a difference between the calculated pitch of the preceding sub frame and the calculated pitch of the subsequent sub frame, and (ii) calculating and coding, if the preceding sub frame is not judged to be the voiced sound, a difference between the selected standard pitch value and the calculated pitch of the subsequent sub frame.
According to the pitch coding method of the present invention, the calculating process calculates a pitch of each of the sub frames included in one or a plurality of the frames. The judging process then judges whether or not the input speech included in each of the sub frames is a voiced sound accompanying a vibration of a vocal cord. The first coding process then codes the head sub frame. Namely, if the head sub frame is judged to be the voiced sound, the calculated pitch of the head sub frame is coded. Alternatively, if the head sub frame is not judged to be the voiced sound and the subsequent sub frame is judged to be the voiced sound, one of the standard pitch values set in advance for the head sub frame is selected and coded. Further, the second coding process codes the subsequent sub frame. Namely, if the preceding sub frame is judged to be the voiced sound, the difference between the calculated pitch of the preceding sub frame and the calculated pitch of the subsequent sub frame is calculated and coded. Alternatively, if the preceding sub frame is not judged to be the voiced sound, the difference between the selected standard pitch value and the calculated pitch of the subsequent sub frame is calculated and coded.
Therefore, since not only the calculated pitch itself but also the pitch difference, which uses the predetermined standard value where needed, is coded in accordance with the voiced-sound judgment results, the pitch can be coded with high fidelity by using the difference even when the judgment results change among the sub frames of a frame to which the pitch coding process is applied. The pitch information can thus be coded at high quality without drastically increasing the amount of coded data.
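The first and second coding processes can be sketched as follows, under stated assumptions. The table STANDARD_PITCHES, the nearest-value selection rule, and the code tuples are all hypothetical; the text says only that one of the standard pitch values set in advance is selected. The text also does not say which standard value serves as the reference when the head sub frame is voiced but a later preceding sub frame is not; this sketch assumes the standard value nearest to the head pitch.

```python
# Hypothetical table of standard pitch values set in advance (assumption).
STANDARD_PITCHES = [30, 60, 90, 120]

def nearest_standard(value):
    """Assumed selection rule: the standard value closest to the given pitch."""
    return min(STANDARD_PITCHES, key=lambda s: abs(s - value))

def encode_frame(pitches, voiced):
    """Code one frame; returns None when no sub frame is voiced."""
    if not any(voiced):
        return None
    codes = []
    # First coding process: the head sub frame.
    if voiced[0]:
        codes.append(("pitch", pitches[0]))
        standard = nearest_standard(pitches[0])   # assumption, see lead-in
    else:
        first_voiced = voiced.index(True)
        standard = nearest_standard(pitches[first_voiced])
        codes.append(("standard", STANDARD_PITCHES.index(standard)))
    # Second coding process: the subsequent sub frames.
    for i in range(1, len(pitches)):
        if not voiced[i]:
            codes.append(("none", None))
            continue
        reference = pitches[i - 1] if voiced[i - 1] else standard
        codes.append(("diff", pitches[i] - reference))
    return codes

# Head unvoiced: a standard value replaces the missing reference.
print(encode_frame([0, 58, 61, 0], [False, True, True, False]))
# Unvoiced gap in mid-frame: the next difference is taken from the standard value.
print(encode_frame([50, 0, 52, 53], [True, False, True, True]))
```

In both examples the frame is still coded with one reference plus small differences, whereas a scheme that needs a voiced head sub frame would leave the first frame uncoded.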
In one aspect of the pitch coding method of the present invention, in the first and second coding processes, the pitch or the difference with respect to the sub frame judged to be the voiced sound is coded by obtaining a delay, which minimizes a perceptu
Finnegan, Henderson Farabow, Garrett and Dunner L.L.P.
Hudspeth David
Pioneer Electronics Corporation
Storm Donald L.