Reexamination Certificate
1998-10-29
2002-11-19
Banks-Harold, Marsha D. (Department: 2654)
Data processing: speech signal processing, linguistics, language
Speech signal processing
For storage or transmission
C704S267000, C704S278000, C704S504000
active
06484137
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to an audio reproducing apparatus which is capable of converting a value of an audio playback speed into a desired value and obtaining the resulting audio.
BACKGROUND OF THE INVENTION
In recent years, techniques for coding audio data with high efficiency, storing coded audio data in a storage medium, or transmitting the coded audio data over communication networks, have been put into practical use and widely utilized.
As for such techniques, apparatus for reproducing audio according to MPEG (Moving Picture Experts Group) as an international standard is disclosed in Japanese Published Patent Application No. Hei. 9-73299.
FIG. 19 is a block diagram showing this MPEG audio reproducing apparatus. Hereinafter, a description is given of a prior art audio reproducing apparatus with reference to FIG. 19.
Referring now to FIG. 19, an MPEG audio reproducing apparatus 1 comprises a reproducing speed detecting circuit 2, an MPEG audio decoder 3, a time-scale modification circuit 4, a D/A converter 5, and an audio amplifier 6. The time-scale modification circuit 4 comprises a frame memory 34, a time-scale modification unit 35, a ring memory 32, an up-down counter 33, and a read clock generating circuit 36.
An MPEG audio stream which has been coded by the MPEG audio method is input to the MPEG audio reproducing apparatus 1. The MPEG audio decoder 3 decodes the MPEG audio stream into a digital audio signal. The MPEG audio method and formats are described in various references, including “ISO/IEC IS 11172 Part 3: Audio”.
Meanwhile, speed information such as double speed or half (0.5-times) speed is input to the reproducing speed detecting circuit 2, which detects the speed information (reproducing speed) and generates a decoding clock. The decoding clock is supplied to the time-scale modification circuit 4 and the MPEG audio decoder 3. An audio signal which has been decoded by the MPEG audio decoder 3 is input to the time-scale modification circuit 4, where it is subjected to time-scale compression/expansion or unvoiced sound deletion/insertion based on the given speed information. The resulting time-scale-modified output is reproduced through a speaker 23.
However, in the MPEG audio coding method, which performs decoding frame by frame with a prescribed time length, processing data of plural frames requires numerous buffer memories and increases complexity, which results in a large-scale hardware structure.
Another apparatus for reproducing audio according to MPEG is disclosed in Japanese Published Patent Application No. Hei 9-81189.
FIG. 20 is a block diagram showing this MPEG audio reproducing apparatus. Hereinafter, a description is given of another prior art audio reproducing apparatus with reference to FIG. 20.
Referring to FIG. 20, reference numeral 1701 designates a first frame dividing unit for dividing an input subband signal 1 and holding a signal of one frame of a Tf sample length; reference numeral 1702 designates a second frame dividing unit for dividing an input subband signal 2 and holding a signal of one frame of a Tf sample length; reference numeral 1703 designates a third frame dividing unit for dividing an input subband signal 3 and holding a signal of one frame of a Tf sample length; and reference numeral 1704 designates a fourth frame dividing unit for dividing an input subband signal 4 and holding a signal of one frame of a Tf sample length.
The input subband signals 1-4 are subband signals of four subbands produced by a filter bank which divides a normal time-scale signal into four subband signals by ¼ downsampling. Assume that the subband signal 1 is the lowest subband signal and the subband signal 4 is the highest subband signal.
Reference numeral 1710 designates a correlation function calculating unit which calculates correlation values S(n) in an overlapping portion of n samples of the first half and second half signals of a subband signal of a subband containing audio pitch components, and which detects, as “Tc”, the value of n at which the correlation value S(n) is maximum. Reference numeral 1711 designates a reproducing speed detecting unit which detects the reproducing speed F specified by a listener. Reference numeral 1712 designates a correlation function detection range control unit which limits the correlation function detection range. Reference numeral 1705 designates a first cross fading unit which performs a cross fading process on the overlapped Tc samples of the first half and second half signals of the subband signal divided and held by the first frame dividing unit 1701. Reference numeral 1706 designates a second cross fading unit which performs a cross fading process on the overlapped Tc samples of the first half and second half signals of the subband signal divided and held by the second frame dividing unit 1702. Reference numeral 1707 designates a third cross fading unit which performs a cross fading process on the overlapped Tc samples of the first half and second half signals of the subband signal divided and held by the third frame dividing unit 1703. Reference numeral 1708 designates a fourth cross fading unit which performs a cross fading process on the overlapped Tc samples of the first half and second half signals of the subband signal divided and held by the fourth frame dividing unit 1704. Reference numeral 1709 designates a synthesizing filter bank which synthesizes the subband signals of the four subbands which have been subjected to the cross fading process.
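As an illustration of what one cross fading unit does to a single subband, the following is a minimal Python sketch. The text does not state the shape of the fade, so the linear weighting is an assumption, and the function name `cross_fade_join` is hypothetical. The last `tc` samples of the first half are blended with the first `tc` samples of the second half, so the joined segment is `tc` samples shorter than the two halves placed end to end.

```python
def cross_fade_join(first, second, tc):
    """Join two segments, overlapping the tail of `first` with the head of `second`.

    Assumption: a linear fade. Within the overlap the weight of `second`
    rises steadily while the weight of `first` falls by the same amount.
    """
    if not 0 <= tc <= min(len(first), len(second)):
        raise ValueError("tc must fit inside both segments")
    start = len(first) - tc          # index in `first` where the overlap begins
    faded = []
    for i in range(tc):
        w = (i + 1) / (tc + 1)       # weight given to the second segment
        faded.append((1.0 - w) * first[start + i] + w * second[i])
    # Untouched head of `first`, the blended overlap, then the rest of `second`.
    return list(first[:start]) + faded + list(second[tc:])
```

In the apparatus described, the same overlap length Tc would be applied by all four cross fading units, one per subband, before the synthesizing filter bank recombines them.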
FIG. 21 is a diagram showing a time-scale waveform of one frame of a frequency band which contains main pitch components of an audio signal.
FIG. 22 is a diagram showing, as upper and lower segments, the first half and second half signals into which the one-frame signal in FIG. 21 has been divided.
FIG. 23 is a graph showing values of a correlation function between the two segments in FIG. 22.
FIG. 24 is a diagram qualitatively showing a state in which the segment of the second half signal component is shifted to the time at which the correlation function takes the maximum value.
FIGS. 25(a)-25(c) are diagrams showing a case where a cross fading process is performed with two segments overlapped for a Tc time period.
Subsequently, a description is given of the operation of the reproducing apparatus so constructed, with reference to FIGS. 21 through 25(a)-25(c).
First of all, suppose that data of one frame (Tf sample length) of the input subband signal 1 includes main pitch components of the audio signal as shown in FIG. 21. The one-frame data is divided into two segments containing equal numbers of samples, as shown in FIG. 22, and held by the first frame dividing unit 1701. In a like manner, the subband signals 2, 3, and 4 are each divided into two segments and held by the second, third, and fourth frame dividing units 1702, 1703, and 1704, respectively.
Then, from a target speed rate F obtained by the reproducing speed detecting unit 2, a data length of an overlapping portion of the two segments, i.e., a target overlapping value Tb, is found according to the following equation:
Tb = Tf · (1 − 1/F)
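The relation between the speed rate and the overlap can be checked numerically. The short Python sketch below evaluates the equation for Tb; the function name `target_overlap` is hypothetical and chosen only for illustration. Overlapping the two halves of a Tf-sample frame by Tb samples leaves Tf − Tb = Tf/F samples, which is the shortening the target speed rate F calls for.

```python
def target_overlap(tf, f):
    """Target overlap Tb = Tf * (1 - 1/F) for frame length tf and speed rate f."""
    if f <= 0:
        raise ValueError("speed rate F must be positive")
    return tf * (1.0 - 1.0 / f)


def output_length(tf, f):
    """Samples remaining after the two halves overlap by Tb: Tf - Tb = Tf / F."""
    return tf - target_overlap(tf, f)
```

For example, with Tf = 1000 and F = 2 (double speed) the target overlap is 500 samples, that is, the two halves overlap completely and the frame is played in half the time.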
Considering a correction parameter B (initial value = 0) for correcting deviation from the target speed rate F due to the phase adjustment mentioned later, the correlation function calculating unit 1710 calculates the correlation in a range from m samples before to m samples after an overlapping interval data length (Tb+B) of the two segments in the first frame dividing unit 1701, to find an overlapping interval length Tc at which the correlation function takes the maximum value. Then, to correct the error between the target speed rate F and the actual speed rate resulting from the difference between Tc and Tb, the value of the correction parameter B is updated as follows:
B ← B + Tb − Tc
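The search for Tc and the update of B can be sketched as follows in Python. This is an illustration under stated assumptions, not the circuit of the cited application: the text does not give the exact form of S(n), so the mean product over the n overlapping samples is assumed here, and the function names are hypothetical.

```python
def correlation(first, second, n):
    """Assumed S(n): mean product of the last n samples of `first`
    and the first n samples of `second`."""
    tail = first[len(first) - n:]
    head = second[:n]
    return sum(x * y for x, y in zip(tail, head)) / n


def find_overlap(first, second, tb, b, m):
    """Return Tc: the overlap length within m samples of (Tb + B)
    at which the assumed correlation is largest."""
    center = round(tb + b)
    lo = max(1, center - m)
    hi = min(len(first), len(second), center + m)
    if lo > hi:
        raise ValueError("search range lies outside the segments")
    return max(range(lo, hi + 1), key=lambda n: correlation(first, second, n))


def update_correction(b, tb, tc):
    """B <- B + Tb - Tc, carrying the overlap error into the next frame."""
    return b + tb - tc
```

Because B accumulates the difference between the target and the overlap actually used, a frame that overlaps by more than Tb makes the next search center smaller, so the average speed tends back toward F.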
In FIG. 22, there is shown a case where two upper and lower segments are disposed separately, by set
Matsumoto Michio
Misaki Masayuki
Tagawa Junichi
Taniguchi Hirotsugu
Banks-Harold Marsha D.
Matsushita Electric Industrial Co., Ltd.
Storm Donald L.
Wenderoth, Lind & Ponack, L.L.P.