Pattern recognition system

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate


Status: active

Patent number: 06195638

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to pattern recognition systems and, in particular, to pattern recognition systems using a weighted cepstral distance measure.
BACKGROUND OF THE INVENTION
Pattern recognition systems are used, for example, for the recognition of characters and speech patterns.
Pattern recognition systems are known which are based on matching the pattern being tested against a reference database of pattern templates. The spectral distance between the test pattern and the database of reference patterns is measured and the reference pattern having the closest spectral distance to the test pattern is chosen as the recognized pattern.
An example of the prior art pattern recognition system using a distance measure calculation is shown in FIGS. 1, 2 and 3, to which reference is now made.
FIG. 1 is a flow chart illustrating the prior art pattern recognition system for speech patterns using a conventional linear predictor coefficient (LPC) determiner and a distance calculator via dynamic time warping (DTW).
FIG. 2 illustrates the relationship between two speech patterns A and B, along the i-axis and j-axis, respectively.
FIG. 3 illustrates the relationship between two successive points of pattern matching between speech patterns A and B.
Referring to FIG. 1, the audio signal 10 being analyzed has within it a plurality of speech patterns. Audio signal 10 is digitized by an analog/digital converter 12, and the endpoints of each speech pattern are detected by a detector 14. The digital signal of each speech pattern is broken into frames and, for each frame, analyzer 16 computes the linear predictor coefficients (LPC) and converts them to cepstrum coefficients, which are the feature vectors of the test pattern. Reference patterns, which have been prepared as templates, are stored in a database 18. A spectral distance calculator 20 uses a dynamic time warping (DTW) method to compare the test pattern to each of the reference patterns stored in database 18. The DTW method measures the local spectral distance between the test pattern and the reference pattern, using a suitable method of measuring spectral distance, such as the Euclidean distance between the cepstral coefficients or the weighted cepstral distance measure. The template whose reference pattern is closest in distance to the analyzed speech pattern is then selected as the recognized speech pattern.
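The LPC-to-cepstrum conversion performed by analyzer 16 can be sketched with the standard recursion for an all-pole model. This is a minimal illustration, not the patent's own code; the function name is illustrative, and sign conventions for the prediction coefficients vary between LPC implementations:

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a[0..p-1] into n_ceps cepstral coefficients.

    Uses the common recursion c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},
    where a_n is taken as 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

These cepstral coefficients would then serve as the frame-level feature vectors compared by the distance calculator 20.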
In a paper entitled “Dynamic Programming Algorithm Optimization for Spoken Word Recognition”, published in the IEEE Transactions on Acoustics, Speech and Signal Processing in February 1978, Sakoe and Chiba reported on a dynamic programming (DP) based algorithm for recognizing spoken words. DP techniques are known to be an efficient way of matching speech patterns. Sakoe and Chiba introduced the technique known as “slope constraint”, wherein the warping function slope is restricted so as to discriminate between words in different categories.
Numerous spectral distance measures have been proposed, including the Euclidean distance between cepstral coefficients, which is widely used with LPC-derived cepstral coefficients. Furui, in a paper entitled “Cepstral Analysis Techniques for Automatic Speaker Verification”, published in the IEEE Transactions on Acoustics, Speech and Signal Processing in April 1981, proposed a weighted cepstral distance measure which further reduces the percentage of errors in recognition.
In a paper entitled “A Weighted Cepstral Distance Measure for Speech Recognition”, published in the IEEE Transactions on Acoustics, Speech and Signal Processing in October 1987, Tohkura proposed an improved weighted cepstral distance measure as a means to improve the speech recognition rate.
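A weighted cepstral distance of this general family can be sketched as below. The index weighting w_k = k shown here is only one common choice (other schemes weight by the inverse variance of each coefficient); the function name and default are illustrative assumptions, not the exact measure of any one cited paper:

```python
def weighted_cepstral_distance(ca, cb, weights=None):
    """Squared weighted distance between two cepstral vectors ca and cb.

    With weights=None, uses the index weighting w_k = k, which emphasizes
    the higher cepstral coefficients.
    """
    if weights is None:
        weights = [k + 1 for k in range(len(ca))]
    return sum((w * (x - y)) ** 2 for w, x, y in zip(weights, ca, cb))
```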
Referring now to FIG. 2, the operation of the DTW method will be explained. In FIG. 2, speech patterns A and B are shown along the i-axis and j-axis, respectively. Speech patterns A and B are expressed as sequences of feature vectors a1, a2, a3 . . . am and b1, b2, b3 . . . bn, respectively.
The timing differences between two speech patterns A and B can be depicted by a series of ‘points’ Ck(i,j). A ‘point’ refers to the intersection of a frame i of pattern A with a frame j of pattern B. The sequence of points C1, C2, C3 . . . Ck represents a warping function 30 which effects a map from the time axis of pattern A, having a length m, onto the time axis of pattern B, having a length n. In the example of FIG. 2, function 30 is represented by points c1(1,1), c2(1,2), c3(2,2), c4(3,3), c5(4,3) . . . ck(m,n). Where timing differences do not exist between speech patterns A and B, function 30 coincides with the 45 degree diagonal line (j=i). The greater the timing differences, the further function 30 deviates from the 45 degree diagonal line.
Since function 30 is a model of time axis fluctuations in a speech pattern, it must abide by certain physical conditions. Function 30 can only advance forward and cannot move backwards, and the patterns must advance together. These restrictions can be expressed by the following relationships:

i(k)−i(k−1)≦1 and j(k)−j(k−1)≦1; and

i(k−1)≦i(k) and j(k−1)≦j(k).  (1)
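The restrictions in (1), together with the requirement that the patterns advance together, can be checked for a candidate sequence of points. A small sketch, with an illustrative function name; the path is given as 1-indexed (i, j) pairs as in the text:

```python
def is_valid_warping_path(path):
    """Check conditions (1) on a list of (i, j) points C_1 .. C_K:
    each step advances each index by at most 1, never backwards,
    and at least one index advances at every step."""
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if not (0 <= i1 - i0 <= 1 and 0 <= j1 - j0 <= 1):
            return False  # backward move, or a jump of more than one frame
        if (i1, j1) == (i0, j0):
            return False  # the patterns must advance together
    return True
```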
Warping function 30 moves one step at a time from one of three possible directions. For example, to move from C3(2,2) to C4(3,3), function 30 can either move directly in one step from (2,2) to (3,3) or indirectly via the points at (2,3) or (3,2).
Function 30 is further restricted to remain within a swath 32 having a width r. The outer borders 34 and 36 of swath 32 are defined by (j=i+r) and (j=i−r), respectively.
A fourth boundary condition is defined by:

i(1)=1, j(1)=1, and i(end)=m, j(end)=n.  (2)
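The swath restriction and boundary condition (2) can likewise be sketched as a single check. The helper below is a hypothetical illustration; it takes the swath about the line j=i as drawn in FIG. 2 (for m ≠ n a practical implementation would center the band on the skewed diagonal):

```python
def within_constraints(path, m, n, r):
    """Check that a 1-indexed path of (i, j) points starts at (1, 1),
    ends at (m, n), and never leaves the swath |i - j| <= r."""
    if path[0] != (1, 1) or path[-1] != (m, n):
        return False  # boundary condition (2) violated
    return all(abs(i - j) <= r for i, j in path)
```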
Referring now to FIG. 3, the relationship between successive points C10(10,10) and C11(11,11) of pattern matching between speech patterns A and B is illustrated. In accordance with the conditions described hereinbefore, there are three possible ways to arrive at point C11(11,11): directly from C10(10,10) to C11(11,11), indicated by line 38; from C10(10,10) via point (11,10) to C11(11,11), indicated by lines 40 and 42; or from C10(10,10) via point (10,11) to C11(11,11), indicated by lines 44 and 46.
Furthermore, associated with each arrival point (i,j), such as point C11(11,11), is a weight Wij, such as the Euclidean or cepstral distance between the ith frame of pattern A and the jth frame of pattern B. By applying a weight Wij to each of indirect paths 40, 42, 44 and 46 and a weight of 2Wij to direct path 38, the path value Sij at the point (i,j) can be recursively ascertained from the equation:

Sij = min(2Wij + Si−1,j−1, Wij + Si,j−1, Wij + Si−1,j)  (3)
In order to arrive at the endpoint Snm, it is necessary to calculate the best path value Sij at each point. The grid is scanned row by row, storing the values of Sij for the complete previous row plus the values of the present row up to the present point. The value of Snm is the best path value.
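The recursion of equation (3), together with the row-by-row scan that retains only the previous and present rows, can be sketched as follows. This is a minimal illustration under stated assumptions: dist is any local distance (e.g. the weighted cepstral measure), the start point is weighted like a diagonal step, and the swath constraint is omitted for brevity:

```python
def dtw_path_value(dist, m, n):
    """Best cumulative path value S_mn under recursion (3):
    S_ij = min(2*W_ij + S_{i-1,j-1}, W_ij + S_{i,j-1}, W_ij + S_{i-1,j}).

    dist(i, j) returns the local distance W_ij between frame i of
    pattern A and frame j of pattern B (1-indexed). Only two rows of
    the S table are kept in memory at any time.
    """
    INF = float("inf")
    prev = [INF] * (n + 1)  # previous row of S, index 0 unused
    for i in range(1, m + 1):
        cur = [INF] * (n + 1)
        for j in range(1, n + 1):
            w = dist(i, j)
            if i == 1 and j == 1:
                cur[j] = 2 * w  # start point, per boundary condition (2)
            else:
                cur[j] = min(2 * w + prev[j - 1],  # diagonal, path 38
                             w + cur[j - 1],       # horizontal, paths 44/46
                             w + prev[j])          # vertical, paths 40/42
        prev = cur
    return prev[n]
```

For identical patterns the local distances along the diagonal are zero, so the best path value is zero, as expected from the discussion of the 45 degree line above.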
SUMMARY OF THE INVENTION
It is thus the general object of the present invention to provide an improved pattern recognition method, which is especially suitable for voice recognition.
According to the invention there is provided a method of dynamic time warping of two sequences of feature sets onto each other. The method includes the steps of creating a rectangular graph having the two sequences on its two axes, defining a swath of width r, where r is an odd number, centered about a diagonal line connecting the beginning point at the bottom left of the rectangle to the endpoint at the top right of the rectangle, and also defining r−1 lines within the swath. The lines defining the swath are parallel to the diagonal line.
