System and method for enhancing speech and pattern...

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S222000

active

06795804

ABSTRACT:

BACKGROUND
1. Technical Field
This application relates generally to speech and pattern recognition and, more specifically, to multi-category (or class) classification of an observed multi-dimensional predictor feature, for use in pattern recognition systems.
2. Description of Related Art
In one conventional method for pattern classification and classifier design, each class is modeled as a Gaussian, or a mixture of Gaussians, and the associated parameters are estimated from training data. As is understood, each class may represent different data depending on the application. For instance, in speech recognition, the classes may represent different phonemes or triphones, while in handwriting recognition, each class may represent a different handwriting stroke. For computational reasons, the Gaussian models are assumed to have diagonal covariance matrices. When classification is desired, a new observation is applied to the models within each category, and the category whose model generates the largest likelihood is selected.
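The conventional scheme described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes one Gaussian per class (no mixtures), and the names `diag_gaussian_loglik`, `classify`, and `toy_models` are hypothetical.

```python
import math

def diag_gaussian_loglik(x, mean, var):
    """Log-likelihood of x under a Gaussian with a diagonal covariance matrix."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def classify(x, models):
    """Return the label whose class model gives the largest likelihood for x."""
    return max(models, key=lambda label: diag_gaussian_loglik(x, *models[label]))

# Hypothetical toy models: label -> (mean vector, per-dimension variances).
toy_models = {
    "a": ([0.0, 0.0], [1.0, 1.0]),
    "b": ([3.0, 3.0], [1.0, 2.0]),
}
```

Because the covariance is diagonal, the log-likelihood is a sum over dimensions, which is the computational saving the passage refers to.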
In another conventional design, the performance of a classifier that is designed using Gaussian models is enhanced by applying a linear transformation to the input data and, possibly, by simultaneously reducing the feature dimension. More specifically, conventional methods such as Principal Component Analysis and Linear Discriminant Analysis may be employed to obtain the linear transformation of the input data. Recent improvements to the linear transform techniques include Heteroscedastic Discriminant Analysis and Maximum Likelihood Linear Transforms (see, e.g., Kumar, et al., “Heteroscedastic Discriminant Analysis and Reduced Rank HMMs For Improved Speech Recognition,” Speech Communication, 26:283-297, 1998).
More specifically, FIG. 1a depicts one method for applying a linear transform to an observed event x. With this method, a precomputed n×n linear transformation, θ^T, is multiplied by an observed event x (an n×1 feature vector) to yield an n×1 dimensional vector, y. The vector y is modeled as a Gaussian vector with a mean u_j and variance Σ_j for each different class j. The same y is used for every class; only the mean and variance assigned to model that y differ from class to class. The variances for each class are assumed to be diagonal covariance matrices.
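A small sketch of the FIG. 1a arrangement, under stated assumptions: the matrix `THETA_T`, the class table `CLASSES`, and all function names are hypothetical, and each class is given a single diagonal Gaussian.

```python
import math

def transform(theta_t, x):
    """Compute y = theta_t * x for an n x n matrix (list of rows) and n-vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in theta_t]

def log_likelihood(y, mean, var):
    """Diagonal-covariance Gaussian log-likelihood of y."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )

def classify_shared(x, theta_t, classes):
    """Transform x once, then score the same y against every class model."""
    y = transform(theta_t, x)
    return max(classes, key=lambda j: log_likelihood(y, *classes[j]))

# Hypothetical shared transform and per-class (mean, variances) in y-space.
THETA_T = [[1.0, 1.0], [1.0, -1.0]]
CLASSES = {
    0: ([0.0, 0.0], [1.0, 1.0]),
    1: ([4.0, 0.0], [1.0, 1.0]),
}
```

Since every class scores the same y, the transform costs one matrix-vector product per observation regardless of the number of classes.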
In another conventional method depicted in FIG. 1b, instead of a single linear transformation θ^T (as in FIG. 1a), a plurality of linear transformation matrices θ_1^T, θ_2^T are implemented, provided that the determinant of each is constrained to be “1” (unity). One transformation is then applied for one set of classes, and another for another set of classes. With this method, each class may have its own linear transformation θ, or two or more classes may share the same linear transformation θ.
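The FIG. 1b arrangement might look like the sketch below. It is restricted to 2×2 matrices to keep the determinant explicit, and every name in it (`normalize_unit_det`, `classify_multi`, `TRANSFORMS`, `CLASSES`) is hypothetical. The rescaling step is only one assumed way to satisfy the unit-determinant constraint; the source does not say how the constraint is enforced. A plausible reading of the constraint is that it keeps likelihoods computed under different transforms directly comparable, since no Jacobian term is then needed.

```python
import math

def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def normalize_unit_det(m):
    """Rescale a 2 x 2 matrix with positive determinant so that det == 1."""
    d = det2(m)
    if d <= 0.0:
        raise ValueError("expected a positive determinant")
    s = d ** -0.5  # scaling a 2 x 2 matrix by s multiplies its determinant by s**2
    return [[s * v for v in row] for row in m]

def transform(theta_t, x):
    return [sum(t * xi for t, xi in zip(row, x)) for row in theta_t]

def log_likelihood(y, mean, var):
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )

def classify_multi(x, transforms, classes):
    """classes maps label -> (index of its transform, mean, variances)."""
    ys = [transform(t, x) for t in transforms]  # one y per distinct transform

    def score(label):
        k, mean, var = classes[label]
        return log_likelihood(ys[k], mean, var)

    return max(classes, key=score)

# Hypothetical setup: classes "p" and "q" share transform 0; "r" uses transform 1.
TRANSFORMS = [
    normalize_unit_det([[2.0, 0.0], [0.0, 2.0]]),
    normalize_unit_det([[1.0, 1.0], [0.0, 1.0]]),
]
CLASSES = {
    "p": (0, [0.0, 0.0], [1.0, 1.0]),
    "q": (0, [3.0, 3.0], [1.0, 1.0]),
    "r": (1, [10.0, 0.0], [1.0, 1.0]),
}
```

Note that the cost grows with the number of distinct transforms, because a full n×n product is needed for each one.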
SUMMARY OF THE INVENTION
The present invention is directed to a system and method for applying a linear transformation to classify an input event. In one aspect, a method for classification comprises the steps of:
capturing an input event;
extracting an n-dimensional feature vector from the input event;
applying a linear transformation to the feature vector to generate a pool of projections;
utilizing different subsets from the pool of projections to classify the feature vector; and
outputting a class identity associated with the feature vector.
In another aspect, the step of utilizing different subsets from the pool of projections to classify the feature vector comprises the steps of:
for each predefined class, selecting a subset from the pool of projections associated with the class;
computing a score for the class based on the associated subset; and
assigning, to the feature vector, the class having the highest computed score.
In yet another aspect, each of the associated subsets comprises a unique predefined set of n indices computed during training, which are used to select the associated components from the computed pool of projections.
In another aspect, a preferred classification method is implemented in a Gaussian and/or maximum-likelihood framework.
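The steps listed above can be sketched as follows, assuming a diagonal Gaussian score for each class. The pool matrix `POOL`, the class table `CLASSES`, and the function names are hypothetical; the index subsets are hard-coded here, whereas the text says they are computed during training. The sketch also omits any normalization that may be needed to compare scores across classes that use different subsets.

```python
import math

def project(pool, x):
    """Apply a K x n pool matrix to an n-vector x, giving K projections."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in pool]

def score_class(projections, indices, mean, var):
    """Diagonal Gaussian log-likelihood over the class's selected projections."""
    selected = [projections[i] for i in indices]
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (s - m) ** 2 / v)
        for s, m, v in zip(selected, mean, var)
    )

def classify(x, pool, classes):
    """Compute the pool once, then let each class score its own subset."""
    projections = project(pool, x)
    return max(classes, key=lambda c: score_class(projections, *classes[c]))

# Hypothetical example with n = 2 and a pool of K = 3 projections.
# Each class maps to (n selected indices, mean, variances).
POOL = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
CLASSES = {
    "A": ((0, 1), [0.0, 0.0], [1.0, 1.0]),
    "B": ((0, 2), [2.0, 5.0], [1.0, 1.0]),
    "C": ((1, 2), [-3.0, -3.0], [1.0, 1.0]),
}
```

For instance, `classify([2.0, 3.0], POOL, CLASSES)` computes the projections `[2.0, 3.0, 5.0]` once; class "B" then reads components 0 and 2, which match its mean exactly.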
The novel concept of applying projections differs from the conventional method of applying different transformations because the sharing occurs at the level of the projections. Therefore, in principle, each class (or a large number of classes) may use a different “linear transform”, although the difference between such transformations arises from selecting a different combination of linear projections from a relatively small pool of projections. This concept of applying projections can advantageously be applied in the presence of any underlying classifier.
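To illustrate how a small pool can yield many distinct per-class combinations, the following sketch enumerates the possible index subsets. The pool size of 5 and subset size of 3 are arbitrary illustrative values, and `index_subsets` is a hypothetical helper.

```python
from itertools import combinations
from math import comb

def index_subsets(pool_size, n):
    """All ways to choose n projection indices from a pool of pool_size."""
    return list(combinations(range(pool_size), n))

# Hypothetical sizes: a pool of 5 projections, 3 selected per class.
subsets = index_subsets(5, 3)
print(len(subsets), comb(5, 3))  # both counts are 10
```

The number of available combinations grows quickly with the pool size, while the per-observation cost stays at one product with the pool matrix.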
These and other aspects, features and advantages of the present invention will be described and become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.


