Method of phonetic modeling using acoustic decision tree

Filing date: 1999-01-21
Issue date: 2001-11-13
Examiner: Dorvil, Richemond (Department: 2741)
Class: Data processing: speech signal processing, linguistics, language
Subclass: Speech signal processing
Subclass: Recognition

US classification: C704S257000
Document type: Reexamination Certificate
Status: active
Patent number: 06317712
ABSTRACT:

FIELD OF INVENTION
This invention relates to phonetic modeling of speech and more particularly to phonetic modeling using acoustic decision trees.
BACKGROUND OF INVENTION
Although a language has very few phones, modeling only those few phones is not sufficient for speech recognition. The coarticulation effect makes the acoustic realization of the same phone very different in different contexts. For example, English has about 40 to 50 phones, and Spanish has a little more than 20. Training only 50 phonetic models for English is not sufficient to cover all the coarticulation effects. For this reason, context-dependent models are used for speech recognition. Context-dependent phonetic modeling has now become standard practice for modeling the variations in the acoustics of a phone caused by phonetic context. However, even if only the immediate left and right contexts are considered, there are 50^3 = 125,000 models to be trained. This large number of models defeats the motivation for using phonetic models in the first place. Fortunately, some contexts result in large acoustic differences and some do not. Therefore, the phonetic models can be clustered, which not only reduces the number of models but also increases training robustness.
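The model count quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the `context_width` parameter are hypothetical, and the value 20 for Spanish is a rounded stand-in for "a little more than 20".

```python
def context_dependent_model_count(num_phones: int, context_width: int = 1) -> int:
    """Count the models needed when each phone is conditioned on
    `context_width` neighboring phones on each side (1 = triphones)."""
    # One slot for the center phone plus `context_width` slots on each side,
    # and every slot can hold any of the `num_phones` phones.
    return num_phones ** (2 * context_width + 1)


# English with 50 phones and immediate left/right context: 50^3 models.
print(context_dependent_model_count(50))  # 125000

# Illustrative only: a language with 20 phones (roughly the Spanish case).
print(context_dependent_model_count(20))  # 8000
```

Even with the smaller phone inventory, the count is far larger than the number of contexts a typical corpus covers well, which is the motivation for clustering.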
The art of figuring out how to cluster phonetic models is one of the core research areas in the speech community for large vocabulary speech recognition. The clustering algorithm needs to achieve the following three goals: 1) maintaining high acoustic resolution while achieving the most clustering, 2) ensuring that all the clustered units can be trained well with the available speech data, and 3) being able to predict unseen contexts with the clustered models. Decision tree clustering using phonological rules has been shown to achieve these objectives. See, for example, D. B. Paul, “Extensions to Phone-state Decision-tree Clustering: Single Tree and Tagged Clustering,” Proc. ICASSP 97, Munich, Germany, April 1997.
Previously, applicant reported on FeaturePhones, a phonetic context clustering method that defines context in terms of articulatory features and clusters the contexts at the phone level using decision trees. See Y. H. Kao et al., “Toward Vocabulary Independent Telephone Speech Recognition,” ICASSP 1994, Vol. 1, pgs. 117-120, and K. Kondo et al., “Clustered Interphase or Word Context-Dependent Models for Continuously Read Japanese,” Journal of Acoustical Society of Japan, Vol. 16, No. 5, pgs. 299-310, 1995. This proved to be an efficient clustering method when training data was scarce, but it was too restrictive to take advantage of significantly more training data.
SUMMARY OF INVENTION
In accordance with one embodiment of the present invention, a method of phonetic modeling applies a decision tree algorithm at the acoustic level by the steps of training baseform monophone models, training all triphone models present in the training corpus, with the monophone models as seeds for each center phone, splitting the root node into two descendant nodes, repeating the splitting procedure on all leaves, and clustering the leaves of the tree or averaging the models in each cluster to obtain seed models for each cluster.
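The split-and-average steps listed above can be sketched roughly as follows. This is a minimal illustration under strong assumptions, not the patented procedure: each trained triphone is reduced to a single mean vector and a frame count, the split is a deterministic 2-means pass with Euclidean distance, and splitting stops when a descendant would hold fewer than `min_count` frames. None of these choices comes from the text above, and all names (`Model`, `split`, `cluster`, `min_count`) and the example data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = Tuple[float, ...]


@dataclass
class Model:
    name: str     # e.g. "k-ae+t": left context, center phone, right context
    mean: Vector  # ASSUMPTION: one mean vector stands in for the whole model
    count: int    # training frames seen for this triphone (assumed positive)


def average(models: List[Model]) -> Vector:
    """Count-weighted average of the means, usable as a cluster seed."""
    total = sum(m.count for m in models)
    dim = len(models[0].mean)
    return tuple(sum(m.mean[d] * m.count for m in models) / total
                 for d in range(dim))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(models: List[Model], iters: int = 10) -> Tuple[List[Model], List[Model]]:
    """Split one node into two descendants (ASSUMED criterion: 2-means)."""
    centre = average(models)
    # Start from the model farthest from the node average and the model
    # farthest from that one.
    a = max(models, key=lambda m: sq_dist(m.mean, centre)).mean
    b = max(models, key=lambda m: sq_dist(m.mean, a)).mean
    left: List[Model] = list(models)
    right: List[Model] = []
    for _ in range(iters):
        left = [m for m in models if sq_dist(m.mean, a) <= sq_dist(m.mean, b)]
        right = [m for m in models if sq_dist(m.mean, a) > sq_dist(m.mean, b)]
        if not left or not right:
            break
        a, b = average(left), average(right)
    return left, right


def cluster(models: List[Model], min_count: int) -> List[List[Model]]:
    """Repeat the split on every leaf until a split would under-train a leaf."""
    leaves: List[List[Model]] = []
    pending = [models]  # the root node holds all triphones of one center phone
    while pending:
        node = pending.pop()
        left, right = split(node) if len(node) > 1 else (node, [])
        frames = [sum(m.count for m in side) for side in (left, right)]
        if not right or min(frames) < min_count:
            leaves.append(node)  # keep the node as a leaf (one cluster)
        else:
            pending += [left, right]
    return leaves


# Hypothetical triphones sharing the center phone "ae".
triphones = [
    Model("k-ae+t", (1.0, 1.0), 100),
    Model("g-ae+t", (1.2, 0.9), 80),
    Model("m-ae+n", (5.0, 5.1), 120),
    Model("n-ae+n", (5.2, 4.9), 90),
]
leaves = cluster(triphones, min_count=100)
seeds = [average(leaf) for leaf in leaves]  # one seed model per cluster
for leaf, seed in zip(leaves, seeds):
    print(sorted(m.name for m in leaf), seed)
```

With this toy data the four triphones fall into two clusters, one per group of acoustically similar contexts. A real system would run one such tree per center phone and would compare full HMM state distributions rather than single mean vectors.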

Affiliated with

Kao Yu-Hung

Inventor

Kondo Kazuhiro

Inventor

Also associated with

Dorvil Richemond

Examiner

Telecky Jr Frederick J.

Attorney

Texas Instruments Incorporated

Corporate Assignee

Troike Robert L.

Attorney
