Mobile phone having speaker dependent voice recognition...

Reexamination Certificate

: 1999-03-01
: 2001-07-10
: Korzuch, William R. (Department: 2641)
: Data processing: speech signal processing, linguistics, language
: Speech signal processing
: Recognition

: C704S239000, C704S243000, C704S247000, C379S088060
: Reexamination Certificate
: active
: 06260012

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a technique for improved speech recognition in a telephone, particularly a mobile telephone, such as for automatic hands-free dialing.
2. Description of the Related Art
In recent years, telephones have become equipped with optional speech recognition circuitry to enable special hands-free functions to be carried out, such as automatic hands-free dialing. In the mobile phone environment, hands-free dialing by speech recognition is particularly useful to enable users to place calls while driving by reciting a name or number of a party to be called (called party). The mobile phone is equipped with a speech recognition circuit to convert the user's speech into audio feature data. Typically, the feature data is compared to different sets of pre-stored feature data corresponding to names previously recorded by the user during a registration process. If a match is found, the number corresponding to the name is automatically dialed.
According to a conventional speech recognition method applied to a Code Division Multiple Access (CDMA) mobile phone or the like, a match between the user's current speech and a pre-recorded called party name is established by comparing the current feature data (corresponding to the current speech) with each set of pre-stored feature data to determine the most similar data set. If the difference between the most similar data set and the current feature data is below a predetermined threshold, then the most similar data set is determined to match the current speech. Once a match is established, the telephone number of the called party corresponding to the most similar data set may be automatically dialed. On the other hand, if the difference is above the threshold, a matching condition will not be established. Note that a match will be made with the wrong called party if that party's feature data happens to be closest to the current feature data, with a difference below the threshold. Another problem may occur when more than one recorded feature data set is highly similar to the current feature data, with the difference between each highly similar set and the current data below the threshold. In this case, the user may be prompted to repeat the utterance or perform some other task to identify which called party name is intended.
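The conventional fixed-threshold comparison described above can be illustrated with a minimal sketch. The feature vectors, the Euclidean distance measure, and the names `euclidean` and `match_fixed_threshold` are assumptions made for illustration only; the text does not specify how feature data is represented or how the difference is computed.

```python
import math


def euclidean(a, b):
    """Difference measure between two feature vectors (assumed; not from the source)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_fixed_threshold(current, registered, threshold):
    """Compare current feature data against every pre-stored set.

    Returns a (status, names) pair:
      ("match", [name])        exactly one stored set is below the threshold
      ("ambiguous", [names])   several stored sets are below the threshold
      ("no_match", [])         no stored set is below the threshold
    """
    below = sorted(
        (euclidean(current, feats), name)
        for name, feats in registered.items()
        if euclidean(current, feats) < threshold
    )
    if not below:
        return ("no_match", [])
    if len(below) > 1:
        # More than one highly similar set: the user would be prompted again.
        return ("ambiguous", [name for _, name in below])
    return ("match", [below[0][1]])


# Toy registration data (hypothetical values).
registered = {"alice": [1.0, 0.0], "bob": [0.0, 1.0], "carol": [1.1, 0.1]}
current = [1.0, 0.05]

print(match_fixed_threshold(current, registered, threshold=0.08))
print(match_fixed_threshold(current, registered, threshold=0.5))
print(match_fixed_threshold(current, registered, threshold=0.01))
```

With these toy values, the outcome depends entirely on where the fixed threshold is set: a tight threshold rejects every name, a loose one leaves two candidates, and only an intermediate value yields a single match.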
The above approach of using a fixed threshold (or thresholds) to determine whether an input utterance matches a pre-recorded name ignores the fact that varying environmental conditions, such as inherent features of the pronounced vocal data and personal differences in pronunciation, may be present at any given time. Consequently, a false recognition or a recognition error may occur, resulting in an undesired party being called or in excessive non-recognition of utterances.
One example of a prior art technique designed to increase the success rate of hands-free dialing using speech recognition is presented in U.S. Pat. No. 5,640,485. In this patent, when an utterance is determined to be outside a predetermined closeness threshold to all pre-recorded words, then the user is prompted to repeat the utterance, and a new closeness threshold is computed based on the pair of utterances. While this technique may have some benefit in improving dialing success rates, the repetition requirement is an inconvenience to the user.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide an apparatus and method for achieving more reliable and effective speech recognition in a communication terminal.
To achieve the above and other objects, there is disclosed in one aspect of the invention an apparatus and method for performing improved speech recognition in a communication terminal, e.g., a mobile phone with a hands-free voice dialing function. In a speech recognition mode, a user's input speech such as a desired called party name, number or a phone command, is converted to feature data and compared to a number of individual pre-stored feature data sets corresponding to pre-recorded speech obtained during a registration process. Difference values representing the respective differences (or similarity) between the current user's input speech and the respective data sets are computed. A first closest (most similar) and second closest feature data set correspond to the first smallest and second smallest difference values so obtained. A new closeness threshold is computed as the sum of a small, predetermined threshold and a differential value between the first and second difference values. If the first difference value is less than the computed closeness threshold, then the input speech is determined to match the first feature data set, whereby a positive speech recognition result is obtained. When a match occurs, an automatic dialing operation may be carried out in one application. Advantageously, the dynamic computation of a new closeness threshold based on the first and second difference values improves the success rate in matching input speech with stored speech.
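The dynamic threshold rule in this aspect can be sketched as follows. The function name `recognize_dynamic`, the dictionary of difference values, and the numeric values are hypothetical; only the rule itself (closeness threshold equals the small predetermined threshold plus the gap between the two smallest difference values, and a match requires the smallest difference to be below it) is taken from the text.

```python
def recognize_dynamic(differences, base_threshold):
    """Apply the dynamic closeness threshold to precomputed difference values.

    differences: mapping of registered name -> difference value against the
                 current input speech (smaller means more similar).
    base_threshold: the small, predetermined threshold.

    Returns (matched_name_or_None, closeness_threshold).
    """
    if len(differences) < 2:
        raise ValueError("need at least two registered feature data sets")

    ranked = sorted(differences.items(), key=lambda item: item[1])
    (best_name, first_diff), (_, second_diff) = ranked[0], ranked[1]

    # New closeness threshold: predetermined threshold plus the differential
    # between the first and second smallest difference values.
    closeness = base_threshold + (second_diff - first_diff)

    matched = best_name if first_diff < closeness else None
    return matched, closeness


# Clear winner: the runner-up is far away, so the threshold widens.
name, _ = recognize_dynamic({"alice": 0.30, "bob": 0.90, "carol": 0.95}, 0.1)
print(name)

# Near tie: the runner-up is almost as close, so the threshold stays small.
name, _ = recognize_dynamic({"alice": 0.30, "carol": 0.32}, 0.1)
print(name)
```

In the first call the gap between the two best candidates is large, so the best candidate is accepted. In the second call the gap is small, the computed threshold stays near the predetermined value, and no match is reported.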
In another aspect of the invention, an apparatus for deciding voice recognition data in a cellular phone with a voice recognition dialing function includes a memory, a vocoder, a voice recognition means, and a controller. The memory has a first region for registering feature data for an input voice, a second region for storing the number of recognition trials for that feature data, a third region for storing an accumulative mean of the threshold values obtained over the trials counted in the second region, up to and including the preceding trial, and a fourth region for storing a specified threshold value. The vocoder generates packet data from an input voice. The voice recognition means analyzes the packet data currently provided by the vocoder to generate corresponding feature data, compares the generated feature data with the feature data of reference voices pre-registered in the memory to search for similar data, and, if similar data is found, outputs an index of the found feature data and a difference value between the generated feature data and the registered feature data. The controller compares the difference value output by the voice recognition means with a predetermined threshold value; if the difference value is less than the threshold value, the feature data corresponding to the index is read out from the memory and delivered to the vocoder. The controller also calculates an accumulative mean of the threshold values for every recognition trial for the feature data up to and including the present trial, stores the accumulative mean in the third region of the memory, and updates the threshold value stored in the fourth region by reflecting the accumulative mean into it.
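The per-entry bookkeeping in this aspect (trial count, accumulative mean, and stored threshold) might be modeled as in the sketch below. The class name `ThresholdRecord`, the `blend` weight, and the choice of a weighted average to "reflect" the mean into the stored threshold are all assumptions; the text does not say how the accumulative mean is combined with the threshold, nor how each trial's threshold value is obtained.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRecord:
    """Bookkeeping for one registered feature data set (illustrative only)."""

    threshold: float             # fourth region: specified threshold value
    trials: int = 0              # second region: number of recognition trials
    mean_threshold: float = 0.0  # third region: accumulative mean of thresholds

    def record_trial(self, trial_threshold: float, blend: float = 0.5) -> float:
        """Fold one trial's threshold value into the running mean, then
        update the stored threshold.

        `blend` is a hypothetical weight: 0 keeps the old threshold,
        1 replaces it with the accumulative mean.
        """
        self.trials += 1
        # Incremental form of the arithmetic mean over all trials so far.
        self.mean_threshold += (trial_threshold - self.mean_threshold) / self.trials
        # Assumed way of reflecting the mean into the stored threshold.
        self.threshold = (1.0 - blend) * self.threshold + blend * self.mean_threshold
        return self.threshold


record = ThresholdRecord(threshold=1.0)
record.record_trial(0.8)
record.record_trial(0.6)
print(record.trials, record.mean_threshold, record.threshold)
```

The incremental mean avoids storing every past threshold value, which fits the description of a memory that holds only a trial count and a single accumulative mean per entry.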

REFERENCES:
patent: 4797929 (1989-01-01), Gerson et al.
patent: 4905288 (1990-02-01), Gerson et al.
patent: 5371779 (1994-12-01), Kobayashi
patent: 5640485 (1997-06-01), Ranta
patent: 5991364 (1999-11-01), McAllister et al.
patent: 6003004 (1999-12-01), Hershkovits et al.
patent: 6134527 (2000-10-01), Meunier et al.

Affiliated with

Park Joung-Kyou, Inventor
Dilworth & Barrese LLP, Law Firm
Korzuch William R., Examiner
McFadden Susan, Examiner
Samsung Electronics Co., Ltd, Corporate Assignee
