Method and system for preselection of suitable units for...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis

Reexamination Certificate

Details

C704S258000, C704S266000

active

06684187

ABSTRACT:

TECHNICAL FIELD
The present invention relates to a system and method for increasing the speed of a unit selection synthesis system for concatenative speech synthesis and, more particularly, to predetermining a universe of phonemes—selected on the basis of their triphone context—that are potentially used in speech. Real-time selection is then performed from the created phoneme universe.
BACKGROUND OF THE INVENTION
A current approach to concatenative speech synthesis is to use a very large database of recorded speech that has been segmented and labeled with prosodic and spectral characteristics, such as the fundamental frequency (F0) for voiced speech, the energy or gain of the signal, and the spectral distribution of the signal (i.e., how much of the signal is present at any given frequency). The database contains multiple instances of speech sounds. This multiplicity makes it possible to have units in the database that are much less stylized than would occur in a diphone database (a “diphone” being defined as the second half of one phoneme followed by the initial half of the following phoneme, a diphone database generally containing only one instance of any given diphone). Therefore, the possibility of achieving natural speech is enhanced with the “large database” approach.
For good quality synthesis, this database technique relies on being able to select the “best” units from the database, that is, the units that are closest in character to the prosodic specification provided by the speech synthesis system, and that have a low spectral mismatch at the concatenation points between phonemes. The “best” sequence of units may be determined by assigning two kinds of numerical cost. First, a “target cost” is associated with the individual units in isolation, where a lower cost is associated with a unit that has characteristics (e.g., F0, gain, spectral distribution) relatively close to the unit being synthesized, and a higher cost is associated with units having a higher discrepancy with the unit being synthesized. A second cost, referred to as the “concatenation cost”, is associated with how smoothly two contiguous units are joined together. For example, if the spectral match between units is poor, perhaps even corresponding to an audible “click”, there will be a higher concatenation cost.
Thus, a set of candidate units for each position in the desired sequence can be formulated, with associated target costs and concatenation costs. The best (lowest-cost) path through the resulting network is then found using a Viterbi search. The chosen units may then be concatenated to form one continuous signal, using a variety of different techniques.
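The lowest-cost path search described above can be illustrated with a minimal sketch. It is not taken from the patent: the function name `viterbi_select`, the callables `target_cost` and `concat_cost`, and the list-of-lists layout of `candidates` are all assumptions chosen for illustration, and real systems would use far richer cost functions.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Unit = TypeVar("Unit")


def viterbi_select(
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[int, Unit], float],
    concat_cost: Callable[[Unit, Unit], float],
) -> Tuple[List[Unit], float]:
    """Return the lowest-cost unit sequence and its total cost.

    candidates[i] holds the candidate units for position i (hypothetical layout).
    target_cost(i, u) scores unit u against the specification at position i.
    concat_cost(p, u) scores the join between consecutive units p and u.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate unit")

    # best[j]: lowest total cost of any path ending in candidates[i][j]
    best = [target_cost(0, u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index for unit j

    for i in range(1, len(candidates)):
        new_best, pointers = [], []
        for u in candidates[i]:
            # Cost of reaching u from each unit at the previous position
            costs = [best[k] + concat_cost(p, u)
                     for k, p in enumerate(candidates[i - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(i, u))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final unit
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

The inner loop visits every pair of candidates at adjacent positions, so the work grows with the product of the candidate-list sizes. This is why shrinking the candidate lists reduces the computation needed at synthesis time.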
While such database-driven systems may produce a more natural sounding voice quality, to do so they require a great deal of computational resources during the synthesis process. Accordingly, there remains a need for new methods and systems that provide natural voice quality in speech synthesis while reducing the computational requirements.
SUMMARY OF THE INVENTION
The need remaining in the prior art is addressed by the present invention, which relates to a system and method for increasing the speed of a unit selection synthesis system for concatenative speech and, more particularly, to predetermining a universe of phonemes in the speech database, selected on the basis of their triphone context, that are potentially used in speech, and performing real-time selection from this precalculated phoneme universe.
In accordance with the present invention, a triphone database is created where, for any given triphone context required for synthesis, there is a complete, precalculated list of all the units (phonemes) in the database that can possibly be used in that triphone context. Advantageously, this list is (in most cases) a significantly smaller set of candidate units than the complete set of units of that phoneme type. By ignoring units that are guaranteed not to be used in the given triphone context, the selection process speed is significantly increased. It has also been found that speech quality is not compromised with the unit selection process of the present invention.
Depending upon the unit required for synthesis, as well as the surrounding phoneme context, the number of phonemes in the preselection list will vary and may, at one extreme, include all possible phonemes of a particular type. There may also arise a situation where the unit to be synthesized (plus context) does not match any of the precalculated triphones. In this case, the conventional single phoneme approach of the prior art may be employed, using the complete set of phonemes of a given type. It is presumed that these instances will be relatively infrequent.
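The preselection lookup and its fallback can be sketched as follows. This is an illustrative sketch only: the names `preselect_candidates`, `preselected`, and `units_by_phoneme`, and the use of integer unit identifiers, are assumptions and do not come from the patent, which does not specify a data layout here.

```python
from typing import Dict, List, Tuple

# (left phoneme, centre phoneme, right phoneme); hypothetical representation
Triphone = Tuple[str, str, str]


def preselect_candidates(
    triphone: Triphone,
    preselected: Dict[Triphone, List[int]],
    units_by_phoneme: Dict[str, List[int]],
) -> List[int]:
    """Return candidate unit ids for the centre phoneme of `triphone`.

    preselected maps a triphone context to its precalculated unit list.
    units_by_phoneme maps a phoneme type to every unit of that type.
    """
    _left, centre, _right = triphone
    shortlist = preselected.get(triphone)
    if shortlist:
        # Triphone context was precalculated: use the (usually smaller) list
        return shortlist
    # No matching triphone: fall back to all units of the phoneme type
    return units_by_phoneme.get(centre, [])
```

For example, with `preselected = {("k", "ae", "t"): [3, 17]}` and `units_by_phoneme = {"ae": [3, 8, 17, 21]}`, the context `("k", "ae", "t")` yields the two preselected units, while an unseen context such as `("b", "ae", "t")` falls back to all four `"ae"` units.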

