Speech duration processing method and apparatus for Chinese...

Data processing: speech signal processing – linguistics – language – Speech signal processing – Synthesis

Reexamination Certificate

Details

C704S267000

active

06542867

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to a speech duration processing method and apparatus for deciding the speech duration of synthesized speech to obtain good sound quality.
2. Description of the Related Art
Using Chinese as an example, the synthesizing units used in a Chinese speech synthesizing system are generally classified into two types: (1) monosyllables (408 kinds, not including the four tones); and (2) phonemes (including 21 Chinese phonetic consonants and 38 vowels). Regardless of whether monosyllables or phonemes are used as synthesizing units, factors such as the phonemes, tones, phrase construction, locations in phrases, locations in sentences, and the preceding and following connected phonemes of the synthesizing units decide the speech duration of each synthesizing unit, and can have a large effect on the naturalness of the synthesized speech.
A conventional speech duration processing apparatus for a Chinese text-to-speech system has been disclosed in R.O.C. Patent Application No. 80100559, entitled “Speech Duration Processing Apparatus for Text-to-Speech System.”
FIG. 9 is a block diagram illustrating a speech duration processing apparatus for determining the speech duration according to the phonemes, tones and the locations in the sentence. As shown in FIG. 9, 110 denotes a memory portion for storing different data. 120 denotes a pinyin sentence input portion for inputting pinyin sentences of any length and formed from pinyin markers and tone markers. 130 denotes a syllable inspecting portion for inspecting syllables in the sentence inputted from the pinyin sentence input portion 120 with the use of the tone markers. 150 denotes a syllable-phoneme look-up memory portion for storing phonemes composed from each of the syllables. 140 denotes a phoneme inspecting portion for inspecting the phonemes in the inputted pinyin sentence with the use of the syllable-phoneme look-up memory portion 150, and for inspecting the location of each phoneme in the sentence. 170 denotes a speech duration numerical data storage portion for storing speech duration count data defined according to class of the phoneme, tone of the phoneme, and location of the phoneme in the sentence. 160 denotes a speech duration inspecting portion for calculating a syllable speech duration by using the inspected phoneme designated number, tones of each of the phonemes and locations of each of the phonemes in the sentence as indexing keys to retrieve the speech duration numerical data of each of the phonemes from the speech duration count data storage portion 170.
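The look-up performed by the conventional apparatus can be pictured as a table keyed by phoneme designated number, tone, and location in the sentence, with the syllable duration obtained by summing the retrieved phoneme durations. The following is only a minimal sketch under that reading: the table contents, the phoneme numbers, the location labels, and the names `DURATION_TABLE` and `syllable_duration` are all invented for illustration and do not come from the cited application.

```python
from typing import Dict, List, Tuple

# Key: (phoneme designated number, tone, location in sentence).
# Value: duration in milliseconds. Every entry here is hypothetical.
DurationKey = Tuple[int, int, str]

DURATION_TABLE: Dict[DurationKey, int] = {
    (3, 1, "initial"): 80,
    (27, 1, "initial"): 190,
    (3, 1, "final"): 95,
    (27, 1, "final"): 240,
}


def syllable_duration(phonemes: List[int], tone: int, location: str) -> int:
    """Sum the table durations of the phonemes that make up one syllable."""
    return sum(DURATION_TABLE[(p, tone, location)] for p in phonemes)


# The same syllable (phonemes 3 and 27, tone 1) at two sentence locations.
initial_ms = syllable_duration([3, 27], tone=1, location="initial")
final_ms = syllable_duration([3, 27], tone=1, location="final")
print(initial_ms, final_ms)
```

Because the key holds no phrase information, two occurrences of a syllable at the same sentence location always receive the same duration in this sketch, whatever phrase they belong to.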
In the aforesaid conventional speech duration processing apparatus, only the phonemes, tones and locations of the phonemes in the sentence are considered. Whether the synthesizing units form phrases, and the effect of their locations within phrases on the speech duration, should be considered as well. For example, in a three-character phrase, the speech duration of the second character is the shortest, followed by that of the first character, and the speech duration of the third character is the longest. In the example […], […] forms a three-character phrase. The speech duration generated by the conventional speech duration processing apparatus for the first […] character and the second […] character is about 339 ms. However, the speech durations for natural language pronunciation, as measured with a sound recording instrument, are 275 ms and 302 ms, respectively, which is a relatively large difference. Thus, a speech duration obtained by considering only the phonemes, tones and locations of the phonemes in the sentence is inaccurate and lowers the quality of the synthesized speech.
SUMMARY OF THE INVENTION
Therefore, the main object of the present invention is to provide a speech duration processing method and apparatus for a Chinese text-to-speech system capable of overcoming the aforesaid drawback.
According to a first aspect of the invention, a speech duration processing method for a Chinese text-to-speech system using Chinese phonemes as a basic processing unit comprises:
constructing a dictionary for storing Chinese vocabulary and corresponding information, such as phonetic markers, parts of speech, expansion syntax, etc.;
constructing a syllable-phoneme look-up portion for storing information, such as phoneme designated numbers (including consonant designated numbers and vowel designated numbers) corresponding to each syllable for all of the Chinese syllables, etc.;
constructing a basic speech duration storage portion for storing basic speech duration information classified according to phonemes;
constructing a speech duration parameter storage portion for storing speech duration parameters according to tones of the syllables to which each of the phonemes belongs, the phrase construction and the locations in the phrases, the locations in the sentence, and the class of the connected phonemes;
inspecting positions of the syllables of each vocabulary in an input sentence of any length by comparing with the vocabulary stored in the dictionary;
generating a phonetic representation of each syllable of each inspected vocabulary according to the phonetic markers stored in the dictionary;
inspecting the part of speech and the expansion syntax of each inspected vocabulary with reference to the dictionary;
combining the vocabulary in the sentence into phrases according to the expansion syntax and relationship of the parts of speech of adjacent ones of the vocabulary;
inspecting each syllable in the generated text phonetic markers with the use of tone markers;
inspecting the phoneme formation of each inspected syllable with reference to the information in the syllable-phoneme look-up portion;
retrieving the speech duration of each inspected phoneme from the basic speech duration storage portion; and
calculating the speech duration of each of the inspected phonemes that form each of the inspected syllables from the basic speech duration and the parameters associated with the tones, the phrase construction, the locations in the phrases, the locations in the sentence, and the class of the front and rear adjacent phonemes of the inspected phonemes, and tallying the speech duration of the inspected phonemes to obtain the speech duration of each of the inspected syllables.
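The final two steps of the first aspect (retrieving a basic duration per phoneme, then adjusting it by tone, phrase, sentence, and neighbouring-phoneme parameters and tallying per syllable) can be sketched as follows. The text does not state how the parameters are combined, so this sketch assumes they act as multiplicative scale factors; that choice, every numeric value, and all names (`BASE_MS`, `PhonemeContext`, `phoneme_duration`, and so on) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical basic durations in ms, keyed by phoneme designated number.
BASE_MS: Dict[int, float] = {3: 80.0, 27: 200.0}

# Hypothetical parameters, ASSUMED here to be multiplicative factors.
TONE: Dict[int, float] = {1: 1.0, 4: 0.9}
# Key: (phrase length in characters, 1-based position within the phrase).
PHRASE_POS: Dict[Tuple[int, int], float] = {(3, 1): 0.95, (3, 2): 0.85, (3, 3): 1.1}
SENTENCE_POS: Dict[str, float] = {"initial": 1.0, "medial": 1.0, "final": 1.2}
NEIGHBOR: Dict[str, float] = {"none": 1.0, "consonant": 0.95, "vowel": 1.0}


@dataclass
class PhonemeContext:
    number: int        # phoneme designated number
    tone: int          # tone of the syllable the phoneme belongs to
    phrase_len: int    # number of characters in the enclosing phrase
    phrase_index: int  # 1-based position of the syllable in the phrase
    sentence_pos: str  # location of the syllable in the sentence
    prev_class: str    # class of the preceding adjacent phoneme
    next_class: str    # class of the following adjacent phoneme


def phoneme_duration(c: PhonemeContext) -> float:
    """Scale the basic duration by each context parameter (assumed product)."""
    return (
        BASE_MS[c.number]
        * TONE[c.tone]
        * PHRASE_POS[(c.phrase_len, c.phrase_index)]
        * SENTENCE_POS[c.sentence_pos]
        * NEIGHBOR[c.prev_class]
        * NEIGHBOR[c.next_class]
    )


def syllable_duration(phonemes: List[PhonemeContext]) -> float:
    """Tally the durations of the phonemes that form one syllable."""
    return sum(phoneme_duration(p) for p in phonemes)


def make_syllable(phrase_index: int) -> List[PhonemeContext]:
    """Build a consonant + vowel syllable at one position of a 3-character phrase."""
    return [
        PhonemeContext(3, 1, 3, phrase_index, "medial", "none", "vowel"),
        PhonemeContext(27, 1, 3, phrase_index, "medial", "consonant", "none"),
    ]


durations = [syllable_duration(make_syllable(i)) for i in (1, 2, 3)]
print(durations)
```

With these invented factors the middle syllable of the three-character phrase comes out shortest and the last one longest, which matches the ordering described in the background section; the actual parameter values would have to be derived from measured speech.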
According to a second aspect of the invention, a speech duration processing method for a Chinese text-to-speech system using Chinese syllables as a basic processing unit comprises:
constructing a dictionary for storing Chinese vocabulary and corresponding information, such as phonetic markers, parts of speech, expansion syntax, etc.;
constructing a basic speech duration storage portion for storing basic speech duration information classified according to the syllables;
constructing a speech duration parameter storage portion for storing speech duration parameters according to tones of each of the syllables, the phrase construction and the locations in the phrases, the locations in the sentence, and the class of the connected syllables;
inspecting positions of the syllables of each vocabulary in an input sentence of any length by comparing with the vocabulary stored in the dictionary;
generating a phonetic representation of each syllable of each inspected vocabulary according to the phonetic markers stored in the dictionary;
inspecting the part of speech and the expansion syntax of each inspected vocabulary with reference to the dictionary;
combining the vocabulary in the sentence into phrases according to the expansion syntax and relationship of the parts of speech of adjacent ones of the vocabulary;
inspecting each syllable in the generated text phonetic markers with the use of tone markers;
retrieving the speech duration of each inspected syllable from the basic speech duration storage portion; and
calculating the speech duration of each of the inspected syllables from the basic speech duration and the parameters associated wit
