Reexamination Certificate
1998-08-21
2001-05-29
Korzuch, William R. (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Synthesis
C704S258000
active
06240390
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATION
This application claims the priority benefit of Taiwan application serial no. 87107658, filed May 18, 1998, the non-essential material of which is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to speech synthesizers, and more particularly, to an architecture for a speech synthesizer and a method of synthesizing speech, which allow the speech synthesizer to drive external devices in a multi-tasking manner while keeping both the software and the voice concatenation simple to implement.
2. Description of Related Art
A synthesizer is a device that combines a variety of items so as to form a new, complex product. Speech synthesizers are widely utilized in various systems where voice is used to output certain messages or data to the user, such as personal computers, mobile phones, toys, and warning systems, to name a few. A speech synthesizer is typically provided with a ROM (read-only memory) unit which stores a database of various sounds or words that can be retrieved and combined to form a stream of voices of specific meanings. This ROM unit is typically partitioned into a number of sections, called speech sections. In one standard for voice synthesizing, such speech sections are designated by H4, S1, S2, . . . , Sn, and T4. Each speech section represents one of 250 basic phonic elements that can be selected and combined into the sound data of various words or phrases. Alternatively, each speech section can store the sound data of complete words. However, this is merely a design choice by the speech synthesizer designer.
The data in each speech section can be selected for synthesizing into words or phrases through various speech equations (EQ), each EQ representing the combination of a number of selected phonic elements that are combined in accordance with the EQ to form a particular word or phrase of a specified meaning. For example, EQ = H4 + S1 + S2 + S3 + T4 may represent either a five-sound word or a five-word phrase.
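The speech-equation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name SPEECH_SECTIONS, the helper functions, and the placeholder byte values are hypothetical and do not come from the patent; only the section names (H4, S1, S2, S3, T4) and the "+" notation follow the text.

```python
# Hypothetical speech-section table. The names follow the text (H4, S1, ...),
# but the byte contents are placeholders, not real sound data.
SPEECH_SECTIONS = {
    "H4": b"\x01\x02",
    "S1": b"\x10",
    "S2": b"\x20",
    "S3": b"\x30",
    "T4": b"\xfe\xff",
}


def parse_equation(equation: str) -> list[str]:
    """Split a speech equation such as 'H4+S1+S2' into section names."""
    return [name.strip() for name in equation.split("+") if name.strip()]


def evaluate_equation(equation: str, sections: dict[str, bytes]) -> bytes:
    """Concatenate the sound data of every section named in the equation."""
    return b"".join(sections[name] for name in parse_equation(equation))


if __name__ == "__main__":
    data = evaluate_equation("H4+S1+S2+S3+T4", SPEECH_SECTIONS)
    print(len(data), "bytes of concatenated sound data")
```

Because the equation only names sections, the same stored elements can be reused by any number of equations, which is the source of the memory saving described in the text.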
The foregoing scheme of using phonic elements for the synthesizing of words allows the required memory space for the speech database to be significantly reduced as compared to the scheme of storing the sound of each word in the ROM unit. Moreover, it allows the designer to be more flexible and versatile in designing the speech synthesizer for the purpose of providing the sound data of more complex words or phrases.
One standard for speech synthesis defines one section of speech data as the combination of a number of bytes, respectively designated by H4, S1, S2, S3, and T4. This scheme is illustratively depicted in FIG. 1. Each of the bytes (H4, S1, S2, S3, T4) represents one basic constituent element of sound data and can be either a single sound, a series of sounds, a piece of music, or the combination of several pieces of music.
FIG. 2 is a schematic block diagram showing a conventional speech synthesizer, as designated by the reference numeral 10, that can be used for the synthesizing of the speech data shown in FIG. 1 into digital sound data. As shown, this speech synthesizer 10 includes a memory unit 11, such as a ROM unit, and a synthesizer 12. The ROM unit 11 is used to store a database of phonic elements and various other kinds of speech data that can be selectively retrieved for synthesizing into sound data of specific meanings. When the speech synthesizer 10 receives a trigger signal 14, the corresponding phonic elements in the ROM unit 11 are retrieved and then transferred to the synthesizer 12 for synthesizing into sound data. The synthesized sound data are then converted into audible sounds by a loudspeaker 13. One benefit of this speech synthesizer is that its system architecture is quite simple to implement.
One drawback to the foregoing speech synthesizer 10, however, is that it is only capable of outputting the synthesized speech data as audible sounds through the loudspeaker 13, but is incapable of driving external devices such as motors or light-emitting diodes (LEDs) in a multi-tasking manner at the same time.
The synthesizer 12 utilized in the speech synthesizer 10 is typically included in a state machine that can perform some I/O controls. One drawback to using the speech synthesizer in a state machine, however, is that its I/O ports can be switched to other I/O functions only at the break between two consecutive speech sections. Therefore, the architecture of FIG. 2 would not meet high quality requirements for speech synthesizers.
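The section-break limitation can be illustrated with a small Python model. It is a sketch under assumptions, not the circuit of FIG. 2: the function name, the event-log format, and the way I/O requests are keyed by section index are all hypothetical.

```python
def play_with_section_breaks(sections, io_requests):
    """Play sections in order; apply I/O changes only between sections.

    sections: list of (name, samples) pairs.
    io_requests: dict mapping a section index to the I/O state requested
        while that section is playing (hypothetical model).
    Returns an event log of ("sample", value) and ("io", state) tuples.
    """
    log = []
    for index, (_name, samples) in enumerate(sections):
        pending = io_requests.get(index)  # request arrives during playback
        for sample in samples:
            log.append(("sample", sample))  # port cannot change mid-section
        if pending is not None:
            log.append(("io", pending))  # applied only at the section break
    return log


if __name__ == "__main__":
    demo = [("S1", [1, 2]), ("S2", [3])]
    for event in play_with_section_breaks(demo, {0: "LED_ON"}):
        print(event)
```

In this model a request made while S1 is playing takes effect only after the last sample of S1, so the delay grows with the length of the speech section.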
FIG. 3A is a schematic block diagram of a conventional speech synthesizer 20 with multi-tasking capability. As shown, this speech synthesizer 20 includes a memory unit 21 such as a ROM unit, a micro-controller 22, a synthesizer 23, and a digital-to-analog converter (DAC) 24. Moreover, the speech synthesizer 20 is coupled to a loudspeaker 25. The memory unit 21 is used to store a database of phonic elements and various other kinds of speech data that can be selectively retrieved for synthesizing into sound data of specific meanings. When the speech synthesizer 20 receives a trigger signal 27, the corresponding data are retrieved under control of the micro-controller 22 from the memory unit 21 and subsequently transferred to the synthesizer 23 for synthesizing into sound data of specific meanings. The digital output from the synthesizer 23 is then converted by the DAC 24 into analog form, which is then converted by the loudspeaker 25 into audible form. The micro-controller 22 allows the speech synthesizer 20 to perform I/O functions with external devices such as motors or LEDs.
Alternatively, as shown in FIG. 3B, the micro-controller 22 and the synthesizer 23 in the speech synthesizer 20 of FIG. 3A can be replaced by a single microprocessor 26. With this architecture, both the I/O controls and the synthesizing of speech data are performed by the microprocessor 26.
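A rough Python sketch of one processor handling both jobs in a single software loop is given below. It assumes a simplified model that is not taken from the patent: samples are already retrieved from memory, I/O changes are scheduled by sample tick, and a DAC stage follows the processor as in FIG. 3A. The names run_microprocessor and io_schedule are hypothetical.

```python
def run_microprocessor(samples, io_schedule):
    """Interleave sample output with I/O updates in one software loop.

    samples: sound samples already retrieved from the memory unit.
    io_schedule: dict mapping a sample tick to an I/O state (hypothetical).
    Returns an event log of ("io", state) and ("dac", sample) tuples.
    """
    log = []
    for tick, sample in enumerate(samples):
        if tick in io_schedule:
            log.append(("io", io_schedule[tick]))  # drive a motor or LED port
        log.append(("dac", sample))  # pass the sample on toward the DAC
    return log


if __name__ == "__main__":
    for event in run_microprocessor([5, 6, 7], {1: "MOTOR_ON"}):
        print(event)
```

Unlike the state-machine model, the I/O state here can change between any two samples. The sketch leaves out retrieval and concatenation of the phonic elements, which in this architecture would also have to be written in software.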
The foregoing speech synthesizer with multi-tasking capability, however, still has a drawback in encoding. For example, voice concatenation, which is a technique for combining a number of separate phonic elements into a continuous stream of meaningful sounds, requires a very complex algorithm that can be very difficult to code into a software program. Therefore, designing the speech synthesizer is a laborious and time-consuming job. The development period typically requires at least one month.
In conclusion, the prior art has the following drawbacks.
(1) First, in respect to the prior art of FIG. 2, although its simple system architecture makes it easy to design, it is incapable of driving external devices such as motors and LEDs in a multi-tasking manner while performing the speech synthesis. Moreover, it cannot switch the output state of the I/O ports except at the break between two consecutive speech sections.
(2) Second, in respect to the prior art of FIGS. 3A-3B, its multi-tasking capability relies on a complex algorithm, which makes the programming very complex to implement. The development period is therefore quite long.
SUMMARY OF THE INVENTION
It is therefore an objective of the present invention to provide a speech synthesizer and a method of synthesizing speech that are capable of driving external devices in a multi-tasking manner while keeping the software simple.
It is another objective of the present invention to provide a speech synthesizer and a method of synthesizing speech that allow voice concatenation to be implemented easily, either through hardware or through software.
In accordance with the foregoing and other objectives of the present invention, a new speech synthesizer and a method of synthesizing speech are provided.
The speech synthesizer of the invention includes a memory unit, a voice list pointer, a start address register, a program counter, a synthesi
Blakely & Sokoloff, Taylor & Zafman
Korzuch William R.
Winbond Electronics Corp.
Šmits Tālivaldis Ivars