Phoneme analyzer

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

C704S209000, C704S214000, C704S271000, C704S213000

active

06285979

ABSTRACT:

FIELD OF THE INVENTION
Our present invention relates to a phoneme analyzer and, more particularly, to a phoneme analysis method which operates in real time and is capable of analyzing speech. Specifically, the invention is intended to detect speech sounds in real time, and to distinguish voiced speech sounds from unvoiced or voiceless speech sounds. The information obtained by such analysis can be used to enhance the speech signal in hearing aids for the hard of hearing, to suppress noise in speech reproduction systems in conjunction with noise cancelling algorithms, to improve the quality of speech-to-text computer translations, and to make speech operated systems respond more precisely.
The invention also relates to a method facilitating fast detection of selected speech sounds in noisy real life acoustic environments and to phoneme analysis which can be implemented using very low power electrical circuitry.
BACKGROUND OF THE INVENTION
The typical structure of speech is Vowel-Consonant-Vowel (VCV) or Consonant-Vowel-Consonant (CVC). All vowels are produced by voiced sounds, although many consonants are produced with nonvoiced or voiceless (VL) sounds. The energy peaks in voiced sounds are predominantly in lower frequencies below 3 KHz. In voiceless sounds the energy peaks are predominantly in higher frequencies above 3 KHz. There is typically more energy in voiced sounds than in voiceless sounds.
One known method to discriminate voiced from voiceless sounds is to analyze the zero-crossing frequency of speech. However, this method by itself cannot provide reliable detection in noisy environments. Also, this method does not work well for women and children, who have higher-pitched voices.
For example, some vowels, such as /i/, /ea/ and /e/, have higher energy peaks (second and third formants) and may generate high zero crossing frequencies. Table 1 shows an average of the first and second formants of such American vowels for male, female and child voices:
TABLE 1
Vowel               heat    hit   when    pay
1st Formant (Hz)
  Male               270    390    530    660
  Female             310    430    610    860
  Child              370    530    690   1010
2nd Formant (Hz)
  Male              2290   1990   1840   1720
  Female            2790   2480   2330   2050
  Child             3200   2730   2610   2320
In the presence of noise (typically in lower frequencies), the zero crossing of voiceless consonants may be “pulled” down to lower frequencies.
OBJECTS OF THE INVENTION
It is the principal object of the present invention to provide a real time method of analyzing speech whereby drawbacks of earlier systems can be avoided.
Another object of this invention is to provide a method of detecting speech sounds in real time and to discriminate voiced speech from voiceless speech sounds, particularly to enhance signal processing in hearing aids, noise cancelling circuitry, speech-to-text computer applications and speech operated systems generally.
A further object of the invention is to provide a phoneme analyzer which can be realized with low power electric circuitry and is capable of fast detection of speech sounds in noisy environments.
SUMMARY OF THE INVENTION
These objects and others which will become apparent hereinafter are attained, in accordance with the invention, in a real time method of analyzing speech which comprises the steps of:
(a) obtaining a speech signal containing ambient noise in addition to voiced vowel sounds, low frequency voiceless sounds and high frequency voiceless sounds;
(b) detecting in the speech signal a voiced component having a frequency in a range of 200 Hz to about 1 KHz and generating a first output when the energy in the frequency range of 200 Hz to about 1 KHz is present in the speech signal;
(c) simultaneously detecting in the speech signal a voiceless component having a frequency greater than about 2.4 KHz and generating a second output when the frequency greater than about 2.4 KHz is present in the speech signal;
(d) simultaneously detecting in the speech signal a voiceless component having a frequency greater than about 3.4 KHz and generating a third output when the frequency greater than about 3.4 KHz is present in the speech signal;
(e) logically combining the first, second and third outputs to produce two-bit logic signals representing high-frequency voiceless sound, lower-frequency voiceless sound, selected vowel sounds and other voiced sounds; and
(f) controlling a speech processing device with the two-bit logic signals.
As will be described in greater detail hereinafter, step (c) is carried out preferably by analyzing for a zero crossing frequency above 4.8 KHz and in step (d) the speech signal is analyzed for a zero crossing frequency above 6.8 KHz, it being understood that the zero crossing frequency is twice the signal frequency.
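The relation between signal frequency and zero crossing frequency can be illustrated with a short sketch. The sample rate, the test tone and the helper names below are assumptions made for illustration only; the source does not describe how the zero crossing detectors are built.

```python
import math


def zero_crossing_frequency(samples, sample_rate):
    """Return the number of sign changes per second in `samples`."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / duration


def sine(freq_hz, sample_rate, seconds, phase=0.1):
    """Hypothetical test tone; the phase offset keeps samples off exact zero."""
    count = int(sample_rate * seconds)
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate + phase)
        for i in range(count)
    ]


SAMPLE_RATE = 48_000  # assumed value, not taken from the source

# A 3 KHz tone crosses zero about 6000 times per second (twice per cycle).
tone = sine(3_000, SAMPLE_RATE, 1.0)
zcf = zero_crossing_frequency(tone, SAMPLE_RATE)

above_step_c = zcf > 4_800  # step (c): signal frequency above about 2.4 KHz
above_step_d = zcf > 6_800  # step (d): signal frequency above about 3.4 KHz
```

For a pure tone, a sign-change count of this kind tracks twice the tone frequency; real speech is not a pure tone, so the count only indicates the predominant frequency.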
According to a feature of the invention, in step (b) an energy level is measured in the 200 to 1000 Hz band of the speech signal, and the current measured energy level is compared with a base level. The base level is measured during an interval in which the speech signal contains no voiced component, so that only ambient noise and high frequency unvoiced speech sounds, which represent noise in the speech signal, are present.
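The comparison against a base level might be sketched as follows. The class name, the decision ratio and the smoothing factor are hypothetical; the source states only that the current band energy is compared with a base level taken while no voiced component is present.

```python
class VoicedDetector:
    """Compare band energy with a base level learned from non-voiced intervals.

    `ratio` and `smoothing` are illustrative assumptions, not source values.
    """

    def __init__(self, ratio=2.0, smoothing=0.9):
        self.ratio = ratio
        self.smoothing = smoothing
        self.base_level = None

    def update_base(self, band_energy):
        """Call only for intervals known to contain no voiced component."""
        if self.base_level is None:
            self.base_level = band_energy
        else:
            # Exponential smoothing so the base level follows slow noise changes.
            self.base_level = (
                self.smoothing * self.base_level
                + (1.0 - self.smoothing) * band_energy
            )

    def is_voiced(self, band_energy):
        """First output: true when energy clearly exceeds the base level."""
        if self.base_level is None:
            return False
        return band_energy > self.ratio * self.base_level


detector = VoicedDetector()
for noise_energy in (1.0, 1.2, 0.8):  # made-up energies from non-voiced intervals
    detector.update_base(noise_energy)
```

Deciding which intervals count as non-voiced is left to the caller in this sketch; in practice that decision would have to come from the rest of the analyzer.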
More particularly, the purpose of the invention is to provide reliable discrimination between the following sounds:
a) high frequency voiceless sounds such as fricatives (/s/ and /sh/) with a frequency predominantly greater than 3.4 KHz (or zero crossing frequency predominantly greater than 6.8 KHz).
b) lower frequency voiceless sounds, such as fricatives (/s/ and /sh/) in a noisy environment, with a frequency predominantly greater than 2.4 KHz (or zero crossing frequency predominantly greater than 4.8 KHz).
c) high frequency vowels such as /i/, /ea/, where the predominant frequency in a female voice is around 2.7 KHz but does not exceed 3.3 KHz (even in the case of a child).
d) all other vowels and voiced sounds including nasal.
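As a rough check of item c) against Table 1, the sketch below lists which averaged second formants fall between the 2.4 KHz and 3.4 KHz thresholds. Treating the second formant as the predominant frequency is an assumption made here for illustration; the variable and function names are hypothetical.

```python
# Second formant averages in Hz, copied from Table 1.
SECOND_FORMANT_HZ = {
    "heat": {"male": 2290, "female": 2790, "child": 3200},
    "hit": {"male": 1990, "female": 2480, "child": 2730},
    "when": {"male": 1840, "female": 2330, "child": 2610},
    "pay": {"male": 1720, "female": 2050, "child": 2320},
}

LOWER_THRESHOLD_HZ = 2400  # step (c) threshold
UPPER_THRESHOLD_HZ = 3400  # step (d) threshold


def in_vowel_window(freq_hz):
    """True when a frequency trips the 2.4 KHz detector but not the 3.4 KHz one."""
    return LOWER_THRESHOLD_HZ < freq_hz < UPPER_THRESHOLD_HZ


window_hits = sorted(
    (word, voice)
    for word, voices in SECOND_FORMANT_HZ.items()
    for voice, freq_hz in voices.items()
    if in_vowel_window(freq_hz)
)
highest = max(f for voices in SECOND_FORMANT_HZ.values() for f in voices.values())
```

Under that assumption, none of the tabulated values reaches the 3.4 KHz threshold, while several female and child values lie above 2.4 KHz, which is consistent with the need for a separate class for high frequency vowels.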
The advantage of the analysis method described herein is its operation in the frequency domain without dependency on the amplitude. Typically the envelope of the speech has higher levels for vowels than for voiceless consonants (or the ambient noise). The difference can be further enhanced for the vowels /i/ and /ee/ by means of a band pass filter in the band 200-1000 Hz. This is because most voiceless sounds will have most of their energy above 2 KHz and the ambient noise is typically concentrated below 500 Hz. The first formant of the /i/ is around 300-400 Hz for a male voice and 400-600 Hz for a female voice.
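The effect of restricting the measurement to the 200-1000 Hz band can be illustrated numerically. The sketch below substitutes a direct discrete Fourier transform for the band pass filter named in the text, purely for clarity; the sample rate, frame length and test tones are assumptions.

```python
import cmath
import math


def band_energy(samples, sample_rate, lo_hz=200.0, hi_hz=1000.0):
    """One-sided spectral energy of `samples` between lo_hz and hi_hz."""
    n = len(samples)
    k_lo = math.ceil(lo_hz * n / sample_rate)
    k_hi = math.floor(hi_hz * n / sample_rate)
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        # Direct DFT coefficient for bin k (slow, but dependency free).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        energy += abs(coeff) ** 2
    return energy / n**2


SAMPLE_RATE = 16_000  # assumed
FRAME = 320           # 20 ms frame, assumed


def tone(freq_hz):
    """Hypothetical unit-amplitude test tone of one frame."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(FRAME)]


vowel_like = band_energy(tone(300), SAMPLE_RATE)        # near a first formant of /i/
fricative_like = band_energy(tone(4_000), SAMPLE_RATE)  # voiceless-like energy
```

Both tones have the same amplitude, yet only the 300 Hz tone contributes energy inside the band, which is the contrast the band restriction is meant to provide.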
The analyzer comprises a stage to detect energy in restricted frequency bands and three separate detectors of frequency thresholds for:
Voiceless (VL): detects crossing a threshold of 3.4 KHz;
e or VL: detects crossing a threshold of 2.4 KHz; and
Voiced: detects voiced component via the speech envelope in the band 200-1000 Hz.
The logic outputs of the three detectors are combined into two-bit logic code expressing the four possible results of the phoneme analysis.
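One possible way to combine the three detector outputs is sketched below. The bit assignments and the priority order of the rules are assumptions; the source states only that three logic outputs are combined into a two-bit code expressing four results.

```python
from enum import IntEnum


class PhonemeClass(IntEnum):
    """Two-bit result codes; the numeric values are hypothetical."""

    OTHER_VOICED = 0b00          # all other vowels and voiced sounds
    HIGH_FREQUENCY_VOWEL = 0b01  # selected vowels such as /i/
    LOWER_VOICELESS = 0b10       # voiceless sound above 2.4 KHz only
    HIGH_VOICELESS = 0b11        # voiceless sound above 3.4 KHz


def combine(voiced, above_2_4_khz, above_3_4_khz):
    """Map the first, second and third detector outputs to a two-bit code."""
    if above_3_4_khz:
        return PhonemeClass.HIGH_VOICELESS
    if above_2_4_khz:
        # Same frequency range; the voiced output separates vowel from fricative.
        if voiced:
            return PhonemeClass.HIGH_FREQUENCY_VOWEL
        return PhonemeClass.LOWER_VOICELESS
    return PhonemeClass.OTHER_VOICED


code = combine(voiced=True, above_2_4_khz=True, above_3_4_khz=False)
bits = format(code, "02b")
```

In this sketch the 3.4 KHz output takes priority over the other two, and a frame that trips no detector falls into the remaining class; a real device might treat that case differently.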
When detecting the energy of the voiced component in the restricted frequency band, the ambient noise (especially multi-talker speech noise) may interfere with the measurement by creating fluctuations of the energy in this band that are unrelated to the speech envelope, which typically fluctuates between vowels (increased) and voiceless consonants (reduced).
In its apparatus aspects, the invention can comprise a phoneme analyzer provided with means for obtaining a speech signal containing ambient noise in addition to voiced vowel sounds, low frequency voiceless sounds and high frequency voiceless sounds, means connected to the input means for detecting a voiced component having a frequency in the range of 200 Hz to about 1 KHz and generating a first output when energy in the frequency range of 200 Hz and 1 KHz is present in the speech signal, means also connected to the input for simultaneously detecting in the speech signal a voiceless component having a frequency greater than about 2.4 KHz for generating a second output, e.g. in the form of a zero crossing detector responding at a zero cross frequency above
