2-D processing of speech

Data processing: speech signal processing – linguistics – language – Speech signal processing – For storage or transmission

Reexamination Certificate

Details

active

07574352

ABSTRACT:
Acoustic signals are analyzed by two-dimensional (2-D) processing of the one-dimensional (1-D) speech signal in the time-frequency plane. The short-space 2-D Fourier transform of a frequency-related representation (e.g., spectrogram) of the signal is obtained. The 2-D transformation maps harmonically-related signal components to a concentrated entity in the new 2-D plane (compressed frequency-related representation). The series of operations to produce the compressed frequency-related representation is referred to as the “grating compression transform” (GCT), consistent with sine-wave grating patterns in the frequency-related representation reduced to smeared impulses. The GCT provides for speech pitch estimation. The operations may, for example, determine pitch estimates of voiced speech or provide noise filtering or speaker separation in a multiple speaker acoustic signal.
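The chain of operations in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a synthetic steady-pitch harmonic signal, a magnitude spectrogram as the frequency-related representation, and a single time-frequency patch. The function names `stft_magnitude` and `gct_pitch`, the frame sizes, the Hamming tapers, and the 60-400 Hz search range are all hypothetical choices made for illustration. Harmonics spaced `f0` apart form a grating along the frequency axis of the patch, so the 2-D Fourier transform concentrates them near one peak, and the position of that peak along the frequency-modulation axis gives the pitch.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=64, nfft=1024):
    """Magnitude spectrogram, shape (n_frames, nfft // 2 + 1)."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, n=nfft, axis=1))

def gct_pitch(patch, bin_hz, nfft2=2048, f0_range=(60.0, 400.0)):
    """Estimate pitch from one (time, frequency) spectrogram patch.

    Assumption for this sketch: the pitch is read from the peak's
    coordinate along the frequency-modulation axis only.
    """
    patch = patch - patch.mean()                      # suppress the DC term
    taper = np.outer(np.hamming(patch.shape[0]),      # 2-D analysis window
                     np.hamming(patch.shape[1]))
    # 2-D FFT, zero-padded along the frequency axis for finer resolution.
    gct = np.abs(np.fft.fft2(patch * taper, s=(patch.shape[0], nfft2)))
    # A grating with period f0 / bin_hz bins peaks at kf = nfft2 * bin_hz / f0.
    k_lo = int(np.ceil(nfft2 * bin_hz / f0_range[1]))
    k_hi = int(np.floor(nfft2 * bin_hz / f0_range[0]))
    band = gct[:, k_lo:k_hi + 1]
    _, kf_rel = np.unravel_index(np.argmax(band), band.shape)
    return nfft2 * bin_hz / (k_lo + kf_rel)

# Synthetic voiced-like signal: 15 harmonics of 200 Hz, amplitudes 1/k.
fs, f0_true, nfft = 8000, 200.0, 1024
t = np.arange(int(0.5 * fs)) / fs
x = sum(np.sin(2 * np.pi * k * f0_true * t) / k for k in range(1, 16))

spec = stft_magnitude(x, nfft=nfft)
patch = spec[10:42, :256]            # 32 frames, roughly 0 to 2000 Hz
f0_est = gct_pitch(patch, bin_hz=fs / nfft)
print(f"estimated pitch: {f0_est:.1f} Hz")
```

With a steady pitch the harmonic lines run parallel to the time axis, so the peak sits at zero temporal modulation. A changing pitch tilts the grating and moves the peak off that axis, which this sketch does not exercise.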

REFERENCES:
patent: 5377302 (1994-12-01), Tsiang
patent: 6061648 (2000-05-01), Saito
patent: 2 280 827 (1995-02-01), None
Qiu et al. “Pitch determination of noisy speech using wavelet transform in time and frequency domains”, Oct. 19-21, 1993, IEEE TENCON '93, Beijing, vol. 3, pp. 337-340.
Openshaw et al. “Noise robust estimate of speech dynamics for speaker recognition”, Proc. ICSLP 96, 1996, pp. 925-928.
Mellor et al. “Noise masking in a transform domain”, ICASSP-93, vol. 2, 1993, pp. 87-90.
Hess, W. “An algorithm for digital time-domain pitch period determination of speech signals and its application to detect F0 dynamics in VCV utterances”, Apr. 1976, ICASSP '76, vol. 1, pp. 322-325.
Terez, D.E., “Robust pitch determination using nonlinear state-space embedding”, vol. 1, 2002, ICASSP '02, pp. I-345-I-348.
Kinsner, W. “Speech and image signal compression with wavelets”, WESCANEX 93, May 17-18, 1993, pp. 368-375.
Nawab, S.H. et al., “Signal Reconstruction from Short-Time Fourier Transform Magnitude,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-31, No. 4, Aug. 1983, pp. 986-998.
Quatieri, T.F. et al., “Frequency sampling of short-time Fourier-transform magnitude for signal reconstruction,” J. Opt. Soc. Am., 73:11 (1523-1526) Nov. 1983.
Swartz, B. and N. Magotra, “Feature Extraction for Automatic Speech Recognition (ASR),” Thirtieth Asilomar Conference on Signals, Systems & Computers, Nov. 3-6, 1996, pp. 748-752.
Ahmadi, M. et al., “Phoneme Recognition Using Speech Image (Spectrogram),” Proceedings of ICSP '96, pp. 675-677.
Tanaka, Y. and H. Kimura, “Low-Bit-Rate Speech Coding Using a Two-Dimensional Transform of Residual Signals and Waveform Interpolation,” Proc. 1994 IEEE International Conference on Acoustics, Speech and Signal Processing, Apr. 1994, pp. I-173-I-176.
Terada, T. et al., “Nonstationary Waveform Analysis and Synthesis Using Generalized Harmonic Analysis,” Proceedings of the IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis, Oct. 25-28, 1994, pp. 429-432.
Ariki, Y. et al., “Acoustic Noise Reduction by Two Dimensional Spectral Smoothing and Spectral Amplitude Transformation,” ICASSP 86, Tokyo, pp. 97-100.
Woods, J.W. and V.K. Ingle, “Two Dimensional Processing of Spectrogram Data,” Proc. 1978 IEEE International Conference on Acoustics, Speech and Signal, Apr. 10-12, 1978, pp. 39-42.
Chan, C.P. et al., “Two-Dimensional Multi-Resolution Analysis of Speech Signals and its Application to Speech Recognition,” Proceedings of 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 405-408.
Quatieri, T., “2-D Processing of Speech With Application to Pitch Estimation”, Int. Conf. on Spoken Language Processing ICSLP '02, Sep. 16-20, 2002, XP002270661.
Hinich, M., et al., “Bispectral Analysis of Speech”, Applied Research Laboratories, The University of Texas at Austin, pp. 357-360.
Van De Wouwer, G., et al., “Voice Recognition From Spectrograms: A Wavelet Based Approach”, World Scientific Publishing Company, Apr. 1997, pp. 165-172, XP008027609.
Kitamura, T., et al., “Pitch Determination by Two-Dimensional Cepstrum”, Bull. P.M.E.(T.I.T.), No. 37, 1976, pp. 25-32, XP008027607.
R.J. McAulay and T.F. Quatieri, “Pitch estimation and voicing detection based on a sinusoidal speech model,” Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Albuquerque, N.M., pp. 249-252, 1990.
Chi, T., et al., “Spectro-temporal modulation transfer functions and speech intelligibility,” J. Acoust. Soc. Am., 106(5): 2719-2732 (1999).
