Prosody based audio/visual co-analysis for co-verbal gesture...


Details

Classification: C704S276000 (Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition)
Type: Reexamination Certificate
Status: active
Patent number: 07321854

ABSTRACT:
The present method incorporates audio and visual cues from human gesticulation for automatic gesture recognition. The methodology articulates a framework for co-analyzing gestures and the prosodic elements of a person's speech, and it can be applied to a wide range of algorithms that analyze gesticulating individuals. Example interactive technology applications range from information kiosks to personal computers, while video analysis of human activity provides a basis for automated surveillance technologies in public places such as airports, shopping malls, and sporting events.
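
To make the co-analysis idea concrete: the framework pairs a prosodic feature stream (e.g., the fundamental-frequency contour of speech) with a visual feature stream (e.g., hand kinematics from video) on a common timeline. The following is a minimal Python sketch of that pairing, not the patented algorithm: the signals are synthetic, the names f0_contour and hand_speed are hypothetical, and the autocorrelation pitch tracker is a bare-bones simplification of the Boersma method listed in the references below.

```python
# Minimal prosody/gesture co-analysis sketch: extract a pitch (F0) contour
# from audio via short-term autocorrelation and a hand-speed profile from a
# 2-D hand trajectory, then compare the two streams at a common frame rate.
# All signals are synthetic; function names are illustrative only.
import numpy as np

SR = 16_000   # audio sample rate (Hz)
FPS = 30      # video frame rate (frames/s)

def f0_contour(audio, sr=SR, frame=0.04, hop=1 / FPS, fmin=75, fmax=400):
    """Per-frame F0 estimate (Hz) from the autocorrelation peak."""
    n, h = int(frame * sr), int(hop * sr)
    f0 = []
    for start in range(0, len(audio) - n, h):
        x = audio[start:start + n] * np.hanning(n)
        ac = np.correlate(x, x, mode="full")[n - 1:]   # lags 0..n-1
        lo, hi = int(sr / fmax), int(sr / fmin)        # plausible pitch lags
        lag = lo + np.argmax(ac[lo:hi])
        f0.append(sr / lag if ac[lag] > 0.3 * ac[0] else 0.0)  # crude voicing gate
    return np.array(f0)

def hand_speed(xy):
    """Frame-to-frame hand speed from an (N, 2) trajectory."""
    return np.linalg.norm(np.diff(xy, axis=0), axis=1)

# Synthetic example: a pitch accent co-occurring with a gesture stroke at t = 1 s.
t = np.arange(2 * SR) / SR
pitch = 120 + 60 * np.exp(-((t - 1.0) ** 2) / 0.02)          # F0 bump near t = 1 s
audio = np.sin(2 * np.pi * np.cumsum(pitch) / SR)            # chirp at that pitch

frames = np.arange(2 * FPS)
stroke = np.exp(-((frames / FPS - 1.0) ** 2) / 0.02)         # stroke near t = 1 s
xy = np.cumsum(np.stack([stroke, np.zeros_like(stroke)], 1), axis=0)

f0, v = f0_contour(audio), hand_speed(xy)
m = min(len(f0), len(v))

# Co-analysis: correlate the prosodic and kinematic streams.
rho = np.corrcoef(f0[:m], v[:m])[0, 1]
print(f"F0/hand-speed correlation: {rho:.2f}")
```

On this synthetic data the reported correlation is high because the pitch accent and the gesture stroke were constructed to coincide at t = 1 s; a real co-analysis would replace the plain correlation with a trained statistical model, such as the hidden Markov models or Bayesian networks cited in the references below.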

REFERENCES:
patent: H1497 (1995-10-01), Marshall
patent: 6275806 (2001-08-01), Petrushin
patent: 6788809 (2004-09-01), Grzeszczuk et al.
patent: 7036094 (2006-04-01), Cohen et al.
patent: 2006/0210112 (2006-09-01), Cohen et al.
Nakamura et al., "Multimodal multi-view integrated database for human behavior understanding," IEEE, pp. 540-545, 1998.
Bolt, "Put-that-there: Voice and gesture at the graphics interface," SIGGRAPH Computer Graphics, 1980.
Lenman et al., "Computer Vision Based Hand Gesture Interfaces for Human-Computer Interaction," KTH (Royal Institute of Technology), Stockholm, Sweden, Jun. 2002.
Pavlovic et al., "Visual interpretation of hand gestures for human-computer interaction: A review," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 19, pp. 677-695, 1997.
Kettebekov et al., "Toward Natural Gesture/Speech Control of a Large Display," Engineering for Human-Computer Interaction, vol. 2254, Lecture Notes in Computer Science, Little & Nigay, eds., Berlin Heidelberg New York: Springer-Verlag, pp. 133-146, 2001.
Krahnstoever et al., "A Real-Time Framework for Natural Multimodal Interaction with Large Screen Displays," Proc. Intl. Conf. on Multimodal Interfaces, Pittsburgh, USA, pp. 349-354, 2002.
Sharma et al., "Speech/gesture interface to a visual-computing environment," IEEE Computer Graphics and Applications, vol. 20, pp. 29-37, 2000.
Rigoll et al., "High Performance Real-Time Gesture Recognition Using Hidden Markov Models," Gesture and Sign Language in Human-Computer Interaction, vol. LNAI 1371, M. Frohlich, ed., pp. 69-80, 1997.
Wilson et al., "Hidden Markov Models for Modeling and Recognizing Gesture Under Variation," Hidden Markov Models: Applications in Computer Vision, T. Caelli, ed., World Scientific, pp. 123-160, 2001.
Oviatt et al., "Integration and synchronization of input modes during multimodal human-computer interaction," Proc. Conf. on Human Factors in Computing Systems (CHI '97), pp. 415-422, 1997.
Oviatt, S., "Taming Speech Recognition Errors Within a Multimodal Interface," Communications of the ACM, vol. 43(9), pp. 45-51, 2000.
Sowa et al., "Coverbal iconic gestures for object descriptions in virtual environments: An empirical study," Proc. Conf. on Gestures: Meaning and Use, Porto, Portugal, 1999.
Sowa et al., "Temporal Symbolic Integration Applied to a Multimodal System Using Gestures and Speech," Gesture-Based Communication in Human-Computer Interaction, vol. LNAI 1739, A. Braffort et al., eds., Berlin: Springer-Verlag, pp. 291-302, 1999.
deCuetos et al., "Audio-visual intent-to-speak detection for human-computer interaction," Proc. ICASSP, Istanbul, Turkey, pp. 1325-1328, 2000.
Luettin et al., "Continuous Audio-Visual Speech Recognition," Proc. 5th European Conf. on Computer Vision, 1998.
Benoit et al., "Audio-Visual and Multimodal Speech Systems," Handbook of Standards and Resources for Spoken Language Systems, D. Gibbon, ed., 1998.
Basu et al., "Towards Measuring Human Interactions in Conversational Settings," Proc. IEEE Intl. Workshop on Cues in Communication (CUES 2001) at CVPR 2001, Kauai, Hawaii, 2001.
Graf et al., "Visual Prosody: Facial Movements Accompanying Speech," Proc. Fifth IEEE Intl. Conf. on Automatic Face and Gesture Recognition (FG'02), 2002.
Kettebekov and Sharma, "Understanding gestures in multimodal human computer interaction," Intl. J. on Artificial Intelligence Tools, vol. 9, pp. 205-224, 2000.
Kita et al., "Movement phases in signs and co-speech gestures, and their transcription by human coders," Proc. Intl. Gesture Workshop, pp. 23-35, 1997.
Quek et al., "Gesture and speech cues for conversational interaction," submitted to ACM Transactions on Computer-Human Interaction; also as VISLab Report VISLab-01-01, 2001.
Azoz et al., "Reliable tracking of human arm dynamics by multiple cue integration and constraint fusion," Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1998.
Boersma, P., "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," Institute of Phonetic Sciences of the University of Amsterdam, vol. 17, pp. 97-110, 1993.
Beckman, M.E., "The parsing of prosody," Language and Cognitive Processes, vol. 11, pp. 17-67, 1996.
Taylor, P., "The rise/fall/connection model of intonation," Speech Communication, vol. 15, pp. 169-186, 1994.
Yeo et al., "A new family of power transformations to improve normality or symmetry," Biometrika, vol. 87, pp. 954-959, 2000.
Chickering et al., "A Bayesian approach to learning Bayesian networks with local structure," Proc. 13th Conf. on Uncertainty in Artificial Intelligence, Providence, RI, pp. 80-89, 1997.
Wexelblat, A., "An Approach to Natural Gesture in Virtual Environments," MIT Media Laboratory, pp. 1-17, 1995.
Sharma et al., "Speech-Gesture Driven Multimodal Interfaces for Crisis Management," submitted to Proc. of IEEE Special Issue on Multimodal Human-Computer Interface, pp. 1-48 (no publication date provided).
Poddar et al., "Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration," pp. 1-6 (no publication date provided).
Wilson et al., "Realtime Online Adaptive Gesture Recognition" (no publication date provided).
Wilson et al., "Parametric Hidden Markov Models for Gesture Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 9, pp. 884-900, 1999.
Kim and Chien, "Analysis of 3D Hand Trajectory Gestures Using Stroke-Based Composite Hidden Markov Models," Applied Intelligence, vol. 15, pp. 131-143, 2001.
Gullberg, M., "Gesture in Spatial Descriptions," Working Papers, Lund University, Dept. of Linguistics, vol. 47, pp. 87-97, 1999.
McNeill, D., "Gesture and Language Dialectic," Acta Linguistica Hafniensia, pp. 1-25, 2002.
Henriques et al., "Geometric computations underlying eye-hand coordination: orientations of the two eyes and the head," Exp. Brain Res., vol. 152, pp. 70-78, 2003.
Desmurget et al., "Constrained and Unconstrained Movements Involve Different Control Strategies," Rapid Communication, The American Physiological Society, pp. 1644-1650, 1997.
Krauss and Hadar, "The Role of Speech-Related Arm/Hand Gestures in Word Retrieval," pre-editing version of a chapter that appeared in R. Campbell & L. Messing (eds.), Gesture, Speech and Sign, pp. 93-116, 2001.
Kingston, J., "Extrapolating from Spoken to Signed Prosody via Laboratory Phonology," Language and Speech, vol. 42(2-3), pp. 251-281, 1999.
Wachsmuth, I., "Communicative Rhythm in Gesture and Speech" (no publication date provided).
Nakatani, C.H., "The Computational Processing of Intonational Prominence: A Functional Prosody Perspective," thesis presented to the Division of Engineering and Applied Sciences, Harvard University, 1997.
Wolff et al., "Acting on a visual world: the role of perception in multimodal HCI" (no publication date provided).


Profile ID: LFUS-PAI-O-2796331
