Reexamination Certificate
1998-11-17
2001-02-20
Šmits, Tālivaldis I. (Department: 2641)
Data processing: speech signal processing, linguistics, language
Speech signal processing
Application
C704S246000, C704S231000
active
06192342
ABSTRACT:
BACKGROUND
1. Field of the Invention
The present invention relates to camera presets and, more particularly, to setting a camera to view a talker based upon talker identification.
2. Description of the Related Art
Camera presets are used to help ensure that a correct talker is being viewed by a camera. Camera presets are typically implemented by manually setting pan, tilt and zoom parameters of each camera for each talker prior to the talker being recorded. For example, prior to a videoconference commencing, a camera operator focuses on each conference participant and causes a videoconference system to record the preset data for each camera for each participant. During the videoconference, the operator and/or a conference participant selects the proper camera preset depending on which talker is talking in the videoconference. Upon selection of a camera preset, the camera points and zooms to the preset location to view the talker corresponding to the camera preset. Although the implementation and use of manual presets are relatively simple, manual presets require considerable and ongoing operator intervention.
Another technique uses triangulation of sound to automatically point and zoom a camera to track and properly record a talker (e.g., a talking videoconference participant). Such a technique allows tracking the current talker even as the talker moves around a room. Such a technique is disclosed in U.S. patent application Ser. No. 09/187,081, filed Nov. 6, 1998, entitled “Acoustic Source Location Using A Microphone Array,” naming Pi Sheng Chang, Aidong Ning, Michael G. Lambert and Wayne J. Haas as inventors, and which is incorporated herein by reference in its entirety. Although effective, such a technique can be relatively complex and expensive to implement.
SUMMARY
It has been discovered that a camera can be targeted to a talker using voice recognition based on the talker's known voiceprint. Such a method and system provide a more efficient solution than manual presets and a simpler, less expensive solution than triangulation auto-tracking. They also provide robust, inexpensive tracking that is inherently immune to the acoustic problems of most tracking techniques. Specialized hardware will typically not be required, because microphone audio may be sampled through a sound card and known voice recognition applications may be used.
In one embodiment, a method for targeting a camera uses voice recognition analysis. Audio information is received by a talker identification (TID) module from a microphone. The TID module automatically performs a voice recognition analysis on the audio information to uniquely identify which of a plurality of talkers is talking. The camera is automatically controlled to target a camera preset location corresponding to the talker identified to be talking.
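The flow of this embodiment (audio in, talker identified, preset selected) can be illustrated with a minimal sketch. It is not the patented implementation: the feature vectors, the cosine-similarity match, the 0.8 threshold, and the names Preset, identify_talker and target_camera are all assumptions introduced here for illustration, standing in for whatever voice recognition analysis the TID module actually performs.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Preset:
    """Hypothetical camera preset: pan and tilt in degrees, zoom as a factor."""
    pan: float
    tilt: float
    zoom: float


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity between two feature vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_talker(
    features: Sequence[float],
    voiceprints: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the source
) -> Optional[str]:
    """Return the best-matching enrolled talker, or None if nobody matches."""
    best_name, best_score = None, threshold
    for name, voiceprint in voiceprints.items():
        score = cosine_similarity(features, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def target_camera(
    features: Sequence[float],
    voiceprints: Dict[str, Sequence[float]],
    presets: Dict[str, Preset],
) -> Optional[Preset]:
    """Map incoming audio features to the preset of the identified talker."""
    talker = identify_talker(features, voiceprints)
    return presets.get(talker) if talker is not None else None


if __name__ == "__main__":
    voiceprints = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    presets = {"alice": Preset(-30.0, 0.0, 2.0), "bob": Preset(25.0, -5.0, 3.0)}
    print(target_camera([0.9, 0.1, 0.0], voiceprints, presets))
```

Returning None when no voiceprint clears the threshold leaves the camera where it is, which is one plausible way to handle an unrecognized voice; the source does not say how that case is treated.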
In another embodiment, an apparatus for targeting a camera includes a camera targeting controller for automatically targeting a camera to one of a plurality of camera presets responsive to receiving audio information and identifying the audio information as corresponding to talker identification information which uniquely identifies a talker and which corresponds to the one of the camera presets.
In another embodiment, a method for targeting a camera includes saving talker/camera combination information for a talker/camera combination. The talker/camera combination information includes talker identification information for identifying the talker by voice and camera preset information corresponding to the location of the talker identified by the voice pattern. The method further includes the following: determining whether subsequent talker/camera combinations are to be saved; saving subsequent talker/camera combinations if subsequent talker/camera combinations are to be saved; receiving first audio information; recognizing a first talker by determining whether the first audio information corresponds to first talker identification information of the saved talker identification information; determining first camera preset information corresponding to the first talker identification information; and targeting a camera preset location indicated by the first camera preset information.
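The two phases of this method (saving talker/camera combinations, then recognizing a talker and targeting the saved preset) might be organized as in the following sketch. Everything named here is hypothetical: the TalkerCameraRegistry class, the (pan, tilt, zoom) tuple layout, the recognizer callable, and the moves list that stands in for real camera commands are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical preset layout: (pan, tilt, zoom).
PresetInfo = Tuple[float, float, float]


@dataclass
class TalkerCameraRegistry:
    # Assumed recognizer: maps raw audio to a talker ID, or None if unknown.
    recognizer: Callable[[bytes], Optional[str]]
    combos: Dict[str, PresetInfo] = field(default_factory=dict)
    # Stand-in for camera control: records each preset the camera was sent to.
    moves: List[PresetInfo] = field(default_factory=list)

    def save_combination(self, talker_id: str, preset: PresetInfo) -> None:
        """Save one talker/camera combination."""
        self.combos[talker_id] = preset

    def enroll_all(self, pending: Iterable[Tuple[str, PresetInfo]]) -> None:
        """Keep saving combinations for as long as more remain to be saved."""
        for talker_id, preset in pending:
            self.save_combination(talker_id, preset)

    def handle_audio(self, audio: bytes) -> Optional[PresetInfo]:
        """Recognize the talker, look up the preset, and target the camera."""
        talker_id = self.recognizer(audio)
        if talker_id is None or talker_id not in self.combos:
            return None  # no saved combination matches; camera is not moved
        preset = self.combos[talker_id]
        self.moves.append(preset)  # a real system would command the camera
        return preset


if __name__ == "__main__":
    # Toy recognizer keyed on literal byte strings, purely for demonstration.
    fake_recognizer = {b"a-voice": "alice", b"b-voice": "bob"}.get
    registry = TalkerCameraRegistry(recognizer=fake_recognizer)
    registry.enroll_all([("alice", (-30.0, 0.0, 2.0)), ("bob", (25.0, -5.0, 3.0))])
    print(registry.handle_audio(b"b-voice"))
```

Passing the recognizer in as a callable keeps the enrollment and lookup logic independent of any particular voice recognition application, in line with the summary's remark that known applications may be used.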
REFERENCES:
patent: 4264928 (1981-04-01), Schober
patent: 4531024 (1985-07-01), Colton et al.
patent: 5469529 (1995-11-01), Bimbot et al.
patent: 5794204 (1998-08-01), Miyazawa et al.
patent: 5959667 (1999-09-01), Maeng
Serial No. 09/187,248, filed Nov. 6, 1998, entitled “Method and Apparatus for Reducing Camera Movements in a Video Conference System,” naming Wayne J. Haas and Michael G. Lambert as inventors.
Serial No. 09/187,081, filed Nov. 6, 1998, entitled “Acoustic Source Location Using a Microphone Array,” naming Pi Sheng Chang and Aidong Ning as inventors.
Serial No. 09/187,202, filed Nov. 6, 1998, entitled “Apparatus and Method for Avoiding Invalid Camera Positioning in a Video Conference,” naming Michael G. Lambert and Pi Sheng Chang as inventors.
Nolan Daniel A.
Skjerven Morrill & MacPherson LLP
Terrile Stephen A.
Vtel Corporation
Šmits Tālivaldis I.