Pose-adaptive face detection system and process
Patent number: 06671391
Type: Reexamination Certificate (active)
Filed: 2000-05-26
Issued: 2003-12-30
Examiner: Ahmed, Samir (Department: 2623)
Classification: Image analysis – Applications – Personnel identification (C382S159000)
BACKGROUND
1. Technical Field
This invention is directed toward a face detection system and process for detecting the presence of faces of people depicted in an input image, and more particularly to such a face detection system and process that also identifies the face pose of each detected face.
2. Background Art
Face detection systems essentially operate by scanning an image for regions having attributes which would indicate that a region contains a person's face. To date, such systems have been quite limited in that detection is only possible in regions depicting an essentially frontal view of a person's face. In addition, current detection systems have difficulty finding face regions in images captured under lighting conditions, or at face scales, different from those the system was originally designed to handle.
The related problem of recognizing the people depicted in an image from the appearance of their faces has also been studied for many years. Face recognition systems and processes essentially operate by comparing some type of training images depicting people's faces (or representations thereof) to an image or representation of a person's face extracted from an input image. In the past, most of these systems required that both the original training images and the input image region be essentially frontal views of the person. This is limiting because, to obtain an input image containing a frontal view of the face of the person being identified, that person either had to be purposefully positioned in front of a camera, or a frontal view had to be found and extracted from a non-staged input image (assuming such a view exists in the image).
More recently there have been attempts to build face detection and recognition systems that work with faces rotated out of plane. For example, one approach for recognizing faces under varying poses is the Active Appearance Model proposed by Cootes et al. [1], which deforms a generic 3-D face model to fit the input image and uses the resulting control parameters as features fed to a classifier. Another approach is based on transforming an input image into stored prototypical faces and then using direct template matching to recognize the person whose face is depicted in the input image. This method is explored in the papers by Beymer [2], Poggio [3] and Vetter [4].
It is noted that in the preceding paragraphs, as well as in the remainder of this specification, the description refers to various individual publications identified by a numeric designator contained within a pair of brackets. For example, such a reference may be identified by reciting, “reference [1]” or simply “[1]”. A listing of the publications corresponding to each designator can be found at the end of the Detailed Description section.
SUMMARY
The present invention is directed toward a face detection system and process that overcomes the aforementioned limitations in prior face detection and recognition systems by making it possible to detect a person's face in input images containing either frontal or non-frontal views of the person's face, regardless of the scale or illumination conditions associated with the face. Thus, a non-staged image, such as a frame from a video camera monitoring a scene, can be searched to detect a region depicting the face of a person, without regard to whether the person is directly facing the camera. Essentially, as long as the person's face is visible in the image being searched, the present face detection system can be used to detect the location of the face in the image. To date there have not been any face detection systems that could detect a person's face in non-frontal views. In addition, the present invention can be used not only to detect a person's face, but also to provide pose information. This pose information can be quite useful. For example, knowing which way a person is facing can be useful in user interface and interactive applications where a system would respond differently depending on where a person is looking. Having pose information can also be useful in making more accurate 3-D reconstructions from images of the scene. For instance, knowing that one person is facing another can indicate that the first person is talking to the second. This is useful in such applications as virtual meeting reconstructions.
Because the present face detection system and associated process can be used to detect both frontal and non-frontal views of a person's face, it is termed a pose-adaptive face detection system. For convenience in describing the system and process, the term “pose” will refer to the particular pitch, roll and yaw angles that describe the orientation of a person's head (where the 0-degree pitch, roll and yaw position corresponds to a person facing the camera with their face centered about the camera's optical axis).
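By way of illustration only (the invention does not prescribe any particular data structure, and the names below are assumptions), a pose under this convention can be represented in Python as a simple triple of angles:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Pose:
        """Head orientation in degrees; (0, 0, 0) corresponds to a person
        facing the camera with the face centered about the optical axis."""
        pitch: float  # up/down nod
        roll: float   # in-plane tilt
        yaw: float    # left/right turn

    # Example: a head turned 30 degrees to one side, otherwise level.
    turned_pose = Pose(pitch=0.0, roll=0.0, yaw=30.0)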
The pose-adaptive face detection system and process must first be trained before it can detect face regions in an input image. This training phase generally involves first capturing images of the faces of a plurality of people. As will be explained later, the captured face images will be used to train a series of Support Vector Machines (SVMs). Each SVM will be dedicated to a particular face pose, or more precisely a pose range. Accordingly, the captured face images should depict people having a variety of face poses, and only those face images depicting a person with a face pose that falls within the particular pose range of an SVM will be used to train that SVM. It is noted that the more diverse the training face images are, the more accurate the detection capability of the SVMs will be. Thus, it is preferred that the face images depict people who are not too similar in appearance. The training images can be captured in a variety of ways; one preferred method involves positioning a subject in front of a video camera and capturing images (i.e., video frames) as the subject moves his or her head in a prescribed manner.
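As a rough sketch of this training arrangement (not the patented implementation; the yaw-only pose ranges, the use of shared non-face negatives, the RBF kernel and the helper names below are all assumptions made for illustration), one SVM can be fitted per pose range with a standard library such as scikit-learn:

    import numpy as np
    from sklearn.svm import SVC

    # Hypothetical pose ranges in degrees of yaw; the specification groups
    # training images into prescribed pose ranges but does not fix these bins.
    POSE_RANGES = [(-90, -30), (-30, 30), (30, 90)]

    def train_pose_svms(face_patches, face_yaws, nonface_patches):
        """Train one face/non-face SVM per pose range.

        face_patches:    (Nf, 400) array of flattened 20x20 face patches
        face_yaws:       (Nf,) yaw angle, in degrees, of each face patch
        nonface_patches: (Nn, 400) flattened non-face patches shared by all SVMs
        """
        svms = {}
        for lo, hi in POSE_RANGES:
            in_range = (face_yaws >= lo) & (face_yaws < hi)
            X = np.vstack([face_patches[in_range], nonface_patches])
            y = np.concatenate([np.ones(int(in_range.sum())),
                                np.zeros(len(nonface_patches))])
            clf = SVC(kernel="rbf")  # kernel choice is an assumption
            clf.fit(X, y)
            svms[(lo, hi)] = clf
        return svms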
The captured face images are preprocessed to prepare them for input into the appropriate SVM. In general, this involves normalizing, cropping, categorizing and finally abstracting the face images. Normalizing the training images preferably entails normalizing their scale by resizing them. It is noted that this action could be skipped if the images are captured at the desired scale, thus eliminating the need for resizing. The desired scale for the face images is approximately the size of the smallest face region expected to be found in the input images that are to be searched. In a tested embodiment of the present invention, an image size of about 20 by 20 pixels was used with success. The images could additionally be normalized with respect to the eye locations within the image; in other words, each image would be adjusted so that the eye locations fall within a prescribed area. These normalization actions are performed so that the training images generally match in orientation and size. The images are also preferably cropped to eliminate unneeded portions which could contribute noise to the upcoming abstraction process. It is noted that the training images could be cropped first and then normalized, if desired. It is also noted that a histogram equalization, or similar procedure, could be employed to reduce the effects of illumination differences among the images that could otherwise introduce noise into the detection process.
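A minimal sketch of this preprocessing, assuming OpenCV (the 20-by-20 working size comes from the tested embodiment described above, while the crop margin and function name are arbitrary assumptions):

    import cv2
    import numpy as np

    def preprocess_patch(gray_image, crop_margin=2, size=(20, 20)):
        """Prepare a grayscale face image for SVM training or detection:
        crop away border pixels, resize to the working scale, and apply
        histogram equalization to reduce illumination differences."""
        h, w = gray_image.shape
        cropped = gray_image[crop_margin:h - crop_margin,
                             crop_margin:w - crop_margin]
        resized = cv2.resize(cropped, size, interpolation=cv2.INTER_AREA)
        equalized = cv2.equalizeHist(resized.astype(np.uint8))
        return equalized.flatten().astype(np.float32) / 255.0  # feature vector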
The next action in the training image preprocessing procedure involves categorizing the normalized and cropped images according to their pose. One preferred way of accomplishing this is to group the images into a set of prescribed pose ranges. It is noted that the persons in the training images could be depicted with any combination of pitch, roll and yaw angles, as long as at least a portion of their face is visible. In such a case, the normalized and cropped images would be categorized into pose ranges defined by all three of these directional angles.
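One simple way to carry out this categorization (purely illustrative; the 20-degree bin width and the function names are assumptions) is to quantize each image's pitch, roll and yaw into prescribed ranges and group the images by the resulting bin:

    from collections import defaultdict

    def pose_bin(pitch, roll, yaw, bin_width=20.0):
        """Map a (pitch, roll, yaw) triple in degrees to a discrete pose bin."""
        return (int(pitch // bin_width),
                int(roll // bin_width),
                int(yaw // bin_width))

    def categorize_by_pose(images_with_poses, bin_width=20.0):
        """Group preprocessed images by pose bin.

        images_with_poses: iterable of (image, (pitch, roll, yaw)) pairs.
        """
        groups = defaultdict(list)
        for image, (pitch, roll, yaw) in images_with_poses:
            groups[pose_bin(pitch, roll, yaw, bin_width)].append(image)
        return groups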
Inventors: Yong Ma, Hong-Jiang Zhang
Examiners: Samir Ahmed, Virginia Kibler
Agent: Katrina A. Lyon, Lyon & Harr LLP
Assignee: Microsoft Corp.