Object recognition using binary image quantization and hough...

Image analysis – Applications – Target tracking or detecting

Reexamination Certificate

Details

C382S199000, C382S205000, C382S281000, C382S291000

active

06807286

ABSTRACT:

BACKGROUND
1. Technical Field
The invention is related to object recognition systems, and more particularly to a system and process for recognizing objects in an image using binary image quantization and Hough kernels.
2. Background Art
Recent efforts in the field of object recognition in images have been focused on developing processes especially suited for finding everyday objects in a so-called intelligent environment monitored by color video cameras. An intelligent environment in simple terms is a space, such as a room in a home or office, in which objects and people are monitored, and actions are taken automatically based on what is occurring in the space. Some examples of the actions that may be taken, which would require the ability to recognize objects, include:
Customizing a device's behavior based on location. A keyboard near a computer monitor could direct its input to the application(s) on that monitor. A keyboard in the hands of a particular user could direct its input to that user's application(s), and it could invoke that user's preferences (e.g., repeat rate on keys).
Finding lost objects in a room, such as a television remote control.
Inferring actions and intents by identifying objects that are being used by a person. A user picking up a book probably wants to read, and the lights and music could be adjusted appropriately.
Unfortunately, existing object recognition algorithms that could be employed in an intelligent environment are not written with a normal consumer in mind. This has led to programs that would be impractical for a mass market audience. These impracticalities include slow execution, elaborate training rituals, and the setting of numerous adjustable parameters.
It is believed that for an object recognition program to be generally acceptable to a typical person who would like to benefit from an intelligent environment, it would have to exhibit the following attributes. Besides the usual requirements for being robust to background clutter and partial occlusion, a desirable object recognition program should also run at moderate speeds on relatively inexpensive hardware. The program should also be simple to train and the number of parameters that a user would be expected to set should be kept to a minimum.
The present invention provides an object recognition system and process that exhibits the foregoing desired attributes.
It is noted that in the remainder of this specification, the description refers to various individual publications identified by a numeric designator contained within a pair of brackets. For example, such a reference may be identified by reciting, “reference [1]” or simply “[1]”. Multiple references will be identified by a pair of brackets containing more than one designator, for example, [1, 2]. A listing of the publications corresponding to each designator can be found at the end of the Detailed Description section.
SUMMARY
The present invention is embodied in a new object recognition system and process that is capable of finding objects in an environment monitored by color video cameras. This new system and process can be trained with only a few images of the object captured using a standard color video camera. In tested embodiments, the present object recognition program was found to run at 0.7 Hz on a typical 500 MHz PC. It also requires a user to set only two parameters. Thus, the desired attributes of speed, simplicity, and use of inexpensive equipment are realized.
Generally, the present object recognition process represents an object's features as small, binary quantized edge templates, and it represents the object's geometry with “Hough kernels”. The Hough kernels implement a variant of the generalized Hough transform using simple, 2D image correlation. The process also uses color information to eliminate parts of the image from consideration.
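The idea of casting Hough-style votes with a 2D kernel can be sketched as follows. This is only an illustration of the general technique, not the patented procedure: this summary does not say how a Hough kernel is built, so the kernels, the masks, and the function name `stamp_votes` are hypothetical. The sketch assumes each feature type has a kernel whose non-zero entries mark where the object's reference point may lie relative to a pixel showing that feature.

```python
def stamp_votes(feature_mask, kernel, accumulator):
    """Add the kernel, centred on every set pixel of feature_mask, to accumulator.

    Stamping the kernel at each feature pixel is equivalent to a 2D
    convolution of the mask with the kernel (that is, a 2D correlation
    with the point-reflected kernel). Votes falling outside the image
    are discarded.
    """
    rows, cols = len(feature_mask), len(feature_mask[0])
    half_r, half_c = len(kernel) // 2, len(kernel[0]) // 2
    for r in range(rows):
        for c in range(cols):
            if not feature_mask[r][c]:
                continue
            for kr, kernel_row in enumerate(kernel):
                for kc, votes in enumerate(kernel_row):
                    rr, cc = r + kr - half_r, c + kc - half_c
                    if 0 <= rr < rows and 0 <= cc < cols:
                        accumulator[rr][cc] += votes
    return accumulator


# Hypothetical example: two feature types that agree on one reference point.
size = 5
accumulator = [[0] * size for _ in range(size)]

mask_a = [[0] * size for _ in range(size)]
mask_a[2][1] = 1                                  # feature A seen at (2, 1)
kernel_a = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]      # votes one pixel to the right

mask_b = [[0] * size for _ in range(size)]
mask_b[2][3] = 1                                  # feature B seen at (2, 3)
kernel_b = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]      # votes one pixel to the left

stamp_votes(mask_a, kernel_a, accumulator)
stamp_votes(mask_b, kernel_b, accumulator)
peak = max((value, r, c) for r, row in enumerate(accumulator)
           for c, value in enumerate(row))
```

In this toy case both features vote for the same location, so the accumulator peak marks the candidate object position.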
The present object recognition system and process must first be trained to recognize an object it is desired to find in an image that may contain the object. Specifically, the generation of training images begins with capturing a color image that depicts a face or surface of interest on the object it is desired to recognize. A separate image is made for each such surface of the object. Although not absolutely required, each of these images is preferably captured with the normal vector of the surface of the object approximately pointed at the camera—i.e., as coincident with the camera's optical axis as is feasible. The surface of the object in each of the images is next identified. Preferably, the surface of the object is identified manually by a user of the present system. For example, this can be accomplished by displaying the image on a computer screen and allowing the user to outline the surface of the object in the image, using for example a computer mouse. The pixels contained within the outlined portion of the image would then be extracted. The extracted portion of each image becomes a base training image. Each base training image is used to synthetically generate other related training images showing the object in other orientations and sizes. Thus, the user is only required to capture and identify the object in a small number of images.
For each of the base training images depicting a surface of interest of the object, the following procedure can be used to generate synthetic training images. These synthetic training images will be used to train the object recognition system. An image of the object is synthetically generated in each of a prescribed number of orientations. This can be accomplished by synthetically pointing the object's surface normal at one of a prescribed number (e.g., 31) of nodes of a tessellated hemisphere defined as overlying the object's surface, and then simulating a paraperspective projection representation of the surface. Each node of the hemisphere is preferably spaced at the same prescribed interval from each adjacent node (e.g., 20 degrees), and it is preferred that no nodes be simulated within a prescribed distance from the equator of the hemisphere (e.g., 20 degrees). Additional synthetic training images are generated by synthetically rotating each of the previously simulated training images about its surface normal vector and simulating an image of the object at each of a prescribed number of intervals (e.g., every 20 degrees).
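To give a sense of how many synthetic views the example figures above imply, the following sketch enumerates one view per combination of hemisphere node and in-plane rotation. It assumes the rotations cover a full 360 degrees and include the unrotated view; the function name `enumerate_views` is hypothetical, and the node positions themselves are not computed, only indexed.

```python
from itertools import product


def enumerate_views(num_nodes=31, roll_step_deg=20):
    """Return (node_index, roll_deg) pairs, one per synthetic view."""
    if 360 % roll_step_deg != 0:
        raise ValueError("roll_step_deg must divide 360 evenly")
    rolls = range(0, 360, roll_step_deg)   # 0, 20, ..., 340
    return list(product(range(num_nodes), rolls))


views = enumerate_views()
print(len(views))  # 31 nodes x 18 rotations = 558
```

Under these assumptions, a single base training image yields 558 synthetic orientations before any scaling is applied.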
Additional training images can be generated by incrementally scaling the synthesized training images of the object. This produces synthesized training images depicting the object at different sizes for each orientation. In this way, the scale of the object in the image being searched becomes irrelevant once the system is trained using the scaled training images.
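A minimal sketch of producing scaled copies of a training image is shown below. The text does not name a resampling method or the scale increments, so nearest-neighbour sampling and the factors used here are assumptions chosen for brevity; `rescale_nearest` is a hypothetical helper.

```python
def rescale_nearest(image, factor):
    """Resize a 2D list-of-lists image by `factor` using nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    new_rows = max(1, round(rows * factor))
    new_cols = max(1, round(cols * factor))
    return [
        [
            image[min(rows - 1, int(r / factor))][min(cols - 1, int(c / factor))]
            for c in range(new_cols)
        ]
        for r in range(new_rows)
    ]


base = [[1, 2],
        [3, 4]]
scale_factors = [1.0, 2.0]          # hypothetical increments
scaled = [rescale_nearest(base, f) for f in scale_factors]
```

A production system would more likely use an interpolating resize from an imaging library; the point here is only that each orientation is replicated at several sizes.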
Once the synthetic training images are generated, each is abstracted for further processing. The preferred method of abstraction involves characterizing the images as a collection of edge points (or more particularly, pixels representing an edge in the image). The resulting characterization can be thought of as an edge pixel image. The edge pixels in each synthetic training image are preferably detected using a standard Canny edge detection technique. This particular technique uses gray level pixel intensities to find the edges in an image. Accordingly, as the base images and the synthetically generated training images are color images, the overall pixel intensity component (i.e., R+G+B) for each pixel is computed and used in the Canny edge detection procedure.
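The intensity computation described above can be sketched as follows. The R+G+B sum comes from the text; the edge step, however, is a plain gradient-magnitude threshold used as a simplified stand-in for the Canny detector, which additionally performs smoothing, non-maximum suppression, and hysteresis thresholding. The function names and the threshold value are hypothetical.

```python
def intensity_image(rgb):
    """Collapse an image of (R, G, B) tuples to overall intensity R+G+B."""
    return [[r + g + b for (r, g, b) in row] for row in rgb]


def simple_edges(intensity, threshold):
    """Mark interior pixels whose central-difference gradient magnitude
    reaches `threshold`. A stand-in for Canny, not an implementation of it."""
    rows, cols = len(intensity), len(intensity[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = intensity[r][c + 1] - intensity[r][c - 1]
            gy = intensity[r + 1][c] - intensity[r - 1][c]
            if gx * gx + gy * gy >= threshold * threshold:
                edges[r][c] = True
    return edges


# Hypothetical test image: left half black, right half white.
black, white = (0, 0, 0), (255, 255, 255)
rgb = [[black] * 3 + [white] * 3 for _ in range(5)]
gray = intensity_image(rgb)
edges = simple_edges(gray, threshold=100)
```

Here the marked pixels fall on the two columns straddling the black-to-white boundary, which is the kind of edge pixel image the abstraction step works with.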
A binary raw edge feature is generated for each edge pixel identified in the synthetic training images by the edge detection procedure. A binary raw edge feature is defined by a sub-window of a prescribed size (e.g., 7×7 pixels) that is centered on an edge pixel of the synthetic training images. Each edge pixel contained in a feature is designated by one binary state, for example a “0”, and the non-edge pixels are designated by the other binary state, for example a “1”.
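Extracting these sub-windows from an edge pixel image might look like the sketch below. It follows the example convention in the text (edge pixels as 0, non-edge pixels as 1, 7×7 windows). How edge pixels near the image border are handled is not stated, so skipping them is an assumption, and `raw_edge_features` is a hypothetical name.

```python
def raw_edge_features(edges, size=7):
    """Return ((row, col), window) for every edge pixel far enough from the border.

    `edges` is a 2D list of booleans (True = edge pixel). Each window is a
    size x size tuple of tuples with 0 for edge pixels and 1 for non-edge pixels.
    """
    half = size // 2
    rows, cols = len(edges), len(edges[0])
    features = []
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            if not edges[r][c]:
                continue
            window = tuple(
                tuple(0 if edges[rr][cc] else 1
                      for cc in range(c - half, c + half + 1))
                for rr in range(r - half, r + half + 1)
            )
            features.append(((r, c), window))
    return features


# Hypothetical 9x9 edge image containing one vertical edge in column 4.
edge_image = [[c == 4 for c in range(9)] for _ in range(9)]
features = raw_edge_features(edge_image)
```

Each window is stored as a hashable tuple so that identical features can later be counted or grouped.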
Typically, a very large nu

