Method and apparatus for segmenting images prior to coding

Image analysis – Image segmentation

Reexamination Certificate


active

06301385

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates generally to video coding and more particularly to video coding in which the image is decomposed into objects prior to coding. Each of the individual objects is then coded separately.
For many image transmission and storage applications, significant data compression may be achieved if the trajectories of moving objects in the images are successfully estimated. Traditionally, block-oriented motion estimation has been widely investigated due to its simplicity and effectiveness. However, block and object boundaries in a scene generally do not coincide because the blocks are not adapted to the image contents. This can lead to visible distortions in low-bit-rate coders, known as blurring and mosquito effects.
Object-oriented coding techniques were developed to overcome the disadvantages of block-oriented coding. In one type of object-oriented coding, the image sequence is segmented into moving objects. Large regions with homogeneous motion can be extracted, resulting in higher compression and fewer visible distortions at motion boundaries. As the foreground objects carry more new information relative to the slowly changing background, the background can be transmitted less frequently than the foreground. Consequently, the foreground objects must be correctly identified to achieve the desired compression levels without adding undue distortion.
As a result, segmentation is an important intermediate step in object-oriented image processing. For this reason, many approaches to segmentation have been attempted, such as motion-based, focus-based, intensity-based, and disparity-based segmentation. The problem with each of these approaches is its feature specificity, which limits the scenes to which it can be successfully applied. For example, the scene must contain motion for motion-based segmentation to be applicable. The scene must contain significant contrast to support intensity-based segmentation. Similar features are required for the other approaches. In addition, the motion-based approach fails for scenes containing both foreground and background motion, such as moving foreground shadows cast onto the background. The focus-based approach fails when the foreground is blurred. The intensity-based approach fails for textured objects because a single object is erroneously segmented into multiple objects. And the measurement of disparity in the disparity-based approach is complex and error-prone.
One technique is to use a priori knowledge about the images to select the coding method, which overcomes this problem. However, this makes image coding inconvenient in that processing must include a determination of the type of image and then a selection of the most appropriate coding type for that image. This significantly increases preprocessing costs of the images prior to coding. Alternatively, a lower quality coding must be employed. Unfortunately, neither of these alternatives is acceptable as bandwidth remains limited for image transmission and consumers expect higher quality imagery with increased technology.
The issue then becomes how to accentuate the strengths of these methods and attenuate their failings in foreground and background segmentation. Several possibilities have been examined. One approach combines motion and brightness information into a single segmentation procedure which determines the boundaries of moving objects. Again, this approach will not work well because the moving background will be segmented with the moving foreground and therefore classified and coded as foreground.
Another approach uses defocusing and motion detection to segment a foreground portion of the image from a background portion of the image. This process is shown in FIGS. 7-9. FIG. 7 shows the process, FIG. 8 shows the segmentation results over several frames, and FIG. 9 shows the results of the defocus measurement. However, this approach requires a filling step to the process. Filling is a non-trivial problem, especially where the foreground image segment output by this process results in objects without closed boundaries. In this case, significant complexity is added to the overall process. Given the complexity inherent in video coding, the elimination of any complex step is significant in and of itself.
The present invention is therefore directed to the problem of developing a method and apparatus for segmenting foreground from background in an image sequence prior to coding the image, which method and apparatus requires no a priori knowledge regarding the image to be segmented and yet is relatively simple to implement.
SUMMARY OF THE INVENTION
The present invention solves this problem by integrating multiple segmentation techniques by using a neural network to apply the appropriate weights to the segmentation mapping determined by each of the separate techniques. In this case, the neural network has been trained using images that were segmented by hand. Once trained, the neural network assigns the appropriate weights to the segmentation maps determined by the various techniques.
One embodiment of the method according to the present invention calculates the motion, focus and intensity segmentation maps of the image and passes each of these maps to a neural network. The neural network calculates the final segmentation map, which is then used to outline the segmented foreground on the original image. In this embodiment, two consecutive images are acquired for use in detecting the various segmentation maps input to the neural network.
According to the present invention, the step of detecting motion includes detecting a difference between pixels in successive frames and determining that a pixel is in motion if the difference for that pixel exceeds a predetermined threshold. The step of detecting focus includes calculating the magnitude of the Sobel edge detection over an nxn pixel square and dividing the magnitude of the Sobel edge detection by the edge width. The step of detecting intensity comprises determining a gray level of the pixel.
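The three detection steps described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the threshold value, the use of a sum to aggregate the Sobel magnitude over the n x n square, and the caller-supplied `edge_width` are all hypothetical, since the text does not specify them here.

```python
import math

# Hypothetical value; the text only says the threshold is "predetermined".
MOTION_THRESHOLD = 20


def motion_map(prev, curr, threshold=MOTION_THRESHOLD):
    """Mark a pixel as moving (1) when its frame-to-frame difference exceeds the threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def sobel_magnitude(img):
    """Sobel gradient magnitude at interior pixels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out


def focus_measure(mag, y, x, n, edge_width):
    """Sobel magnitude summed over the n x n square centred on (y, x), divided by edge width.

    Assumptions: summation as the aggregate, and edge_width supplied by the caller.
    """
    half = n // 2
    total = 0.0
    for yy in range(max(0, y - half), min(len(mag), y + half + 1)):
        for xx in range(max(0, x - half), min(len(mag[0]), x + half + 1)):
            total += mag[yy][xx]
    return total / edge_width


def intensity_map(img):
    """The intensity measurement is simply the gray level of each pixel."""
    return [list(row) for row in img]
```

A sharp edge yields a large Sobel magnitude concentrated over a narrow edge width, so the ratio is high for in-focus regions and low for blurred ones, which is why the division by edge width appears in the focus measurement.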
Another embodiment of the method of the present invention for processing an image sequence to segment the foreground from the background, includes acquiring successive images in the sequence, simultaneously measuring motion, focus and intensity of pixels within successive images, inputting the motion, focus and intensity measurements to a neural network, calculating foreground and background segments using the motion, focus, and intensity measurements with the neural network, and drawing a segment map based on the calculated foreground and background segments.
In an advantageous implementation of the above methods according to the present invention, it is possible to speed the training of the neural network using an adaptive learning rate. One possible embodiment of the adaptive learning rate is the following equation:
\[ \Delta w = lr \cdot d \, p^{T} \]
\[ \Delta b = lr \cdot d \]
where w is the layer's weight matrix, b is the layer's bias vector, lr is the adaptive learning rate, d is the layer's delta vector, p is the layer's input vector, and T indicates that vector p is transposed before being multiplied.
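As a rough illustration of the update rule above, the sketch below applies the weight and bias changes to one layer using plain Python lists. The function names are hypothetical. The text does not state how lr itself is adapted, so `adapt_learning_rate` and its constants are purely an assumed heuristic (grow the rate while the error falls, shrink it when the error rises noticeably), not something taken from the source.

```python
def update_layer(w, b, d, p, lr):
    """Return new (w, b) after adding lr * d * p^T to w and lr * d to b.

    w is a list of rows (one row per neuron), b and d have one entry per neuron,
    and p has one entry per input.
    """
    new_w = [[w[i][j] + lr * d[i] * p[j] for j in range(len(p))]
             for i in range(len(d))]
    new_b = [b[i] + lr * d[i] for i in range(len(d))]
    return new_w, new_b


def adapt_learning_rate(lr, new_error, old_error,
                        inc=1.05, dec=0.7, max_ratio=1.04):
    """Assumed heuristic, not from the source: adjust lr based on the error trend."""
    if new_error > old_error * max_ratio:
        return lr * dec   # error rose noticeably: take smaller steps
    if new_error < old_error:
        return lr * inc   # error fell: take slightly larger steps
    return lr


# One neuron with two inputs: the weight change is the outer product d * p^T scaled by lr.
w, b = update_layer([[0.0, 0.0]], [0.0], d=[2.0], p=[1.0, 3.0], lr=0.5)
```

Because d is a column vector and p^T a row vector, their product has one row per neuron and one column per input, matching the shape of w.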
An apparatus for segmenting the foreground and background from a sequence of images according to the present invention includes a motion detector, a focus detector, an intensity detector and a neural network. The motion detector detects motion of pixels within the image sequence and outputs a motion segmentation map. The focus detector detects pixels that are in focus and outputs a focus segmentation map. The intensity detector detects those pixels that have high intensity and those with low intensity and outputs an intensity segmentation map. The neural network is coupled to the motion detector, the focus detector and the intensity detector, and weighs the outputs from these detectors and outputs a final segmentation map.
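To make the role of the neural network concrete, the following sketch weighs the three detector outputs pixel by pixel with a small feed-forward network. It is a minimal illustration under assumptions: the sigmoid activation, the single output neuron, the 0.5 cutoff, the function names, and the hand-picked weights in the example are all hypothetical. A real network would use weights learned from hand-segmented training images.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def segment_pixel(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass for one pixel: [motion, focus, intensity] -> foreground score in (0, 1)."""
    hidden = [sigmoid(sum(wi * f for wi, f in zip(row, features)) + bi)
              for row, bi in zip(w_hidden, b_hidden)]
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)


def final_segmentation_map(motion, focus, intensity, params, cutoff=0.5):
    """Combine the three detector maps into one binary map (1 = foreground)."""
    w_h, b_h, w_o, b_o = params
    return [[1 if segment_pixel([m, f, i], w_h, b_h, w_o, b_o) > cutoff else 0
             for m, f, i in zip(mrow, frow, irow)]
            for mrow, frow, irow in zip(motion, focus, intensity)]


# Hand-picked (untrained, hypothetical) parameters: foreground only when a pixel
# is both moving and in focus; intensity is ignored in this toy setting.
params = ([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]], [-5.0, -5.0], [10.0, 10.0], -10.0)
motion = [[1, 0]]
focus = [[1, 0]]
intensity = [[0, 0]]
result = final_segmentation_map(motion, focus, intensity, params)
```

The point of the design is that no single detector decides the outcome: the learned weights determine how much each segmentation map contributes for a given combination of cues.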
One advantageous implementation of the neural network used in the present invention includes a two layer neural network. In this case, the neural network has a hidden layer with two neurons an
