Dynamic load-balancing between two processing means for...

Pulse or digital communications – Bandwidth reduction or expansion – Television or motion video signal

Reexamination Certificate

Details

C709S241000

active

06748019

ABSTRACT:

TECHNICAL FIELD
The present invention relates to real-time video compression and coding with multiple processors, and more specifically relates to a method for optimum use of the computing power available for the improvement of the performance of such systems.
BACKGROUND TO THE INVENTION
The most successful techniques for video compression and coding, including the international standards for moving images H.261/3 and MPEG-1/2/4, are based on motion estimation (ME), discrete cosine transform (DCT) and quantisation.
FIG. 1 shows the basic procedure by which this is performed. The video input undergoes a series of processes including discrete cosine transform (DCT) 10, quantization (Q) 11, inverse quantization (Q⁻¹) 12, inverse discrete cosine transform (IDCT) 13, frame store (FS) 14, interpolation 15, motion estimation (ME) 16, motion compensation (MC) 17, source coding (SC) 18 and buffering 19. The intensive computation involved in these techniques poses challenges to real-time implementation.
Performance with a single processor is limited by silicon technology. Parallel processing is the way to achieve the necessary speedup for video coding with existing algorithms and computing power. However, load balancing among processors is difficult because of the data dependent nature of video processing. Imbalance in load results in a waste of computing power.
The load-balancing problem has been tackled with block-, MB (macroblock)- or GOB (group of blocks)-based approaches due to the inherent data dependency in an image neighbourhood. This is described in, for example, T. Fujii and N. Ohata, “A load balancing technique for video signal processing on a multicomputer type DSP”, Proc. IEEE Int'l Conf. Acoustics, Speech, Image Processing, USA, 1988, pp. 1981-1984, and K. Asano et al, “An ASIC approach to a video coding processor”, Proc. Int'l Conf. Image Processing and its Applications, London, 1992, pp. 127-130. Such an approach has various disadvantages: (1) the computational complexity for adjoining MBs/GOBs may not be similar, especially when they are coded differently (inter- or intra-) or lie on the border of motion; (2) extra computation and storage are required in GOB-based approaches because source coding (e.g., VLC, SAC) is sequential; (3) larger programs and working memory (very often on-chip memory space) are usually needed in real-time implementation because dedicated code is needed for each processor; (4) some approaches require timing measurement for every block's processing, which is not feasible in real-time implementation; (5) there is no opportunity for dynamic load-balancing (the actual amount of computation depends on the input pictures).
In the case of two processors, in order to avoid or at least alleviate the problems (1) to (3) listed above, the encoding process for an MB can be divided. This is described in, for example, W. Lin, K. H. Goh, B. J. Tye, G. A. Powell, T. Ohya and S. Adachi, “Real time H.263 video codec using parallel DSP”, Proc. IEEE Int'l Conference Image Processing, USA, 1997, pp. 586-589, also in B. J. Tye, K. H. Goh, W. Lin, G. A. Powell, T. Ohya and S. Adachi, “DSP implementation of very low bit rate videoconferencing system”, Proc. Int'l Conf. Information, Communications and Signal Processing, Singapore, 1997, pp. 1275-1278, and in K. H. Goh, W. Lin, B. J. Tye, G. A. Powell, T. Ohya and S. Adachi, “Real time full-duplex H.263 video codec system”, Proc. IEEE First Workshop on Multimedia Signal Processing, USA, 1997, pp. 445-450. The division of the encoding process is as follows:
Processor 1 (P1): interpolation, ME, and all auxiliary processing before ME, referred to as the ME sub-process hereafter for convenience;
Processor 2 (P2): DCT, IDCT, Quant, Dequant, source coding (SC), and all auxiliary processing after SC, referred to as the DCT-Q-SC sub-process hereafter for convenience.
The ME sub-process in P1 is carried out independently, regardless of the progress in P2, until the end of a frame. The DCT-Q-SC sub-process in P2 is also carried out independently provided that there are ME-completed MBs waiting for DCT processing.
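The two-processor division described above behaves like a producer and a consumer. The following is a minimal sketch of that arrangement, not the patent's implementation: the two processors are modelled as Python threads, the hand-over as a `queue.Queue`, and the names `me_worker`, `dct_worker`, `encode_frame` and the `None` end-of-frame marker are all assumptions made for illustration. The actual ME and DCT-Q-SC work is replaced by placeholders.

```python
import queue
import threading

NUM_MBS = 99  # macroblocks in one QCIF frame


def me_worker(ready):
    """P1: run the ME sub-process on every MB without waiting for P2."""
    for mb in range(NUM_MBS):
        # Placeholder for interpolation and motion estimation on macroblock mb.
        ready.put(mb)
    ready.put(None)  # end-of-frame marker (an assumption of this sketch)


def dct_worker(ready, coded):
    """P2: run DCT-Q-SC on each MB once its ME has completed."""
    while True:
        mb = ready.get()  # blocks (P2 idles) while no ME-completed MB is waiting
        if mb is None:
            break
        coded.append(mb)  # placeholder for DCT, quantisation and source coding


def encode_frame():
    """Run both sub-processes for one frame and return the MBs in coded order."""
    ready = queue.Queue()
    coded = []
    p1 = threading.Thread(target=me_worker, args=(ready,))
    p2 = threading.Thread(target=dct_worker, args=(ready, coded))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    return coded
```

In this sketch the time P2 spends blocked in `ready.get()` corresponds to idle time in P2, and the queue length corresponds to the number of ME-completed MBs waiting for DCT processing.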
FIG. 2 depicts a status table for a QCIF frame with 99 macroblocks showing the progress of P1 and P2. A frame 20 consists of a series of macroblocks (MBs) 21. In this example, the first twelve MBs 22 have had DCT-Q-SC completed. ME but not DCT has been completed on a further seven MBs 23, so the total number of MBs which have had ME completed is nineteen. The rest of the MBs 24 have not yet had ME completed.
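The counts in this example can be checked with a small sketch. It assumes the standard QCIF luminance size of 176 x 144 pixels and 16 x 16 macroblocks; the `FrameStatus` class and its method names are hypothetical, introduced only to make the arithmetic explicit.

```python
from dataclasses import dataclass


@dataclass
class FrameStatus:
    """Progress counters for one frame (hypothetical helper, not from the patent)."""
    total_mbs: int  # macroblocks in the frame
    me_done: int    # MBs for which P1 has completed ME
    dct_done: int   # MBs for which P2 has completed DCT-Q-SC

    def waiting_for_dct(self):
        """MBs with ME completed but DCT-Q-SC not yet done."""
        return self.me_done - self.dct_done

    def waiting_for_me(self):
        """MBs on which ME has not yet been completed."""
        return self.total_mbs - self.me_done


# QCIF: 176 x 144 luminance pixels, 16 x 16 macroblocks -> 11 x 9 = 99 MBs.
QCIF_MBS = (176 // 16) * (144 // 16)

# The state described for FIG. 2: nineteen MBs through ME, twelve through DCT-Q-SC.
status = FrameStatus(total_mbs=QCIF_MBS, me_done=19, dct_done=12)
```

With these values, `status.waiting_for_dct()` gives the seven MBs 23 and `status.waiting_for_me()` gives the remaining eighty MBs 24.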
In real-time applications, motion search can be performed in an efficient way, such as the 3-stage search or a pruning approach (described in Lin et al, Tye et al and Goh et al mentioned above) to meet the frame-rate requirement. The amount of computation for the DCT-Q-SC sub-process can be controlled in P2 as shown in FIG. 3. When the minimum SAD (sum of absolute difference) for an MB is less than T_dct (a threshold to be determined), the DCT-Q will be skipped. The reasoning for this is that when the sum of grey-level difference for an MB against its match in the previous frame is less than a certain value, it is acceptable to code just motion vectors and no DCT coefficients need be processed.
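The skip rule can be sketched as follows. This is an illustrative sketch only: blocks are represented as nested lists of grey levels, and the function names `sad` and `needs_dct` are assumptions. The comparison itself follows the text: DCT-Q is skipped when the minimum SAD is strictly less than T_dct.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks of grey levels."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def needs_dct(min_sad, t_dct):
    """Return False when DCT-Q can be skipped, i.e. when min_sad < t_dct."""
    return min_sad >= t_dct


# Tiny 2 x 2 example blocks (real macroblocks are 16 x 16).
current = [[10, 12], [14, 16]]
matched = [[11, 12], [13, 18]]
min_sad = sad(current, matched)  # 1 + 0 + 1 + 2
```

A larger T_dct causes more MBs to be coded with motion vectors only, which reduces the work in P2; a smaller T_dct has the opposite effect.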
SUMMARY OF THE INVENTION
The object of the invention is to provide a simple yet effective method to achieve load balance dynamically and adaptively, so that either processing time is minimised while reasonable image quality and bit count are maintained, or image quality is optimised while processing time and bit count are maintained.
In general, more than one processor can be used to implement each of the sub-processes for an MB: Processing means 1 (P1): ME sub-process; Processing means 2 (P2): DCT-Q-SC sub-process.
In real-time applications, the processing time and coded bit length are major considerations if image quality is acceptable. The invention provides that when the strategy of ME in P1 is set, the DCT-Q-SC sub-process can be adjusted for optimum use of P2's computing power.
According to the invention, the amount of computation in P2 should be reduced when P1 progresses faster than P2; otherwise the amount of computation in P2 should be increased. The amount of computation in P1 is determined by the ME strategy adopted, and is therefore kept largely constant for similar images, except for some small variation due to the effect in DCT/IDCT/Quantisation/Dequantisation.
The amount of computation in P2 may be controlled by T_dct. T_dct is therefore decided adaptively for each MB in order to optimise the performance of the system, according to the dynamic load pattern between P1 and P2. Determination of load pattern should not introduce significant additional computation to the video encoding process. Preferably, therefore, measures for the idle time in P2 and the progress of ME in P1 may be used.
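One way such an adaptive rule could look is sketched below. The text here does not give an update formula, so everything in this sketch is an assumption: the function name `adapt_t_dct`, the fixed `step`, the clamping limits `t_min` and `t_max`, and the `max_backlog` tolerance are hypothetical values chosen only for illustration. The direction of adjustment follows the description: T_dct is raised (less work for P2) when P1 runs ahead, and lowered (more work for P2) when P2 has spare capacity.

```python
def adapt_t_dct(t_dct, me_done, dct_done, p2_idle,
                step=64, t_min=0, t_max=4096, max_backlog=4):
    """Return an updated DCT-skip threshold for the next macroblock.

    me_done  -- MBs for which P1 has finished ME (progress of ME in P1)
    dct_done -- MBs for which P2 has finished DCT-Q-SC
    p2_idle  -- True if P2 had to wait for an ME-completed MB
    All numeric defaults are hypothetical, not values from the patent.
    """
    backlog = me_done - dct_done  # ME-completed MBs still waiting for P2
    if backlog > max_backlog:
        # P1 progresses faster than P2: reduce P2's work by skipping more DCT-Q.
        t_dct += step
    elif p2_idle or backlog == 0:
        # P2 has spare capacity: increase its work by skipping fewer DCT-Q.
        t_dct -= step
    # Keep the threshold within the assumed limits.
    return max(t_min, min(t_max, t_dct))
```

The rule uses only two counters and a flag, which keeps the overhead of determining the load pattern small, in line with the requirement that it should not add significant computation to the encoder.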
The invented method balances the workload for video encoding with multiple processors according to the dynamic load pattern, rather than dividing the task according to predetermined statistical information. The control is therefore more effective, aiming at optimum use of the computing resources for best performance. An additional advantage is that the extra computation this load-balancing method adds to the video encoder is negligible.

