Parallel processing processor and parallel processing method


: 2003-04-15
: 2004-06-22
: Tung, Kee M. (Department: 2676)
: Computer graphics processing and selective visual display system
: Computer graphic processing system
: Plural graphics processors

: C345S559000, C345S522000, C345S561000, C345S592000
: Reexamination Certificate
: active
: 06753866
: ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to a parallel processing processor and a parallel processing method. More particularly, the invention relates to a parallel processing processor and a parallel processing method for use therewith, the processor comprising a facility for processing, in a dedicated structure, α data representing the transparency of images, whereby image processing is performed at high speed.
Standards for digital moving picture coding specify ways to divide a picture into blocks of a specific size, to predict the motion of each block, and to code the predictive errors per block. Such block-by-block processing is carried out effectively by software when the software is run on a processor capable of performing operations on numerous pixels in response to a single instruction. Among the many classifications of processors, those published by Flynn in 1966 are well known and widely accepted (“Very high-speed computing systems,” Proc. IEEE, 12, 1091-9; Flynn, M. J., 1966). Processors, as defined by Flynn in the cited publication, fall into four categories: SISD (single instruction stream-single data stream) type, MIMD (multiple instruction stream-multiple data stream) type, SIMD (single instruction stream-multiple data stream) type, and MISD (multiple instruction stream-single data stream) type. The processor suitable for the above-mentioned block-by-block processing belongs to the SIMD type. According to Flynn, the SIMD type processor is characterized in that “multiple operands are executed by the same instruction stream” (ibid.).
Discussed below is how picture coding algorithms are processed illustratively by software using the SIMD type processor.
A typical algorithm to which the SIMD type processor is applied advantageously is motion compensation—a technique for removing correlations between frames that are temporally adjacent to one another. MPEG1 and MPEG2, both international standards for moving picture coding, embrace a technique called block matching for motion compensation.
Block matching, what it is and how it works, is outlined below with reference to FIG. 2.
FIG. 2 is a conceptual view explaining how block matching is typically performed. A current frame 21 is a frame about to be coded. A reference frame 22 is a frame which is temporally close to an image of the current frame and which represents a decoded image of a previously coded frame. Block matching uses either a luminance signal alone or both a luminance signal and a chrominance signal. Where software is used for block matching, the luminance signal alone is generally adopted because it involves a relatively low volume of operations. The description that follows thus assumes the use of the luminance signal alone.
The current frame 21 is first divided into blocks of the same size as indicated by broken lines (each block generally measures 16×16 pixels or 16×8 pixels). A temporal motion between the current frame 21 and the reference frame 22 is then detected for each block. A block 23 is taken up here as an example for describing how a temporal motion is specifically detected. A block 24 is shown spatially in the same position in the reference frame as the block 23 in the current frame. A region represented by the block 24 is moved, its size unchanged, to the position of a block 25 at an integer or half-pixel resolution. Every time such a motion takes place, the absolute values of the differences between the moved block and the block 23 are summed over all their pixels. The process is carried out on all motion patterns that may be defined in a predetermined search range (e.g., from pixels −15 to +15 horizontally and pixels −15 to +15 vertically relative to the block 24). The motion from the block 24 to the block position yielding the smallest sum of absolute differences is detected as a motion vector. For example, if the block 25 turns out to be the block yielding the smallest sum of absolute differences, then a vector 26 is detected as a motion vector.
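The search described above may be sketched in C as follows. This is a minimal illustration only, not the implementation of the invention: the `Frame` and `MotionVector` types and the functions `block_sad` and `full_search` are hypothetical names introduced here, the frames are assumed to hold 8-bit luminance samples in row-major order, only integer-pixel positions are tried, and candidate positions falling outside the reference frame are simply skipped.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical frame type: 8-bit luminance samples, row-major order. */
typedef struct {
    const unsigned char *luma;
    int width;
    int height;
} Frame;

/* Hypothetical result type: displacement and its sum of absolute differences. */
typedef struct {
    int x;
    int y;
    long sad;
} MotionVector;

/* Sum of absolute differences between the bsize x bsize block at (cx, cy)
   in the current frame and the block at (rx, ry) in the reference frame. */
long block_sad(const Frame *cur, int cx, int cy,
               const Frame *ref, int rx, int ry, int bsize)
{
    long sad = 0;
    for (int i = 0; i < bsize; i++)
        for (int j = 0; j < bsize; j++) {
            int c = cur->luma[(cy + i) * cur->width + (cx + j)];
            int r = ref->luma[(ry + i) * ref->width + (rx + j)];
            sad += abs(c - r);
        }
    return sad;
}

/* Try every integer displacement in [-range, +range] in both directions and
   keep the one with the smallest sum. Assumption: candidates that would read
   outside the reference frame are skipped rather than padded. */
MotionVector full_search(const Frame *cur, const Frame *ref,
                         int x, int y, int bsize, int range)
{
    MotionVector best = {0, 0, LONG_MAX};
    for (int vy = -range; vy <= range; vy++)
        for (int vx = -range; vx <= range; vx++) {
            int rx = x + vx;
            int ry = y + vy;
            if (rx < 0 || ry < 0 ||
                rx + bsize > ref->width || ry + bsize > ref->height)
                continue;
            long sad = block_sad(cur, x, y, ref, rx, ry, bsize);
            if (sad < best.sad) {
                best.x = vx;
                best.y = vy;
                best.sad = sad;
            }
        }
    return best;
}
```

Calling `full_search(&cur, &ref, x, y, 16, 15)` would correspond to a 16×16 block searched over the −15 to +15 range given as an example above; ties are resolved in favor of the first candidate visited, which is a choice of this sketch.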
While indispensable for coding, block matching is a technique that requires a huge number of pixel-by-pixel operations (subtractive operations, absolute value operations, additive operations). Illustratively, if the picture size is 176×144 pixels and the block size is 16×16 pixels, the number of divided blocks is 99. In such a case, there are 289 search block patterns for each block provided the search range for block matching is set to ±8 pixels at an integer pixel resolution. It follows that each of the above-mentioned three types of operation needs to be carried out 289×99×256 times (256 being the number of pixels in a block). If the picture size is that of standard-definition television (SDTV), or if the motion search range needs to be enlarged, for example to accommodate sports-related images, or if the pixel resolution needs to be maintained at a high level during the search, the volume of necessary operations will increase tens- to hundreds-fold. For these reasons, it used to be general practice to employ dedicated hardware for executing block matching. Today, however, advances in processor technology and the emergence of simplified block matching techniques have made it possible for a general purpose processor to carry out the block matching process. As mentioned earlier, SIMD type processors are used advantageously to perform block-by-block processing such as block matching.
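The counts in this example can be checked by hand. The figure of 289 patterns corresponds to a 17×17 grid of integer displacements, that is, offsets from −8 to +8 in each direction. The symbol names below are labels introduced here for the check, not notation from the original text.

```latex
\begin{align*}
N_{\text{blocks}}   &= \frac{176}{16} \times \frac{144}{16} = 11 \times 9 = 99 \\
N_{\text{patterns}} &= (2 \cdot 8 + 1)^2 = 17^2 = 289 \\
N_{\text{ops}}      &= N_{\text{patterns}} \times N_{\text{blocks}} \times 256
                     = 289 \times 99 \times 256 = 7{,}324{,}416
\end{align*}
```

Each of the three operation types (subtraction, absolute value, addition) is therefore performed roughly 7.3 million times per frame in this small-picture case.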
A conventional SIMD type parallel processing processor will now be described with reference to FIG. 3.
FIG. 3 is a block diagram of a conventional parallel processing processor. The processor works as follows: instructions to be executed are first sent from an external memory 130 to an instruction fetch circuit 110 over a processor-to-main memory bus 180. The instruction fetch circuit 110 includes an instruction memory for instruction storage, a program counter, and an adder for controlling the address in a register in the program counter. The instruction fetch circuit 110 supplies an instruction decoder 120 with the received instructions in the order in which they are to be executed. Every time an instruction is received, the instruction decoder 120 decodes it to find such information as the type of operation, a read address and a write address. The information is transferred to a control circuit 140 and a general purpose register 150. Each instruction is then processed by the general purpose register 150, a SIMD type ALU 160 and a data memory 170 according to control information (141, 142, 143) from the control circuit 140. For purpose of simplification and illustration, it is assumed that the parallel processing processor shown in FIG. 3 has four SIMD type ALUs for concurrent processing of four pixels.
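The effect of four ALUs operating on four pixels under one instruction may be pictured with the following C sketch. It is an emulation under stated assumptions, not a description of the circuit of FIG. 3: four unsigned 8-bit pixels are assumed to be packed into one 32-bit word, the function names `absdiff_4x8` and `sum_4x8` are hypothetical, and the lane loop runs sequentially here whereas the hardware would process the four lanes concurrently.

```c
#include <stdint.h>

/* Lane-wise |a - b| on four unsigned 8-bit pixels packed into a 32-bit word.
   The loop stands in for four ALUs working at the same time. */
uint32_t absdiff_4x8(uint32_t a, uint32_t b)
{
    uint32_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        int shift = 8 * lane;
        int pa = (int)((a >> shift) & 0xFFu);
        int pb = (int)((b >> shift) & 0xFFu);
        int d = (pa > pb) ? (pa - pb) : (pb - pa);
        out |= (uint32_t)d << shift;
    }
    return out;
}

/* Horizontal add of the four 8-bit lanes, as needed when accumulating
   a sum of absolute differences. */
unsigned sum_4x8(uint32_t v)
{
    return (unsigned)((v & 0xFFu) + ((v >> 8) & 0xFFu) +
                      ((v >> 16) & 0xFFu) + (v >> 24));
}
```

With such a pair of operations, one row of a 16×16 block takes four packed absolute-difference steps instead of sixteen scalar ones, which is the kind of saving the four-ALU assumption is meant to illustrate.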
Described below is typical processing of block matching by use of the C language and an assembler code.
C code 1 below is an example of a block matching algorithm for one block, written in C language. It is assumed that the block size is 16×16 pixels and that a vector (vec_x, vec_y) is detected as representative of a motion vector when a value “error” becomes the smallest.
C code 1: an example of block matching
    for (vec_y = -16; vec_y < 16; vec_y++)
        for (vec_x = -16; vec_x < 16; vec_x++) {
            error = 0;
            for (i = 0; i < 16; i++)
                for (j = 0; j < 16; j++) {
                    /* current : current frame, reference : reference frame */
                    /* (x, y) : top left pixel position in block */
                    error += abs(current(x+j, y+i)
                                 - reference(x+j+vec_x, y+i+vec_y));
                }
        }
where the “for” statements describe loops in C language. The two outer “for” statements spe

Affiliated with

Kimura Jun-ichi

Inventor

Suzuki Yoshinori

Inventor

Also associated with

Mattingly Stanger & Malur, P.C.

Law Firm

Renesas Technology Corp.

Corporate Assignee

Tung Kee M.

Examiner
