Electrical computers and digital processing systems: processing – Processing architecture – Array processor
Reexamination Certificate
1999-03-08
2001-10-23
Hjerpe, Richard (Department: 2783)
C712S028000, C712S011000, C712S010000, C712S022000
active
06308251
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a parallel processor apparatus, and more particularly to a parallel processor apparatus capable of reducing power consumption and improving operating speed when converting serial data to parallel data or parallel data to serial data.
2. Description of the Related Art
When image data are processed digitally, similar processing is in many cases applied to all of the pixel data composing an image.
In order to execute similar processing with respect to a plurality of data at a high speed, a parallel processor apparatus adopting a single instruction multiple data stream (SIMD) type architecture has been proposed. This is being utilized in a wide range of fields not limited to image data processing.
In an SIMD type architecture, exactly the required number of processor elements are arranged. Each processor element operates according to the same command.
Accordingly, when different data are simultaneously given to the processor elements, the results of processing with respect to the individual data are simultaneously obtained. As a parallel processor apparatus adopting such an SIMD type architecture applied to image data processing, there is known, for example, the parallel processor apparatus shown in SVP (SERIAL VIDEO PROCESSOR, Proceedings of the IEEE 1990 CUSTOM INTEGRATED CIRCUITS CONFERENCE, p. 17, 3.1 to 4).
This parallel processor apparatus, as shown for example in FIG. 11, is provided with a data input register 102, processor elements (PE) 3_1 to 3_n, and a data output register 104.
Here, the data input register 102 sequentially receives as input one sweep line's worth of pixel data as serial data S_IN and outputs the one sweep line's worth of pixel data to the individual processor elements 3_1 to 3_n. The processor elements 3_1 to 3_n respectively process the one sweep line's worth of pixel data in parallel. The data output register 104 receives as input the processed one sweep line's worth of pixel data in parallel from the processor elements 3_1 to 3_n and sequentially outputs the same as the serial data S_OUT.
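The data flow just described (serial in, parallel processing, serial out) can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names `process_line` and `invert` are hypothetical, pixels are assumed to be 8-bit integers, and the per-pixel operation is an arbitrary example rather than anything specified by the source.

```python
from typing import Callable, List


def process_line(serial_in: List[int], op: Callable[[int], int]) -> List[int]:
    """Model one sweep line passing through register 102, the PEs, and register 104."""
    # Data input register: collect the serial pixels so that each PE gets one pixel.
    input_register = list(serial_in)
    # Processor elements: every PE applies the same operation to its own pixel.
    pe_results = [op(pixel) for pixel in input_register]
    # Data output register: latch the results in parallel, then emit them in order.
    output_register = list(pe_results)
    return output_register


def invert(pixel: int) -> int:
    """Hypothetical example operation on an assumed 8-bit pixel value."""
    return 255 - pixel


if __name__ == "__main__":
    print(process_line([0, 64, 128, 255], invert))  # prints [255, 191, 127, 0]
```

The list comprehension stands in for the n processor elements executing one shared command on n different pixels.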
The routine for processing image data composed of pixel data of m×n pixels p(1,1) to p(m,n) arranged in the form of a matrix as shown in FIG. 12 in such a parallel processor apparatus 101 will be explained below by referring to FIGS. 14A to 14C.
Here, the pixel data of the pixel p(i,j) of any i,j (where 1≦i≦m, 1≦j≦n) can be expressed by using a plurality of bits.
In FIG. 12, pixels are usually swept in order from the left to the right and from the top to the bottom; therefore, the image data are generally transmitted in the format shown in FIG. 13. Here, the period for sweeping one line's worth of pixel data will be referred to as a “horizontal sweep duration”. Further, the period for the sweep to return from a right end of a certain line of the screen to a left end of a next line will be referred to as a “horizontal blanking duration”. For example, there is a horizontal blanking duration between the pixel p(i,n) of the right end of an i-th line and the pixel p(i+1,1) of the left end of the next line.
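As a rough illustration of this transmission format, the sketch below emits pixels in sweep order and marks each horizontal blanking duration with `None`. The function name `raster_stream` and the `blanking_cycles` parameter are hypothetical, and for simplicity a blanking gap is also emitted after the final line.

```python
from typing import Iterator, List, Optional


def raster_stream(image: List[List[int]], blanking_cycles: int) -> Iterator[Optional[int]]:
    """Yield pixels left to right, top to bottom, with blanking gaps after each line."""
    for row in image:
        # Horizontal sweep duration: one line's worth of pixel data.
        for pixel in row:
            yield pixel
        # Horizontal blanking duration: no pixel data is transmitted.
        for _ in range(blanking_cycles):
            yield None


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6]]
    print(list(raster_stream(image, 2)))
    # prints [1, 2, 3, None, None, 4, 5, 6, None, None]
```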
In FIGS. 14A to 14C, for example, image data comprised of pixel data composed in turn of a plurality of bits are sequentially input to the input terminals of the processor elements in units of the pixel data. The pixel data of the first line are stored in a data input register 102 shown in FIG. 11 having a storage capacity of one line's worth of the image data in a first horizontal sweep duration (S1). Then, the pixel data of the first line stored in the data input register 102 are sequentially output to the processor elements 3_1 to 3_n within the next horizontal blanking duration (S2) so that one pixel's worth of the pixel data is supplied to one processor element.
Next, in the horizontal sweep duration (S3), each of the processor elements 3_1 to 3_n performs processing with respect to the supplied one line's worth of the pixel data. Further, simultaneously with this, pixel data of the second line are sequentially input to the data input register 102. Then, within the succeeding horizontal blanking duration (S4), the processed pixel data of the first line are supplied from the processor elements 3_1 to 3_n to the data output register 104 in parallel. Simultaneously with this, the pixel data of the second line are supplied from the data input register 102 to the processor elements 3_1 to 3_n in parallel. Then, in the next horizontal sweep duration (S5), the pixel data of the first line stored in the data output register 104 are sequentially output to the output terminals of the processor elements. Simultaneously with this, the processor elements 3_1 to 3_n process the pixel data of the second line, and the pixel data of a third line are sequentially input to the data input register 102.
After this, when the processor elements 3_1 to 3_n process the pixel data of the i-th line, the operation of the data input register 102 receiving as input the pixel data of the (i+1)th line and the data output register 104 outputting the pixel data of the (i−1)th line is repeated. In this way, the data input register 102, the processor elements 3_1 to 3_n, and the data output register 104 operate in synchronization, whereby image data processed for every horizontal sweep duration is output.
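The overlapped schedule of steps S1 to S5 behaves like a three-stage pipeline. The following sketch models it at line granularity, assuming that each loop iteration is one horizontal sweep duration followed by one horizontal blanking duration. The function name `run_pipeline` and its variables are hypothetical, and two empty sweep periods are appended only to flush the last lines out of the model.

```python
from typing import Callable, List, Optional

Line = List[int]


def run_pipeline(lines: List[Line], op: Callable[[int], int]) -> List[Line]:
    """Simulate the overlap of input register 102, the PEs, and output register 104."""
    pe_data: Optional[Line] = None      # line currently held by the processor elements
    output_reg: Optional[Line] = None   # line currently held by the data output register
    emitted: List[Line] = []

    # Two extra empty periods drain the pipeline after the last real line.
    schedule: List[Optional[Line]] = list(lines) + [None, None]
    for incoming in schedule:
        # Horizontal sweep duration: three activities take place at the same time.
        if output_reg is not None:
            emitted.append(output_reg)              # serial output of line i-1
        processed = None
        if pe_data is not None:
            processed = [op(p) for p in pe_data]    # PEs process line i
        input_reg = incoming                        # serial input of line i+1

        # Horizontal blanking duration: parallel transfers between the stages.
        output_reg = processed
        pe_data = input_reg
    return emitted


if __name__ == "__main__":
    print(run_pipeline([[1, 2], [3, 4]], lambda p: p * 2))  # prints [[2, 4], [6, 8]]
```

Each line appears at the output two sweep durations after it was received, which matches the first line being input in S1 and output in S5.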
Below, a detailed explanation will be given of the data input register 102.
As shown in FIG. 15, the data input register 102 is constituted by a pointer circuit 105 and a conversion circuit 6.
The pointer circuit 105 can be constituted by using a shift register widely used when performing mutual conversion of serial data and parallel data. The pointer circuit 105 is structured with the unit delay elements 26_1 to 26_n such as D-type flip-flops connected in series, receives as its input a pointer control signal S1 comprised by a clock signal S11 and pointer data S12, and outputs pointer data S26_1 to S26_n to the conversion circuit 6.
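A chain of unit delay elements of this kind can be modeled as a simple shift register. The sketch below is a behavioral approximation only: the class name `PointerCircuit` and its `clock` method are hypothetical, each call to `clock` stands for one pulse of the clock signal S11, and the argument stands for the value of the pointer data S12 at that pulse.

```python
from typing import List


class PointerCircuit:
    """Behavioral model of n unit delay elements connected in series."""

    def __init__(self, n: int) -> None:
        # Outputs of the delay elements, modeling pointer data S26_1 to S26_n.
        self.stages: List[int] = [0] * n

    def clock(self, pointer_in: int) -> List[int]:
        """Apply one clock pulse: each stage takes the value of the stage before it."""
        self.stages = [pointer_in] + self.stages[:-1]
        return list(self.stages)


if __name__ == "__main__":
    pointer = PointerCircuit(4)
    # Assumed stimulus: a single "1" on the first clock, then zeros.
    for bit in [1, 0, 0, 0]:
        print(pointer.clock(bit))
    # prints [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1] on separate lines
```

With that stimulus, exactly one output is “1” at any time, and the “1” advances by one stage per clock pulse.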
The conversion circuit 6 has arranged in parallel first switching means 30_1 to 30_n, memories 31_1 to 31_n, and second switching means 32_1 to 32_n, receives as its inputs serial data S_IN, pointer data S26_1 to S26_n, and a switch control signal S9, and outputs parallel data S6 comprised by data S6_1 to S6_n to the processor elements 3_1 to 3_n.
Here, the signal line of the serial data S_IN and the signal lines of the parallel data S6 have widths or numbers of bits sufficient for expressing the data of one pixel.
The operation of the data input register 102 will be explained next by referring to FIGS. 16A to 16C.
In the conversion circuit 6, among the switching means 30_1 to 30_n, the switching means 30_x for which the corresponding pointer data among S26_1 to S26_n is logic “1” becomes an ON state and stores the corresponding pixel data among the sequentially input serial data S_IN in the memory 31_x. Namely, the logical value “1” is given to the pointer data S12 for only a first clock cycle in the horizontal sweep duration, and a pulse is given to the clock signal S11. If, in synchronization with this, the pixel data of the pixels p(i,1) to p(i,n) of, for example, the i-th line are sequentially given as the serial data S_IN, one line's worth of pixel data is stored in the memories 31_1 to 31_n, one pixel per memory.
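The store operation can be sketched as a one-hot pointer gating which memory captures the current serial pixel. This is a simplified behavioral model, not the circuit itself: the function name `load_input_register` is hypothetical, one loop iteration is assumed to be one clock cycle, and the number of memories is assumed to equal the number of pixels in the line.

```python
from typing import List, Optional


def load_input_register(serial_in: List[int]) -> List[Optional[int]]:
    """Store one line of serial pixels into n memories under control of a one-hot pointer."""
    n = len(serial_in)
    pointer: List[int] = [0] * n               # models pointer data S26_1 to S26_n
    memories: List[Optional[int]] = [None] * n  # models memories 31_1 to 31_n

    for cycle, pixel in enumerate(serial_in):
        # Pointer data S12 is "1" only in the first clock cycle of the sweep.
        pointer_in = 1 if cycle == 0 else 0
        # Clock pulse: shift the pointer by one stage.
        pointer = [pointer_in] + pointer[:-1]
        # Every switch sees the same serial pixel; only the one whose pointer
        # bit is "1" is ON and writes the pixel into its memory.
        for x in range(n):
            if pointer[x] == 1:
                memories[x] = pixel
    return memories


if __name__ == "__main__":
    print(load_input_register([10, 20, 30, 40]))  # prints [10, 20, 30, 40]
```

After n clock cycles, memory x holds the x-th pixel of the line, so the whole line is available in parallel.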
First, the pointer data S26_1 indicates the logical value “1” as shown in FIG. 16A, the switching means 30_1 becomes the ON state, and the pixel data of the pixel p(i,1) is stored in the memory 31_1. At this time, the pointer data S26_2 to S26_n indicate a logical value “0”, and the switching means 30_2 to 30_n become an OFF state.
Next, the pointer data S26_2 indicates the logical value “1” as shown in FIG. 16B, the switching means 30_2 becomes the ON state
Frommer William S.
Frommer Lawrence & Haug LLP.
Hjerpe Richard
Monestime Mackly
Shallenburger Joe H.