Read/write alignment scheme for port reduction of multi-port...

Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Reexamination Certificate

Filed: 2001-04-03
Issued: 2004-08-31
Examiner: Padmanabhan, Mano (Department: 2188)
Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Class codes: C711S150000, C711S168000, C711S169000, C365S189040, C365S230050
Type: Reexamination Certificate
Status: active
Patent number: 06785781
ABSTRACT:

PRIOR FOREIGN APPLICATION
This application claims priority from European patent application number 00108699.0, filed Apr. 20, 2000, which is hereby incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present invention relates to improvement of storage devices in computer systems and in particular, it relates to an improved method and system for efficiently accessing multi-port cell array circuitry.
BACKGROUND ART
In modern computer processor architecture development, an increasing portion of processor work continues to be parallelized. Parallelization requires that a growing number of processing sub-units be able to access one and the same storage location in order to compute as quickly as possible. Such a storage location therefore requires multiple read/write accessibility.
An example is out-of-order processing. Writing data into arrays of such storage locations in parallel from multiple sources, or reading data from arrays in parallel to multiple targets, requires multi-port cells.
The area and performance of such an array is mainly determined by the number of ports per cell and not by the data size to be stored. More precisely, the area consumption of such an array is nearly proportional to the square of the number of ports implemented.
A storage cell needs m read ports to be readable concurrently by m different reading targets, and n write ports for n write sources to write into the cell. Each port comprises a data line and a select line that are orthogonal to each other, so the area consumption increases markedly with increasing m or n. For example, if a given array with m=n=1 (two ports) has an area consumption of X, and that array is replaced by a multiple-access array with m=n=4 (eight ports), the resulting area consumption is about (8×8)/(2×2)=16 times higher, i.e., 16X. Thus, increasing parallelization requires a large additional area consumption on any processor chip.
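The scaling argument can be checked with a short calculation. The following is a minimal sketch that assumes the idealized model stated in the text (area proportional to the square of the total port count per cell); the function name `relative_area` and its parameters are illustrative and do not come from the patent.

```python
def relative_area(read_ports: int, write_ports: int,
                  base_read_ports: int = 1, base_write_ports: int = 1) -> float:
    """Area of a multi-port array relative to a baseline array.

    Assumption: area grows with the square of the total number of
    ports per cell (m read ports plus n write ports), as the text states.
    """
    ports = read_ports + write_ports
    base_ports = base_read_ports + base_write_ports
    return (ports * ports) / (base_ports * base_ports)


# m = n = 4 (eight ports) compared with m = n = 1 (two ports)
print(relative_area(4, 4))  # 16.0
```

Under this model, halving the port count of a cell cuts the array area to roughly a quarter, which is why port reduction is the lever the text focuses on.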
Although the present invention has a broad field of application, since improving or optimizing storage strategies is a very general aim in computer technology, it will be described and discussed with prior art technology in a special field of application, namely the use of a so-called instruction window buffer, abbreviated below as IWB. Such a buffer is present in most modern computer systems to enable parallel processing of program instructions by a plurality of processing units. Such processors are referred to herein as out-of-order processors.
In many modern out-of-order processors such a buffer is used to hold all the instructions and/or register contents until the calculated results can be committed and removed from the buffer. When results were calculated speculatively beyond the outcome of a branch instruction, they can be rejected once the branch prediction turns out to be wrong, simply by clearing these entries from the buffer and overwriting them with new, correct instructions. This is one prerequisite for out-of-order processing. One main parameter influencing the performance of the processor is the buffer size: a big buffer can contain many more instructions and results and therefore allows more out-of-order processing. One design objective therefore is to have a big buffer. This, however, conflicts with other design requirements such as cycle time and buffer area. When, for example, the buffer is dimensioned too large, the effort required to manage such a large number of storage locations decreases the performance of the buffer. Furthermore, increased buffer size implies an increased signal propagation delay. Thus, generally, any improved storage method has to find a good compromise between buffer size, storage management and, with it, storage access speed.
The present invention primarily covers the buffer size and the associated signal propagation delay.
A prior art instruction window buffer as it is disclosed in U.S. Pat. No. 5,923,900, “Circular Buffer With N Sequential Real And Virtual Entry Positions For Selectively Inhibiting N Adjacent Entry Positions Including The Virtual Entry Position”, which is hereby incorporated herein by reference in its entirety, is operated according to the following write/read schemes:
With reference to FIG. 1 (prior art), in order to write a package of instructions as depicted in the upper portion of the figure, for example a package of 4 unresolved instructions uip(0:3), into an array in one cycle during the dispatch process, a cell is needed with as many write ports as the maximum package size, i.e., k1=4 in this case.
A write decode block 22 translates the write address in(0:5), via control line 16, into input pointers wsel0 . . . wsel3 (0:3) selecting a block of four entries to be written, namely the array entries i, i+1, i+2, i+3. This is depicted schematically in FIG. 1. The first instruction uip0 is written into cell(i) by activating wsel0 on input port di0, the next instruction uip1 is written into cell(i+1) by activating wsel1 on input port di1, and so on; see the filled circles.
This scheme guarantees that the data is written consecutively into the array. As buffer memories in general are often used in a wrap-around way of operation some special care is required to cover this case, too.
The wrap-around case is handled by the write decoder 22 as well. If, for example, the window buffer has a total size of 64 entries and a block of four subsequent entries is to be written starting at entry 62, then wsel(0:3) point to entries (62, 63, 0, 1).
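The consecutive write scheme and its wrap-around case can be sketched as modular address arithmetic. This is only an illustrative software model, not the decoder circuit itself; the function name `decode_write_selects` and the constants are assumptions introduced here, with the sizes (64 entries, packages of 4) taken from the example in the text.

```python
BUFFER_ENTRIES = 64  # total window buffer size used in the text's example
PACKAGE_SIZE = 4     # k1: maximum number of instructions written per cycle


def decode_write_selects(write_address: int,
                         buffer_entries: int = BUFFER_ENTRIES,
                         package_size: int = PACKAGE_SIZE) -> list[int]:
    """Return the entry index selected by each write port in one cycle.

    Write port p (0 .. package_size-1) targets entry
    (write_address + p) modulo the buffer size, so a block that runs
    past the last entry continues at entry 0.
    """
    if not 0 <= write_address < buffer_entries:
        raise ValueError("write address outside the buffer")
    return [(write_address + port) % buffer_entries
            for port in range(package_size)]


print(decode_write_selects(62))  # [62, 63, 0, 1]
```

In this model the list position corresponds to the write port (di0 to di3), which is why every cell needs all four write ports in the prior art scheme: the starting address i can be any entry.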
The read case is similar, as can be seen from FIG. 2, which depicts the prior art issue filters if0 to if3 controlling an array of 4-read-port cells by read select lines rsel0(0 . . . 63), rsel1(0 . . . 63), rsel2(0 . . . 63), rsel3(0 . . . 63). The data is read to several data output ports, i.e., Do(0:3), not explicitly depicted. As many read ports are needed as execution units exist, i.e., instruction execution units (ieu) ieu(0:3), in order to get full parallelism and provide data for all execution units every cycle for the issue process. A routing network can connect each output port of the buffer with each execution unit. An arbitration logic is provided for connecting a particular port with the desired execution unit.
In particular, the instructions ready for execution are identified by valid bits, depicted in the upper line of FIG. 2, which are passed to the four different issue filters if(0:3). Filter if0 selects the oldest of all instructions 0 . . . 63 ready for execution, activates rsel0 and thereby sends the data to the execution units. Filter if1 ignores the entry detected by if0 and selects the second oldest, activates rsel1 and sends it to the execution units, and so on.
Since any entry of the 64 total entries of the buffer can be first, second, third or fourth selected, any entry and therefore any cell needs 4 read ports. This results in an extremely high area consumption and an associated large signal propagation delay.
SUMMARY OF THE INVENTION
It is thus an objective of the present invention to decrease area consumption and thus increase the efficiency of storage area utilization.
This objective of the invention is achieved by the features stated in enclosed independent claims. Further advantageous arrangements and embodiments of the invention are set forth in the respective subclaims.
A considerable amount of area can be saved according to the present invention by reducing the number of write ports to the number k1 of concurrently intended write accesses and the number of output ports to the number k2 of concurrently intended read accesses to the array. This remarkable reduction of ports and thus an extraordinary associated area saving can be achieved when the intended array ‘natural’ operation

Affiliated with

Leenstra Jens

Inventor

Pille Juergen

Inventor

Sautter Rolf

Inventor

Wendel Dieter

Inventor

Also associated with

Augspurger, Esq. Lynn L.

Attorney

Heslin Rothenberg Farley & Mesiti P.C.

Law Firm

International Business Machines Corporation

Corporate Assignee

Padmanabhan Mano

Examiner

Schiller, Esq. Blanche E.

Attorney
