Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
1999-06-03
2002-12-03
Kim, Matthew (Department: 2186)
Electrical computers and digital processing systems: memory
Storage accessing and control
Hierarchical memories
C711S122000, C711S137000
Reexamination Certificate
active
06490653
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to a superscalar processor and more particularly to optimally issuing dependent instructions in such a system.
BACKGROUND OF THE INVENTION
Superscalar processors employ aggressive techniques to exploit instruction-level parallelism. Wide dispatch and issue paths place an upper bound on peak instruction throughput. Large issue buffers are used to maintain a window of instructions necessary for detecting parallelism, and a large pool of physical registers provides destinations for all of the in-flight instructions issued from the window beyond the dispatch boundary. To enable concurrent execution of instructions, the execution engine is composed of many parallel functional units. The fetch engine speculates past multiple branches in order to supply a continuous instruction stream to the decode, dispatch and execution pipelines, thereby maintaining a large window of potentially executable instructions.
The trend in superscalar design is to scale these techniques: wider dispatch/issue, larger windows, more physical registers, more functional units, and deeper speculation. To maintain this trend, it is important to balance all parts of the processor, since any bottleneck diminishes the benefit of these aggressive techniques.
Instruction fetch performance depends on a number of factors. Instruction cache hit rate and branch prediction accuracy have long been recognized as important problems in fetch performance and are well-researched areas.
Modern microprocessors routinely use a plurality of mechanisms to improve their ability to efficiently fetch past branch instructions. These prediction mechanisms allow a processor to fetch beyond a branch instruction before the outcome of the branch is known. For example, some mechanisms allow a processor to speculatively fetch beyond a branch before the branch's target address has been computed. These techniques use run-time history to speculatively predict which instructions should be fetched and eliminate “dead” cycles that might normally be wasted waiting for the actual determination of the next instruction address. Even with these techniques, current microprocessors are limited in the number of instructions they can fetch during a clock cycle. As superscalar processors become more aggressive and attempt to execute many more instructions per cycle, they must also be able to fetch many more instructions per cycle.
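As a rough illustration of how run-time history can steer speculative fetch, the sketch below models a simple two-bit saturating-counter branch predictor. The table size, indexing scheme, and class name are illustrative assumptions and are not part of this disclosure.

```python
# Minimal sketch of a two-bit saturating-counter branch predictor.
# The 1024-entry table and PC-indexing scheme are illustrative assumptions.

class TwoBitPredictor:
    def __init__(self, entries=1024):
        self.entries = entries
        # Counter states: 0-1 predict not-taken, 2-3 predict taken.
        self.table = [1] * entries

    def _index(self, pc):
        # Drop the byte offset and wrap the program counter into the table.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return True to speculatively fetch the taken path."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the counter once the branch outcome is actually known."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
```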
High performance superscalar processor organizations divide naturally into an instruction fetch mechanism and an instruction execution mechanism. The fetch and execution mechanisms are separated by instruction issue buffer(s), for example, queues, reservation stations, etc. Conceptually, the instruction fetch mechanism acts as a “producer” which fetches, decodes, and places instructions into a reorder buffer, and the instruction execution engine prepares instructions for completion. The execution engine is the “consumer” which removes instructions from the buffer and executes them, subject to data dependence and resource constraints. Control dependencies (branches and jumps) provide a feedback mechanism between the producer and consumer.
Dispatching and completion of instructions are typically in program order. However, issuance and execution are not necessarily in program order. An instruction is dispatched to an issue queue for a particular execution unit, or at least a particular type of execution unit (also known as a functional unit). A load/store unit is a type of functional unit for executing memory accesses. An issue queue issues an instruction to its functional unit responsive to the instruction's operands being available for execution, i.e., when results are available from any earlier dispatched instructions upon which the instruction is dependent.
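The behavioral sketch below illustrates this readiness rule: entries are dispatched in program order, but an entry is issued, possibly out of order, only once every source operand it reads has been produced. The class names and the scoreboard-style ready set are illustrative assumptions, not structures recited by the patent.

```python
# Behavioral sketch of issue-when-operands-ready scheduling.
# Register "tags" stand in for physical registers; all names are illustrative.

from dataclasses import dataclass

@dataclass
class Instruction:
    opcode: str
    dest: str            # destination register tag produced by this instruction
    sources: tuple = ()  # source register tags this instruction reads

class IssueQueue:
    def __init__(self):
        self.entries = []          # dispatched, not yet issued (program order)
        self.ready_regs = set()    # tags whose producing instructions finished

    def dispatch(self, inst):
        """Dispatch happens in program order."""
        self.entries.append(inst)

    def broadcast_result(self, reg_tag):
        """A functional unit announces that a result tag is now available."""
        self.ready_regs.add(reg_tag)

    def issue_ready(self):
        """Issue, possibly out of program order, every entry whose sources are ready."""
        issued, remaining = [], []
        for inst in self.entries:
            if all(s in self.ready_regs for s in inst.sources):
                issued.append(inst)
            else:
                remaining.append(inst)
        self.entries = remaining
        return issued
```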
SUMMARY OF THE INVENTION
In a high-speed, highly speculative processor, groups of instructions are issued based on interdependencies. Some operations, such as Load instructions, can have variable and unpredictable latency, which makes interdependency analysis difficult. A solution is needed that improves the performance of instruction groups dependent on Load operands. More particularly, what is needed is a system and method for efficiently issuing dependent instructions in such a processor. The present invention addresses such a need.
A method for optimally issuing instructions that are related to a first instruction in a data processing system is disclosed. The processing system includes a primary and a secondary cache. The method and system comprise speculatively indicating a hit of the first instruction in the secondary cache and releasing the dependent instructions. The method and system include determining whether the first instruction is within the secondary cache. The method and system further include providing data related to the first instruction from the secondary cache to the primary cache when the first instruction is within the secondary cache.
A method and system in accordance with the present invention causes instructions that create dependencies (such as a load instruction) to signal an issue queue (which is responsible for issuing instructions with resolved conflicts) in advance that the instruction will complete in a predetermined number of cycles. In an embodiment, a core interface unit (CIU) will signal an execution unit such as the Load Store Unit (LSU) that it is assumed that the instruction will hit in the L2 cache. An issue queue uses the signal to issue dependent instructions at an optimal time. If the instruction misses in the L2 cache, the cache hierarchy causes the instructions to be abandoned and re-executed when the data is available.
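A minimal behavioral sketch of this idea follows, assuming a fixed load-to-use latency whenever an L2 hit is predicted: dependents of a load are woken up early on the speculative hit signal, and are flushed for re-execution if the load actually misses. The cycle count, class and method names, and replay policy are illustrative assumptions rather than the patent's implementation.

```python
# Sketch of speculative wakeup on an assumed L2 hit, with replay on a miss.
# The 4-cycle assumed latency and the replay policy are illustrative assumptions.

ASSUMED_L2_HIT_LATENCY = 4   # cycles from issue of the load until data is assumed ready

class SpeculativeLoadScheduler:
    def __init__(self):
        self.pending = {}        # load tag -> cycle at which data is assumed ready
        self.speculative = {}    # load tag -> dependents issued under the hit assumption

    def issue_load(self, load_tag, current_cycle):
        """Record the CIU-style hint: assume the load will hit in the L2 cache."""
        self.pending[load_tag] = current_cycle + ASSUMED_L2_HIT_LATENCY
        self.speculative[load_tag] = []

    def wake_dependent(self, load_tag, dep_inst, current_cycle):
        """Issue a dependent early once the assumed hit latency has elapsed."""
        ready_cycle = self.pending.get(load_tag)
        if ready_cycle is not None and current_cycle >= ready_cycle:
            self.speculative[load_tag].append(dep_inst)
            return True          # issued speculatively
        return False             # still waiting on the load

    def resolve(self, load_tag, hit):
        """Called when the L2 lookup actually completes."""
        deps = self.speculative.pop(load_tag, [])
        self.pending.pop(load_tag, None)
        if hit:
            return []            # speculation was correct; nothing to redo
        return deps              # miss: these must be abandoned and re-executed
```

In this sketch, the list returned by resolve() on a miss stands in for the instructions the cache hierarchy would cause to be abandoned and re-executed once the data becomes available.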
REFERENCES:
patent: 5471598 (1995-11-01), Quattromani et al.
patent: 5584009 (1996-12-01), Garibay, Jr. et al.
patent: 5596731 (1997-01-01), Martinez, Jr. et al.
patent: 5737590 (1998-04-01), Hara
Cargnoni Robert Alan
Ronchetti Bruce Joseph
Shippy David James
Thatcher Larry Edward
Anderson Matthew D.
Kim Matthew
Salys Casimer K.
Sawyer Law Group LLP