Electrical computers and digital processing systems: processing – Processing control – Branching
Reexamination Certificate
1998-07-14
2001-04-03
Vu, Viet D. (Department: 2758)
C712S245000
active
06212629
ABSTRACT:
The following U.S. patents and applications are assigned to the assignee of the present invention and are incorporated herein by reference for all purposes:
U.S. Pat. No. 5,226,126, issued Jul. 06, 1993, for PROCESSOR HAVING PLURALITY OF FUNCTIONAL UNITS FOR ORDERLY RETIRING OUTSTANDING OPERATIONS BASED UPON ITS ASSOCIATED TAGS (the '126 patent);
U.S. Pat. No. 5,394,351, issued Feb. 28, 1995, for OPTIMIZED BINARY ADDER AND COMPARATOR HAVING AN IMPLICIT CONSTANT FOR AN INPUT;
U.S. Pat. No. 5,230,068, issued Jul. 20, 1993, for CACHE MEMORY SYSTEM FOR DYNAMICALLY ALTERING SINGLE CACHE MEMORY LINE AS EITHER BRANCH TARGET ENTRY OR PRE-FETCH INSTRUCTION QUEUE BASED UPON INSTRUCTION SEQUENCE;
U.S. Pat. No. 5,226,130, issued Jul. 06, 1993, for METHOD AND APPARATUS FOR STORE-INTO-INSTRUCTION-STREAM DETECTION AND MAINTAINING BRANCH PREDICTION CACHE CONSISTENCY;
U.S. Pat. No. 5,163,140, issued Nov. 10, 1992, for TWO-LEVEL BRANCH PREDICTION CACHE;
U.S. Pat. No. 5,093,778, issued Mar. 03, 1992, for INTEGRATED SINGLE-STRUCTURE BRANCH PREDICTION CACHE;
U.S. application Ser. No. 08/340,183, filed Nov. 15, 1994, for IMPROVED PIPELINE THROUGHPUT VIA PARALLEL OUT-OF-ORDER EXECUTION OF ADDS AND MOVES IN A SUPPLEMENTAL INTEGER EXECUTION UNIT;
U.S. application Ser. No. 08/185,488, filed Jan. 21, 1994, for SUPERSCALAR EXECUTION UNIT FOR SEQUENTIAL INSTRUCTION POINTER UPDATES AND SEGMENT LIMIT CHECKS INTEGER EXECUTION UNIT;
U.S. application Ser. No. 08/405,268, filed Mar. 13, 1995, for CACHE CONTROL SYSTEM; and
U.S. application Ser. No. 08/403,011, filed Mar. 10, 1995, now U.S. Pat. No. 5,583,806 for OPTIMIZED BINARY ADDER FOR CONCURRENTLY GENERATING EFFECTIVE AND INTERMEDIATE ADDRESSES.
A microfiche appendix comprising two sheets with 155 frames is included in the application.
BACKGROUND OF THE INVENTION
The present invention relates generally to computers, and more particularly to techniques for advanced processor design and operation.
The explosion of the Personal Computer industry has until recently been fueled primarily by the 68000 family incorporated into most Apple Macintosh personal computers and the x86 family incorporated into most IBM-PC compatible products (PCs). The IBM-PC quickly achieved a dominant position due to its open architecture that allowed a host of vendors to make compatible peripherals and system units. The PC is now a mass-market consumer item.
The initial IBM PC was designed around the 8088, a microprocessor manufactured by Intel with a 16-bit internal architecture and an 8-bit external bus. Subsequent advances in microprocessor fabrication and design were incorporated into later microprocessors developed by NexGen, AMD, Cyrix, Intel, and others. These microprocessors became the engines for more advanced versions of the IBM-compatible PC.
Early on it was recognized that the x86 basic architecture had a number of technical limitations. A number of these limitations are related to the x86's use of a CISC (Complex Instruction Set Computer) Architecture. The CISC architecture requires that the processor hardware be built to execute a large number of complex instructions. Many of these instructions are infrequently used but have design consequences that slow down all instructions, due to, for example, complexities that must be introduced in the decoder and datapath timing. The hardware needed to implement these infrequently used instructions results in a poor use of silicon resources—resources which could be better used doing such things as aggressive instruction prefetch. Advanced microprocessor techniques such as pipelining and superscalar decoding are difficult to implement on a CISC-type architecture. Furthermore, it is difficult to design optimizing compilers that make effective use of more than just a subset of the CISC instructions. As a result, CISC processors are often not able to enjoy the benefits of advanced static instruction scheduling.
In addition, the x86 architecture includes a number of limitations above and beyond those inherent in CISC. Among these additional limitations are a limited number of on-chip registers, variable length instructions included in the instruction set, non-consistent field encodings, and a requirement that interrupts be precise (be generated and handled a determined number of instructions from the instruction that caused the interrupt). Additionally, the x86's segmented memory architecture, protection mode features, and compatible paging make it especially difficult to apply advanced microprocessor techniques.
Some aspects related to the x86 architecture are discussed in two U.S. Patents assigned to Intel: U.S. Pat. Nos. 4,972,338 and 5,321,836, which are incorporated herein by reference to the extent necessary to understand those parts of this disclosure related to the x86 architecture.
With respect to the microprocessors disclosed in '338 and '836, pipelining is really not done in the contemporary sense. However, instruction fetch and instruction execution are loosely coupled, permitting parallel instruction fetch and execution. Furthermore, microprocessors designed during the associated period typically relied heavily on microcode implementations that resulted in multiple cycles per instruction to execute and had a single conventional register file in the execution unit. Additional limitations include that the address generation is implemented with sequential generation of effective and intermediate addresses and the bus interface has a flat memory hierarchy with no explicit cache subsystem.
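To illustrate what sequential address generation implies, the following is a minimal sketch, assuming the standard 32-bit x86 addressing form in which an effective address (base + index × scale + displacement) is computed first and is then added to a segment base to form a linear address. The function names are hypothetical and chosen only for illustration, and the sketch does not attempt to define the term "intermediate address" as used in this disclosure.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit addressing, results wrap modulo 2**32


def effective_address(base, index, scale, disp):
    """Offset within a segment: base + index*scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32


def linear_address(segment_base, ea):
    """Second add: segment base + effective address."""
    return (segment_base + ea) & MASK32


def sequential_address_generation(segment_base, base, index, scale, disp):
    """Hypothetical helper: the two adds run one after the other."""
    ea = effective_address(base, index, scale, disp)  # step 1
    la = linear_address(segment_base, ea)             # step 2 depends on step 1
    return ea, la
```

Because the second add consumes the result of the first, a design with a single adder spends at least two passes through that adder before a memory access can begin.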
The Intel patents also discuss an architecture that is lacking in many advanced features. The architecture is scalar, meaning that there is just one integer add unit, for example, and that it must be accessed over a number of cycles to execute one instruction so that all of the adds associated with address computation, operand fetching, and the operation on the operand can be performed. It is believed that there is no pipeline within any of the functional blocks of the microprocessors discussed in the patents and no queues other than for instruction fetch. Execution of instructions is in-order and the use of hardware in the processor is sequential.
Despite these numerous limitations, the x86 has remained the industry standard primarily due to market factors. There currently exists in the market a massive installed base of both hardware and software compatible with the x86. Virtually all PC software is distributed in binary form only, so that end users cannot recompile their software to target different architectures. The typical user has a large investment in all software types: operating systems (OS), application programs, device drivers, and utilities. Further major investments are likely to exist in multiple system units and compatible peripherals. As a result, the cost of switching to a new architecture can be quite daunting.
What is needed is a collection of microprocessor structures and techniques that permit an advanced microprocessor to maintain compatibility with the large installed base of x86 hardware and software while overcoming the limitations of the x86 architecture to provide performance competitive with that of contemporary processors implementing other architectures.
SUMMARY OF THE INVENTION
The present invention extracts to a high degree the available inter-instruction and intra-instruction parallelism present in existing complex-instruction software. Through dynamic instruction scheduling of existing complex-instruction binaries, it offers execution speed competitive with that available through static instruction scheduling via recompilation of the corresponding complex-instruction source code. The present invention extends the competitive lifetime of the industry-standard x86 architecture through pervasive application of RISC techniques throughout all levels of processor microarchitecture.
The present invention provides semi-autonomous RISC pipelines that perform overlapped execution of RISC-like instructions within the multiple superscalar execution units of a processor having distributed pipeline control for sp
Cargnoni Robert A.
Favor John Gregory
Greenley Dale R.
McFarland Harold L.
Mehta Shrenik
Advanced Micro Devices, Inc.
Townsend and Townsend and Crew LLP
Vu Viet D.
Method and apparatus for executing string instructions