Electrical computers and digital processing systems: processing – Instruction fetching – Of multiple instructions simultaneously
Patent
1997-10-07
2000-07-18
Pan, Daniel H.
712/208, 712/215, 712/41, G06F 9/30, G06F 9/40
active
060921811
ABSTRACT:
A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed-length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instructions in order.
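The retirement scheme described in the abstract can be illustrated with a small software model. This is only a sketch under simplifying assumptions, not the patented hardware design: the names `Instr`, `run`, `temp`, and `regfile` are hypothetical, each instruction's result is precomputed, and the out-of-order completion sequence is supplied by the caller instead of being produced by a scheduler. It shows results waiting in temporary registers until every earlier instruction has completed, so that the register file is always updated in program order.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Instr:
    dest: str   # architectural destination register (hypothetical naming)
    value: int  # result the functional unit will produce (precomputed here)


def run(program: List[Instr],
        completion_order: List[int]) -> Tuple[Dict[str, int], List[int]]:
    """Complete instructions in completion_order, retire them in program order."""
    temp: Dict[int, int] = {}     # temporary data registers, keyed by program index
    regfile: Dict[str, int] = {}  # architectural register file
    retired: List[int] = []       # program indices, in the order they retired
    head = 0                      # oldest instruction not yet retired

    for idx in completion_order:
        # A functional unit finishes instruction idx; hold its result temporarily.
        temp[idx] = program[idx].value
        # Retire from the head while the oldest unretired instruction is done.
        while head < len(program) and head in temp:
            regfile[program[head].dest] = temp.pop(head)
            retired.append(head)
            head += 1
    return regfile, retired


program = [Instr("r1", 10), Instr("r2", 20), Instr("r1", 30)]
regfile, retired = run(program, completion_order=[2, 0, 1])
print(retired)   # retirement order follows program order
print(regfile)   # final architectural state
```

In this run the third instruction finishes first, but its result stays in the temporary storage until the first two have retired, so the final value of `r1` is the one written by the last instruction in program order.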
REFERENCES:
patent: 3346851 (1967-10-01), Thornton et al.
patent: 3771138 (1973-11-01), Celtruda et al.
patent: 3789365 (1974-01-01), Jen et al.
patent: 4034349 (1977-07-01), Monaco et al.
patent: 4200927 (1980-04-01), Hughes et al.
patent: 4228495 (1980-10-01), Bernhard et al.
patent: 4296470 (1981-10-01), Fairchild et al.
patent: 4315314 (1982-02-01), Russo
patent: 4410939 (1983-10-01), Kawakami
patent: 4434461 (1984-02-01), Puhl
patent: 4459657 (1984-07-01), Murao
patent: 4476525 (1984-10-01), Ishii
patent: 4626989 (1986-12-01), Torii
patent: 4675806 (1987-06-01), Uchida
patent: 4714994 (1987-12-01), Oklobdzija et al.
patent: 4722049 (1988-01-01), Lahti
patent: 4752873 (1988-06-01), Shonai et al.
patent: 4758948 (1988-07-01), May et al.
patent: 4766566 (1988-08-01), Chuang
patent: 4807115 (1989-02-01), Torng
patent: 4858105 (1989-08-01), Kuriyama et al.
patent: 4897810 (1990-01-01), Nix
patent: 4901228 (1990-02-01), Kodama
patent: 4903196 (1990-02-01), Pomerene et al.
patent: 4924376 (1990-05-01), Ooi
patent: 4926323 (1990-05-01), Baror et al.
patent: 4942525 (1990-07-01), Shintani et al.
patent: 4985825 (1991-01-01), Webb, Jr. et al.
patent: 4992938 (1991-02-01), Coke et al.
patent: 5003462 (1991-03-01), Blaner et al.
patent: 5101341 (1992-03-01), Circella et al.
patent: 5127091 (1992-06-01), Bonford et al.
patent: 5226126 (1993-07-01), McFarland et al.
patent: 5226170 (1993-07-01), Rubinfeld
patent: 5230068 (1993-07-01), Van Dyke et al.
patent: 5355460 (1994-10-01), Eickenmeyer et al.
patent: 5390355 (1995-02-01), Horst
patent: 5442757 (1995-08-01), McFarland et al.
patent: 5487156 (1996-01-01), Popescu et al.
patent: 5539911 (1996-07-01), Nguyen et al.
patent: 5561776 (1996-10-01), Popescu et al.
patent: 5574927 (1996-11-01), Scantlin
patent: 5592636 (1997-01-01), Popescu et al.
patent: 5625837 (1997-04-01), Popescu et al.
patent: 5627983 (1997-05-01), Popescu et al.
patent: 5651125 (1997-07-01), Witt et al.
patent: 5689720 (1997-11-01), Nguyen et al.
patent: 5708841 (1998-01-01), Popescu et al.
patent: 5768575 (1998-06-01), McFarland et al.
patent: 5778210 (1998-07-01), Henstrom et al.
patent: 5797025 (1998-08-01), Popescu et al.
patent: 5832205 (1998-11-01), Kelly et al.
patent: 5832293 (1998-11-01), Popescu et al.
Smith, M.D. et al., "Boosting Beyond Static Scheduling in a Superscalar Processor," IEEE, 1990, pp. 344-354.
Murakami, K. et al., "SIMP (Single Instruction stream/Multiple instruction Pipelining): A Novel High-Speed Single-Processor Architecture," ACM, 1989, pp. 78-85.
Jouppi, N.P., "The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance," IEEE Transactions on Computers, vol. 38, No. 12, Dec. 1989, pp. 1645-1658.
Horst, R.W. et al., "Multiple Instruction Issue in the NonStop Cyclone Processor," IEEE, 1990, pp. 216-226.
Goodman, J.R. and Hsu, W., "Code Scheduling and Register Allocation in Large Basic Blocks," ACM, 1988, pp. 442-452.
Lam, M.S., "Instruction Scheduling For Superscalar Architectures," Annu. Rev. Comput. Sci., vol. 4, 1990, pp. 173-201.
Aiken, A. and Nicolau, A., "Perfect Pipelining: A New Loop Parallelization Technique*", pp. 221-235.
Jouppi, N.H., "Integration and Packaging Plateaus of Processor Performance," IEEE, 1989, pp. 229-232.
Groves, R.D. and Oehler, R., "An IBM Second Generation RISC Processor Architecture," IEEE, 1989, pp. 134-137.
Smith et al., "Implementation of Precise Interrupts in Pipelined Processors," Proceedings of the 12th Annual International Symposium on Computer Architecture, Jun. 1985, pp. 36-44.
Wedig, R.G., Detection of Concurrency In Directly Executed Language Instruction Streams, (Dissertation), Jun. 1982, pp. 1-179.
Agerwala et al., "High Performance Reduced Instruction Set Processors," IBM Research Division, Mar. 31, 1987, pp. 1-61.
Gross et al., "Optimizing Delayed Branches," Proceedings of the 5th Annual Workshop on Microprogramming, Oct. 5-7, 1982, pp. 114-120.
Tjaden et al., "Representation of Concurrency with Ordering Matrices," IEEE Trans. On Computers, vol. C-22, No. 8, Aug. 1973, pp. 752-761.
Tjaden, Representation and Detection of Concurrency Using Ordering Matrices, (Dissertation), 1972, pp. 1-199.
Foster et al., "Percolation of Code to Enhance Parallel Dispatching and Execution," IEEE Trans. On Computers, Dec. 1971, pp. 1411-1415.
Thornton, J.E., Design of a Computer: The Control Data 6600, Control Data Corporation, 1970, pp. 58-140.
Weiss et al., "Instruction Issue Logic in Pipelined Supercomputers," Reprinted from IEEE Trans. on Computers, vol. C-33, No. 11, Nov. 1984, pp. 1013-1022.
Tomasulo, R.M., "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM Journal, vol. 11, Jan. 1967, pp. 25-33.
Tjaden et al., "Detection and Parallel Execution of Independent Instructions," IEEE Trans. On Computers, vol. C-19, No. 10, Oct. 1970, pp. 889-895.
Smith et al., "Limits on Multiple Instruction Issue," Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 1989, pp. 290-302.
Pleszkun et al., "The Performance Potential of Multiple Functional Unit Processors," Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 37-44.
Pleszkun et al., "WISQ: A Restartable Architecture Using Queues," Proceedings of the 14th International Symposium on Computer Architecture, Jun. 1987, pp. 290-299.
Patt et al., "Critical Issues Regarding HPS, A High Performance Microarchitecture," Proceedings of the 18th Annual Workshop on Microprogramming, Dec. 1985, pp. 109-116.
Hwu et al., "Checkpoint Repair for High-Performance Out-of-Order Execution Machines," IEEE Trans. On Computers, vol. C-36, No. 12, Dec. 1987, pp. 1496-1514.
Patt et al., "HPS, A New Microarchitecture: Rationale and Introduction," Proceedings of the 18th Annual Workshop on Microprogramming, Dec. 1985, pp. 103-108.
Keller, R.M., "Look-Ahead Processors," Computing Surveys, vol. 7, No. 4, Dec. 1975, pp. 177-195.
Jouppi et al., "Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines," Proceedings of the 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, Apr. 1989, pp. 272-282.
Hwu et al., "HPSm, a High Performance Restricted Data Flow Architecture Having Minimal Functionality," Proceedings from ISCA-13, Tokyo, Japan, Jun. 2-5, 1986, pp. 297-306.
Hwu et al., "Exploiting Parallel Microprocessor Microarchitectures with a Compiler Code Generator," Proceedings of the 15th Annual Symposium on Computer Architecture, Jun. 1988, pp. 45-53.
Colwell et al., "A VLIW Architecture for a Trace Scheduling Compiler," Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, Oct. 1987, pp. 180-192.
Uht, A.K., "An Efficient Hardware Algorithm to Extract Concurrency From General-Purpose Code," Proceedings of the 19th Annual Hawaii International Conference on System Sciences, 1986, pp. 41-50.
Charlesworth, A.E., "An Approach to Scientific Array Processing: The Architectural Design of the AP-102B/FPS-164 Family," Computer, vol. 14, Sep. 1981, pp. 18-27.
Acosta, Raymond D. et al., "An Instruction Is
Garg Sanjiv
Hagiwara Yasuaki
Lau Te-Li
Lentz Derek J.
Miyayama Yoshiyuki
Pan Daniel H.
Seiko Epson Corporation