Reexamination Certificate
2001-03-08
2003-10-07
Nguyen, T. V. (Department: 2187)
Electrical computers and digital processing systems: memory
Storage accessing and control
Specific memory composition
C711S101000, C711S105000, C711S170000, C711S171000, C711S172000
active
06631439
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to a novel VLIW computer processing architecture, and more particularly to a processor having a scalable multi-pipeline processing core and memory fabricated on the same integrated circuit.
Computer architecture designers are constantly trying to increase the speed and efficiency of computer processors. However, conventional “state-of-the-art” CPU designs are predicated on the fact that there is a huge latency inherent in the accompanying memory systems, coupled with limited bandwidth communications between the memory systems and the CPU core. These inherent problems with current processor and memory latencies have led to computer architecture designs with many and large cache layers and highly complex designs—each additional fraction of design complexity obtaining only a small improvement in performance (i.e., diminishing returns).
For example, computer architecture designers have attempted to increase processing speeds by increasing clock speeds and applying latency-hiding techniques, such as data pre-fetching and cache memories. In addition, other techniques, such as instruction-level parallelism using very long instruction word (VLIW) designs and embedded DRAM, have been attempted.
Combining memory (i.e., DRAM) and logic on the same chip appears to be an excellent way to improve internal memory bandwidth and reduce memory access latencies at a low cost. However, DRAM circuits tend to be sensitive to temperature and thermal gradients across the silicon die. Conventional RISC and CISC CPUs, because they must be clocked at high speeds to attain adequate performance, are necessarily energy inefficient and tend to produce a large amount of heat, which ultimately affects the performance of any DRAM residing on the same chip. Thus, architectures which attain their performance through instruction-level parallelism, instead of maximizing clock speeds, tend to be better suited for use with on-chip DRAM because they can exploit the large communication bandwidth between the processor and memory while operating at lower clock speeds and lower supply voltages. Examples of architectures utilizing instruction-level parallelism include single instruction multiple data (SIMD), vector or array processing, and very long instruction word (VLIW). Of these, VLIW appears to be the most suitable for general purpose computing.
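As a first-order illustration of the clock-speed and supply-voltage argument above (this relation is standard CMOS background, not part of the original disclosure), dynamic power is commonly approximated as proportional to the switched capacitance, the square of the supply voltage, and the clock frequency. The scaling factors used below (70% of the original voltage, half the original clock) are assumed values chosen only for illustration.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{dd}^{2} \, f
\qquad
\frac{P'_{\text{dyn}}}{P_{\text{dyn}}}
  = \left(\frac{V'_{dd}}{V_{dd}}\right)^{2} \frac{f'}{f}
  = (0.7)^{2} \times 0.5
  = 0.245
```

Under these assumed values, the logic would dissipate roughly a quarter of its original dynamic power, which suggests why a design that recovers throughput through parallelism rather than clock rate may be a better thermal neighbor for on-chip DRAM.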
Certain VLIW computer architecture designs are currently known in the art. However, while processing multiple instructions simultaneously may help increase processor performance, it is difficult to process a large number of instructions in parallel because of dependencies between instructions. In addition, most VLIW processors require extremely complex logic to implement the VLIW design, which also slows the performance of VLIW processors. In fact, with VLIW designs which do not take advantage of the memory efficiencies of on-chip DRAM, the average number of instructions per clock (IPC) can drop well below 1 when factors such as branch misprediction, cache misses, and instruction fetch restrictions are factored in. Thus, what is needed is a novel, high performance computer processing architecture to overcome the shortcomings of the prior art.
SUMMARY OF THE INVENTION
One embodiment of the present invention comprises a processor chip including a processing core, at least one bank of DRAM memory, an I/O link configured to communicate with other like processor chips or compatible I/O devices, and a communication and memory controller in electrical communication with the processing core, the at least one bank of DRAM memory, and the I/O link. The communication and memory controller is configured to control the exchange of data between the processor chip and the other processor chips or I/O devices. The communication and memory controller also is configured to receive memory requests from the processing core, and from the other processor chips via the I/O link, and process the memory requests with the at least one bank of DRAM memory.
In accordance with another embodiment of the present invention, the communication and memory controller comprises a memory controller in electrical communication with the processing core and the at least one bank of DRAM memory, and a distributed shared memory controller in electrical communication with the memory controller and the I/O link. The distributed shared memory controller is configured to control the exchange of data between the processor chip and the other processor chips or I/O devices. In addition, the memory controller is configured to receive memory requests from the processing core and the distributed shared memory controller, and process the memory requests with the at least one bank of DRAM memory.
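The division of labor described in this embodiment can be sketched in software as follows. This is a minimal illustrative model, not the disclosed hardware: the class names `MemRequest`, `MemoryController`, and `DSMController`, the FIFO service order, the bank count, and the bank size are all assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MemRequest:
    source: str          # "core" or "dsm" (hypothetical labels)
    address: int         # word address into on-chip DRAM
    write: bool = False
    value: int = 0


class MemoryController:
    """Services requests from the core and the DSM controller against DRAM banks."""

    def __init__(self, num_banks=4, bank_words=1024):  # sizes are assumed
        self.bank_words = bank_words
        self.banks = [[0] * bank_words for _ in range(num_banks)]
        self.queue = deque()

    def submit(self, req):
        self.queue.append(req)

    def step(self):
        """Process one queued request in FIFO order; return (source, data) or None."""
        if not self.queue:
            return None
        req = self.queue.popleft()
        bank, offset = divmod(req.address, self.bank_words)
        if req.write:
            self.banks[bank][offset] = req.value
        return (req.source, self.banks[bank][offset])


class DSMController:
    """Forwards requests arriving over the I/O link to the memory controller."""

    def __init__(self, memory_controller):
        self.mc = memory_controller

    def receive_from_link(self, address, write=False, value=0):
        self.mc.submit(MemRequest("dsm", address, write, value))


mc = MemoryController()
dsm = DSMController(mc)
mc.submit(MemRequest("core", 1030, write=True, value=7))  # local write
dsm.receive_from_link(1030)                               # remote read of same word
first = mc.step()
second = mc.step()
```

In this sketch both request sources share a single queue, so a remote chip reading through the I/O link observes the value the local core wrote. A real controller would likely arbitrate per bank rather than use one FIFO.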
In accordance with yet another embodiment of the present invention, the processor chip may further comprise an external memory interface in electrical communication with the communication and memory controller. In accordance with this aspect of the present invention, the external memory interface is configured to connect the processor chip in electrical communication with external memory. The communication and memory controller is configured to receive memory requests from the processing core and from the other processing chips via the I/O link, determine whether the memory requests are directed to the at least one bank of DRAM memory on the processor chip or the external memory, and process the memory requests with the at least one bank of DRAM memory on the processor chip or with the external memory through the external memory interface.
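The step of determining whether a request targets on-chip DRAM or external memory can be illustrated with a simple address decode. This is a hedged sketch only: the text does not say how the determination is made, so the flat address map, the `ON_CHIP_WORDS` boundary, and the function name `route_request` are hypothetical.

```python
ON_CHIP_WORDS = 4096  # assumed size of the on-chip DRAM region, in words


def route_request(address, on_chip_words=ON_CHIP_WORDS):
    """Return (target, local_offset) for a word address.

    Addresses below the boundary map to on-chip DRAM; addresses at or above
    it map to external memory, re-based so the external region starts at 0.
    """
    if address < 0:
        raise ValueError("address must be non-negative")
    if address < on_chip_words:
        return ("on_chip", address)
    return ("external", address - on_chip_words)


low = route_request(0)
boundary = route_request(4096)
high = route_request(5000)
```

A range comparison is only one possible policy. An implementation could equally use dedicated address bits or a lookup table to select the target.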
A more complete understanding of the present invention may be derived by referring to the detailed description of preferred embodiments and claims when considered in connection with the figures.
Nettleton Nyles
Parkin Michael
Saulsbury Ashley
Nguyen T. V.
Sun Microsystems Inc.
Townsend and Townsend / and Crew LLP