Selecting a cache design for a computer system using a model...

Data processing: measuring – calibrating – or testing – Measurement system – Performance or efficiency evaluation

Reexamination Certificate

Details

C712S241000, C714S006130, C717S152000

Status: active

Patent number: 06542855

BACKGROUND OF THE INVENTION
The present invention relates to computers and, more particularly, to a method for selecting a cache design for a computer system. A major objective of the invention is to provide a method for quantitatively estimating the performance of alternative cache designs for incorporation in a given computer system.
Much of modern progress is associated with the proliferation of computers. While much attention is focused on general-purpose computers, application-specific computers are even more prevalent. (Application-specific computers typically incorporate one or more custom-designed integrated circuits, referred to as “application-specific integrated circuits” or “ASICs.”) Such application-specific computers can be found in new device categories, such as video games, and in advanced versions of old device categories, such as televisions.
A typical computer includes a processor and main memory. The processor executes program instructions, many of which involve the processing of data. Instructions are read from main memory, and data is read from and written to main memory. Advancing technology has provided faster processors and faster memories. As fast as memories have become, they remain a computational bottleneck; processors often have to idle while requests are filled from main memory.
Caches are often employed to reduce this idle time. Caches intercept requests to main memory and attempt to fulfill those requests using memory dedicated to the cache. To be effective, caches must be able to respond much faster than main memory; to achieve the required speed, caches tend to have far less capacity than main memory. Due to their smaller capacity, caches can normally hold only a fraction of the data and instructions stored in main memory. An effective cache must therefore employ a strategy ensuring that a request is much more likely to target main-memory locations stored in the cache than locations that are not.
There are many types of computer systems that use caches. A single pedagogical example is presented at this point to illustrate some of the issues regarding selection of a cache design. The application is a “set-top” box designed to process digital television signals in accordance with inputs received from the signal itself, from panel controls, and from remote controls over a digital infrared link. The set-top box includes a 100 MHz 32-bit processor. This processor accesses instructions and data in 32-bit words. These words are arranged in 2^20 addressable 32-bit word locations of main memory. Program instructions are loaded into main memory from flash memory automatically when power is turned on. The processor asserts 30-bit word addresses; obviously, only a small fraction of these correspond to physical main-memory locations.
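The stated figures can be checked with a little arithmetic (a sketch; the constant names are mine, not the patent's):

```python
WORD_BITS = 32
MAIN_MEMORY_WORDS = 2 ** 20      # addressable 32-bit word locations in main memory
ADDRESS_SPACE_WORDS = 2 ** 30    # the processor asserts 30-bit word addresses

main_memory_bytes = MAIN_MEMORY_WORDS * (WORD_BITS // 8)
print(main_memory_bytes)                           # 4194304 bytes (4 MiB)
print(ADDRESS_SPACE_WORDS // MAIN_MEMORY_WORDS)    # 1024: only 1/1024 of the
                                                   # word-address space is populated
```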
A single cache design can involve one or more caches. There are level-1 and level-2 caches. In a Harvard architecture, there can be separate caches for data and for instructions. In addition, there can be a write buffer, which is typically a cache used to speed up write operations, especially in a write-through mode. Also, the memory management units for many systems can include a translation look-aside buffer (TLB), which is typically a fully associative cache.
In the pedagogical example, the cache is an integrated data/instruction cache with an associated write buffer. The main cache is a 4-way set-associative cache with 2^10 addressable 32-bit word locations. These are arranged in four sets. Each set has 2^6 line locations, each with a respective 6-bit index. Each line location includes four word locations.
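This geometry is self-consistent, as a quick sketch confirms (terminology follows the patent, where “set” names what is often called a “way”):

```python
SETS = 4                  # the four sets of the 4-way set-associative cache
LINES_PER_SET = 2 ** 6    # one line location per 6-bit index
WORDS_PER_LINE = 4        # each line location holds four 32-bit words

total_word_locations = SETS * LINES_PER_SET * WORDS_PER_LINE
print(total_word_locations)    # 1024, i.e. the stated 2**10 word locations
```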
When the processor requests a read from a main-memory address, the cache checks its own memory to determine if there is a copy of that main memory location in the cache. If the address is not represented in the cache, a cache “miss” occurs. In the event of a miss, the cache fetches the requested contents from main memory. However, it is not just the requested word that is fetched, but an entire four-word line (having a line address constituted by the most significant 28 bits of the word address).
This fetched line is stored in a line location of the cache. The line must be stored at a cache line location having an index that matches the six least significant bits of the address of the fetched line. There is exactly one such location in each of the four cache sets; thus, there are four possible storage locations for the fetched line. A location without valid contents is preferred for storing the fetched line over a location with valid data. A location with less recently used contents is preferred to one with more recently used data. In the event of ties, the sets are assigned an implicit order so that the set with the lowest implicit order is selected for storing the fetched line.
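The address arithmetic and the storage-location preference described above can be sketched as follows. The helper names are mine, not the patent's: the low 2 bits of a 30-bit word address select the word within a 4-word line, the remaining 28 bits form the line address, whose low 6 bits are the index and whose high 22 bits serve as a tag distinguishing lines that share an index.

```python
def split_address(word_addr):
    """Decompose a 30-bit word address for the example cache."""
    word_in_line = word_addr & 0b11     # 2 bits: word within the 4-word line
    line_addr = word_addr >> 2          # 28-bit line address
    index = line_addr & 0b111111        # 6 least significant bits of line address
    tag = line_addr >> 6                # remaining 22 bits identify the line
    return tag, index, word_in_line

def choose_set(candidates):
    """Pick a storage location per the stated preference: an invalid location
    first, then least recently used, with ties broken by implicit set order.
    `candidates` is a list of (set_number, valid, last_used_time) tuples,
    one per set, listed in the implicit order."""
    invalid = [c for c in candidates if not c[1]]
    if invalid:
        return invalid[0][0]
    return min(candidates, key=lambda c: (c[2], c[0]))[0]

print(split_address(0x12345678))   # (0x123456, 30, 0)
```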
The cache includes a write buffer that pipelines write operations to speed them up in write-through mode. In write-through mode, processor writes are written directly to main memory. The write buffer is one word (32 bits) wide and four words deep. Thus, the processor can issue four write requests and then attend to other tasks while the cache fulfills the requests in the background.
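The behavior of such a buffer can be sketched as a simple FIFO (class and method names are illustrative, not from the patent):

```python
from collections import deque

class WriteBuffer:
    """Sketch of the one-word-wide, four-deep write buffer."""
    DEPTH = 4

    def __init__(self):
        self.pending = deque()

    def post(self, addr, word):
        """Processor side: returns False (the processor must stall) when full."""
        if len(self.pending) == self.DEPTH:
            return False
        self.pending.append((addr, word))
        return True

    def drain_one(self, main_memory):
        """Background side: retire the oldest posted write to main memory."""
        if self.pending:
            addr, word = self.pending.popleft()
            main_memory[addr] = word

buf = WriteBuffer()
mem = {}
for i in range(4):
    assert buf.post(i, i * 10)     # four posted writes succeed
assert not buf.post(4, 40)         # a fifth would stall the processor
while buf.pending:
    buf.drain_one(mem)
print(mem)                         # {0: 0, 1: 10, 2: 20, 3: 30}
```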
The question then arises: “Is this cache design optimal for the incorporating system?” Would a larger cache provide a big enough performance advantage to justify the additional cost (financial, speed, complexity, chip space, etc.)? Would a smaller cache provide almost the same performance at a significantly lower cost? Would the cache be more effective if arranged as a two-way set-associative cache, or possibly as an eight-way set-associative cache? Should the line length be increased to eight words or even to sixteen words? Should the write buffer be shallower or deeper? Should the write buffer have a different width? (Probably not in this case; but write buffer width is an issue in systems where the processor asserts requests with different widths.)
In the event of a read miss, there are alternative policies for determining which set is to store a fetched line. Also, there are strategies that involve fetching lines even when there is no miss, because a request for an address not represented in the cache is anticipated. In the event of a write hit, should the data written to cache be written immediately back to main memory, or should the write-back wait until the corresponding cache location is about to be overwritten? In the event of a write miss, should the data just be written to main memory and the cache left unchanged, or should the location written to in main memory be fetched so that it is now represented in the cache?
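These write-policy alternatives can be made concrete with a small dispatch sketch. This is a toy model with invented names, not the patent's method; cache lines are tracked only as dictionaries keyed by line address.

```python
def write(cache, memory, addr, value, *, write_through, write_allocate):
    """Apply one processor write under the policy choices discussed above."""
    line = addr >> 2                        # 4-word lines, as in the example
    if line in cache:                       # write hit
        cache[line]["words"][addr] = value
        if write_through:
            memory[addr] = value            # propagate to main memory at once
        else:
            cache[line]["dirty"] = True     # write-back: defer until eviction
    else:                                   # write miss
        memory[addr] = value
        if write_allocate:                  # fetch the line so it is now cached
            cache[line] = {"words": {addr: value}, "dirty": False}

cache, memory = {}, {}
write(cache, memory, 8, 99, write_through=True, write_allocate=False)
print(8 in memory, (8 >> 2) in cache)      # True False: memory updated, line not cached
```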
The rewards for cache optimization can be significant. Cache optimization, especially in application-specific computers where one program is run repeatedly, can result in significant performance enhancements. Achieving such performance enhancements by optimizing cache design, as opposed to increasing processor speeds, can be very cost effective. Increased processor speeds can require higher-cost processors, increased power requirements, and increased problems with heat dissipation. In contrast, some cache optimizations, such as those involving rearranging a fixed cache memory size, are virtually cost free (on a post-setup, per-unit basis).
The challenge is to find a method of optimizing a cache design that is both effective and cost-effective. While a selection can be made as an “educated guess”, there is little assurance that the selected design is actually optimal. In competitive applications, some sort of quantitative comparative evaluation of alternative cache designs is called for.
In a multiple-prototype approach, multiple prototype systems with different cache designs are built and their performances are compared under test conditions that are essentially the same as the intended operating conditions. This multiple-prototype approach provides a very accurate comparative evaluation of the tested alternatives.
