Method and apparatus for resolving probes in multi-processor...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

Details

C711S146000, C711S167000, C711S213000

active

06295583

ABSTRACT:

TECHNICAL FIELD
The present invention relates generally to computer processor technology, and more particularly to memory subsystems for a multiprocessor system.
BACKGROUND ART
One popular multiprocessor computer architecture is formed by coupling one or more processors to a shared main memory storing data, with each processor typically having a local cache to store its own private copy of a subset of the data from the main memory.
In the above architecture, a separate memory control chip connecting the processors to the main memory manages the operations necessary to access memory from any one of the processor caches and the main memory. It is typically the responsibility of the memory control chip to maintain a coherent view of the memory by checking an address reference generated by a processor. To perform this function, the memory control chip issues a probe reference to the other processor caches to see if a copy of the data exists in any of these other caches.
Each processor of the multiprocessor system must be able to service probe references to its cache as well as its own internally generated references to the cache. From the processor's point of view, these probe references consume cache bandwidth which could have been used for the processor's internal references. This loss of bandwidth may degrade the performance of the system.
In the prior art, one solution to minimize the impact of this degradation has been to maintain an external duplicate copy of the tags of the processor cache. This way, a probe can first be checked against the duplicate tags to determine whether it results in a hit or a miss. Only if the probe results in a cache hit is it sent to the data memory portion of the cache to access the data. Since probes typically result in cache misses, the external tags improve the performance of the system.
However, a multiprocessor system with duplicate external tags has some disadvantages. The system must provide the external tags for each processor along with the associated additional logic. In addition, since the external tags must maintain coherence with the processor's cache, logic must be provided which updates the state of the external tags to reflect any changes to the cache. This additional computation and bandwidth requirement leads to degradation in system performance.
Therefore, a technique is desired which resolves probe references in multiprocessor systems without using external duplicate tags.
SUMMARY DISCLOSURE OF THE INVENTION
The present invention overcomes the foregoing and other problems with a computing apparatus and method for resolving probes in a multiprocessor system without using external duplicate tags for probe filtering.
The computing apparatus of the present invention includes a clock, a cache, an input stream, a selector, and a multiplexer. The cache includes a tag structure and a data structure which both produce data in response to a probe. Preferably, the tag structure is implemented with static random access memory and the data structure is implemented with static random access memory capable of transferring data in a burst mode.
The tag structure in response to the probe transfers tag information in a clock cycle. The tag information includes information on whether the probe resulted in a cache hit or a cache miss. The data structure in response to the probe transfers data during multiple clock cycles of the clock.
An input stream accepts probes directed to the cache. The selector then designates each one of the plurality of probes in the input stream to be either a full probe or a tag-only probe. The multiplexer then accesses the data structure with one of the probes designated a full probe to transfer data during the multiple clock cycles, and the multiplexer further accesses the tag structure with one or more of the probes designated tag-only probes during the multiple clock cycles. Each one of the designated tag-only probes accesses the tag structure to transfer tag information during a respective one of the multiple clock cycles.
In another aspect of the present invention, a processor is configured to transmit a full probe to the cache to transfer data from the stored data of the cache. The data corresponding to the full probe is transferable during a time period, which as discussed above could be multiple clock cycles. A tag-only probe is also transmitted to the cache during the same time period to determine if the data corresponding to the tag-only probe is part of the data stored in the cache.
In a further aspect of the present invention, a probe from the input stream accesses the cache in two stages. In the first stage, the selector designates the probe to be a tag-only probe and the multiplexer accesses the tag structure with the probe. If the probe returns tag information indicating a cache hit, the selector, in the second stage, designates the probe to be a full probe. The multiplexer then accesses the data structure with the probe. If the probe returns tag information indicating a cache miss the probe does not proceed to the second stage.
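The two-stage access described above can be illustrated with a minimal Python sketch. It is only a software model under stated assumptions: the `ToyCache` class, its access counters, and the `resolve_probe` function are hypothetical names introduced for illustration, and the model ignores clock timing entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Hypothetical cache model: `lines` maps an address to its cached data."""
    lines: dict = field(default_factory=dict)
    tag_accesses: int = 0
    data_accesses: int = 0

    def tag_lookup(self, address):
        """Stage 1: access only the tag structure and report hit or miss."""
        self.tag_accesses += 1
        return address in self.lines

    def data_read(self, address):
        """Stage 2: access the data structure (the multi-cycle transfer)."""
        self.data_accesses += 1
        return self.lines[address]


def resolve_probe(cache, address):
    """Run a probe as tag-only first; promote it to a full probe only on a hit."""
    if not cache.tag_lookup(address):
        return None  # miss: the probe never proceeds to the second stage
    return cache.data_read(address)


cache = ToyCache(lines={0x40: "block A"})
hit_result = resolve_probe(cache, 0x40)   # hit: both stages run
miss_result = resolve_probe(cache, 0x80)  # miss: stage 1 only
```

In this model a miss costs one tag access and no data access, which is the point of filtering probes through the tag structure first.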
Another aspect of the present invention includes a probe queue for storing probes. The selector designates the probe from the input stream in two stages. In the first stage the selector designates the probe from the input stream to be a tag-only probe. If the probe in response to an access to the tag structure returns tag information indicating a cache hit, the probe is put on a probe queue. In the second stage the selector further designates a probe from the probe queue to be a full probe so that the multiplexer accesses the data structure with the probe.
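A minimal Python sketch of the probe-queue variant follows. The function names `filter_probes` and `drain_queue` are hypothetical, and first-in first-out ordering of the queue is an assumption; the text above does not specify the order in which queued probes are reissued.

```python
from collections import deque


def filter_probes(input_stream, cached_addresses):
    """Stage 1: treat every incoming probe as tag-only; queue only the hits."""
    probe_queue = deque()
    for address in input_stream:
        if address in cached_addresses:  # tag lookup reports a hit
            probe_queue.append(address)
    return probe_queue


def drain_queue(probe_queue, cache_data):
    """Stage 2: reissue each queued probe as a full probe against the data."""
    results = []
    while probe_queue:
        address = probe_queue.popleft()  # FIFO order is an assumption
        results.append((address, cache_data[address]))
    return results


cache_data = {0x10: "block X", 0x30: "block Y"}
queue = filter_probes([0x10, 0x20, 0x30, 0x40], cache_data.keys())
queued = list(queue)                     # snapshot before draining
results = drain_queue(queue, cache_data)
```

Only the two probes that hit in the tag lookup reach the data structure; the two misses are resolved in stage 1 and never enter the queue.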
Preferably, the cache in response to the full probe transfers tag information from the tag structure during the first clock cycle of the multiple cycles and transfers the data from the data structure during the multiple cycles.
Advantageously, the selector designates one probe in the input stream to be a full probe and three probes in the input stream to be tag-only probes. The multiplexer is configured to access the data structure corresponding to a full probe to transfer data during four clock cycles. The multiplexer is further configured to access the tag structure in each clock cycle of the multiple clock cycles. In this regard, a respective one of the tag-only probes is used to access the tag structure during three of the four clock cycles.
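The interleaving of one full probe with three tag-only probes over a four-cycle data transfer can be sketched as a per-cycle schedule. This is a minimal Python illustration, assuming the full probe uses the tag structure in the first cycle and the tag-only probes take the remaining cycles in arrival order; `schedule_window` and the slot dictionary layout are hypothetical names.

```python
def schedule_window(full_probe, tag_only_probes, burst_cycles=4):
    """Return, for each clock cycle, which probe uses the tag and data structures."""
    if len(tag_only_probes) > burst_cycles - 1:
        raise ValueError("more tag-only probes than free tag cycles")
    slots = []
    for cycle in range(burst_cycles):
        if cycle == 0:
            tag_user = full_probe                  # full probe reads its tag first
        elif cycle - 1 < len(tag_only_probes):
            tag_user = tag_only_probes[cycle - 1]  # one tag-only probe per cycle
        else:
            tag_user = None                        # tag structure idle this cycle
        # The data structure is busy with the full probe for the whole burst.
        slots.append({"cycle": cycle, "tag": tag_user, "data": full_probe})
    return slots


slots = schedule_window("F", ["T1", "T2", "T3"])
tag_users = [slot["tag"] for slot in slots]
```

With these inputs the tag structure is used in every one of the four cycles while the data structure completes a single burst for the full probe.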
A tag bus may be provided to receive a tag stream of tag information from the tag structure in response to the corresponding plurality of probes received from the input stream. A probe history counter takes values 0 through 3. The counter is set to 3 upon detecting a cache hit from the tag stream and is decremented by 1 upon detecting a miss; if a miss is detected while the counter is 0, it remains at 0. The selector then designates a probe to be a tag-only probe if the probe history counter is 0 and a full probe if the counter is other than 0.
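The probe history counter behaves like a small saturating counter, which the following Python sketch models. The class and method names are hypothetical, and the initial value of 0 is an assumption, since the text does not state how the counter starts.

```python
class ProbeHistoryCounter:
    """Counter with values 0 through 3 that steers full vs. tag-only designation."""
    MAX_VALUE = 3

    def __init__(self):
        self.value = 0  # assumed starting value; not given in the text

    def record(self, hit):
        """Update the counter from one tag-stream result."""
        if hit:
            self.value = self.MAX_VALUE          # a hit sets the counter to 3
        else:
            self.value = max(0, self.value - 1)  # a miss decrements, floor at 0

    def designate(self):
        """Designation the selector would give the next probe."""
        return "tag-only" if self.value == 0 else "full"


counter = ProbeHistoryCounter()
designations = []
for hit in [False, True, False, False, False, False]:
    designations.append(counter.designate())  # chosen before this probe's result
    counter.record(hit)
```

After a single hit, the next three probes are designated full probes even if each of them misses; only after three consecutive misses does the selector return to tag-only probes.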
In accordance with other aspects of the present invention, a type unit is configured to determine a probe type for one of the probes in the input stream. The probe type determination may be based on characteristics of the probe. The selector is configured to determine whether to designate the probe as either a full probe or a tag-only probe based on the probe type determination.
The type unit may, if desired, be configured to determine if a probe type for one of the probes in the input stream is an I/O DMA probe. If so, the selector designates an I/O DMA probe to be a full probe.
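A type-based designation of this kind might be sketched as below. The string label `"io_dma"`, the function name, and the `default` argument are all hypothetical; the text only states that an I/O DMA probe is designated a full probe, so the designation of other probe types is left to the caller here.

```python
def designate_by_type(probe_type, default="tag-only"):
    """Designate I/O DMA probes as full probes; defer other types to `default`."""
    if probe_type == "io_dma":  # hypothetical label for an I/O DMA probe
        return "full"
    return default              # assumption: other types use a caller-chosen policy


dma_designation = designate_by_type("io_dma")
other_designation = designate_by_type("processor_read")
```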
A further feature includes accessing a cache with a full probe to transfer first data corresponding to the full probe from the cache during a time period and accessing the same cache with a tag-only probe during the same time period to determine during that time period if data corresponding to the tag-only probe is stored in the cache.
A multiprocessor system, in accor
