Electrical computers and digital processing systems: memory – Storage accessing and control – Access timing
Reexamination Certificate
2000-12-22
2004-02-24
Gossage, Glenn (Department: 2187)
Electrical computers and digital processing systems: memory
Storage accessing and control
Access timing
C711S122000, C711S141000, C711S154000, C709S248000, C713S400000
Reexamination Certificate
active
06697925
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to data processing systems employing multiple instruction processors and more particularly relates to multiprocessor data processing systems employing multiple dayclocks in conjunction with multiple levels of cache memory.
2. Description of the Prior Art
It is known in the art that the use of multiple instruction processors operating out of common memory can produce problems associated with the processing of obsolete memory data by a first processor after that memory data has been updated by a second processor. The first attempts at solving this problem tended to use logic to lock processors out of memory spaces being updated. Though this is appropriate for rudimentary applications, as systems become more complex, the additional hardware and/or operating time required for the setting and releasing of locks can not be justified, except for security purposes. Furthermore, reliance on such locks directly prohibits certain types of applications such as parallel processing.
The use of hierarchical memory systems tends to further compound the problem of data obsolescence. U.S. Pat. No. 4,056,844 issued to Izumi shows a rather early approach to a solution. The system of Izumi utilizes a buffer memory dedicated to each of the processors in the system. Each processor accesses a buffer address array to determine if a particular data element is present in its buffer memory. An additional bit is added to the buffer address array to indicate invalidity of the corresponding data stored in the buffer memory. A set invalidity bit indicates that the main storage has been altered at that location since loading of the buffer memory. The validity bits are set in accordance with the memory store cycle of each processor.
U.S. Pat. No. 4,349,871 issued to Lary describes a bussed architecture having multiple processing elements, each having a dedicated cache memory. According to the Lary design, each processing unit manages its own cache by monitoring the memory bus. Any invalidation of locally stored data is tagged to prevent use of obsolete data. The overhead associated with this approach is partially mitigated by the use of special purpose hardware and through interleaving the validity determination with memory accesses within the pipeline. interleaving of invalidity determination is also employed in U.S. Pat. No. 4,525,777 issued to Webster et al.
Similar bussed approaches are shown in U.S. Pat. No. 4,843,542 issued to Dashiell et al, and in U.S. Pat. No. 4,755,930 issued to Wilson, Jr. et al. In employing each of these techniques, the individual processor has primary responsibility for monitoring the memory bus to maintain currency of its own cache data. U.S. Pat. No. 4,860,192 issued to Sachs et al, also employs a bussed architecture but partitions the local cache memory into instruction and operand modules.
U.S. Pat. No. 5,025,365 issued to Mathur et al, provides a much enhanced architecture for the basic bussed approach. In Mathur et al, as with the other bussed systems, each processing element has a dedicated cache resource. Similarly, the cache resource is responsible for monitoring the system bus for any collateral memory accesses which would invalidate local data. Mathur et al, provide a special snooping protocol which improves system throughput by updating local directories at times not necessarily coincident with cache accesses. Coherency is assured by the timing and protocol of the bus in conjunction with timing of the operation of the processing element.
An approach to the design of an integrated cache chip is shown in U.S. Pat. No. 5,025,366 issued to Baror. This device provides the cache memory and the control circuitry in a single package. The technique lends itself primarily to bussed architectures. U.S. Pat. No. 4,794,521 issued to Ziegler et al, shows a similar approach on a larger scale. The Ziegler et al, design permits an individual cache to interleave requests from multiple processors. This design resolves the data obsolescence issue by not dedicating cache memory to individual processors. Unfortunately, this provides a performance penalty in many applications because it tends to produce queuing of requests at a given cache module.
The use of a hierarchical memory system in a multiprocessor environment is also shown in U.S. Pat. No. 4,442,487 issued to Fletcher et al. In this approach, each processor has dedicated and shared caches at both the L1 or level closest to the processor and at the L2 or intermediate level. Memory is managed by permitting more than one processor to operate upon a single data block only when that data block is placed in shared cache. Data blocks in dedicated or private cache are essentially locked out until placed within a shared memory element. System level memory management is accomplished by a storage control element through which all requests to shared main memory (i.e. L3 level) are routed. An apparent improvement to this approach is shown in U.S. Pat. No. 4,807,110 issued to Pomerene et al. This improvement provides prefetching of data through the use of a shadow directory.
A further improvement to Fletcher et al, is seen in U.S. Pat. No. 5,023,776 issued to Gregor. In this system, performance can be enhanced through the use of store around L1 caches used along with special write buffers at the L2 intermediate level. This approach appears to require substantial additional hardware and entails yet more functions for the system storage controller.
Inherent in architectures which employ cache memory, is that the storage capacity is substantially less than the memory located at lower levels in the hierarchy. As a result, memory locations within the cache memory must often be cleared for use by other data quantities more recently needed by the instruction processor. For store-in cache memories, this means that those quantities modified by the instruction processor must first be rewritten to system memory before the corresponding location is available to store newly requested data. This “flushing” process tends to delay the availability of the newly requested data.
A problem very similar to the maintenance of cache memory coherency is the synchronization of dayclock information. In real time systems, an accurate dayclock is required to enable time tagging of data and events. Such time keeping is also necessary to provide certain priority determinations when these rest, at least in part, on relative timing. The use of a single clock by multiple processors tends to be slow and cumbersome. However, providing multiple dayclocks necessitates synchronization to ensure consistency to time tagging amongst processor.
SUMMARY OF THE INVENTION
The present invention overcomes the problems found in the prior art by providing a method of and apparatus for synchronizing multiple dayclocks within a multiple processor system having hierarchical memory. This synchronization process is basically provided by the cache memory coherency maintenance facilities.
The preferred mode of the present invention includes up to four main memory storage units. Each is coupled directly to each of up to four “pod”s. Each pod contains a level three cache memory coupled to each of the main memory storage units. Each pod may also accommodate up to two input/output modules.
Each pod may contain up to two sub-pods, wherein each sub-pod may contain up to two instruction processors. Each instruction processor has two separate level one cache memories (one for instructions and one for operands) coupled through a dedicated system controller, having a second level cache memory, to the level three cache memory of the pod.
Each instruction processor has a dedicated system controller associated therewith. A separate dayclock is located within each system controller.
Unlike many prior art systems, both level one and level two cache memories are dedicated to an instruction processor within the preferred mode of the present invention. The level one cache memories are of two types. Each instruction processor
Boone Lewis A.
Federici James L.
Malek Robert M.
Vartti Kelvin S.
Gossage Glenn
Johnson Charles A.
Nawrocki, Rooney & Sivertson P.A.
Peugh Brian R.
Starr Mark T.
LandOfFree
Use of a cache ownership mechanism to synchronize multiple... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Use of a cache ownership mechanism to synchronize multiple..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Use of a cache ownership mechanism to synchronize multiple... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3314477