Computer architecture for shared memory access

Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C711S117000, C711S141000, C711S143000

Reexamination Certificate

active

06636950

ABSTRACT:

BACKGROUND
This invention relates to a computer architecture that includes a shared memory system.
Many current computer systems make use of hierarchical memory systems to improve memory access from one or more processors. In a common type of multiprocessor system, the processors are coupled to a hierarchical memory system made up of a shared memory system and a number of memory caches, each coupled between one of the processors and the shared memory system. The processors execute instructions, including memory access instructions such as “load” and “store,” such that from the point of view of each processor, a single shared address space is directly accessible to each processor, and changes made to the value stored at a particular address by one processor are “visible” to the other processor. Various techniques, generally referred to as cache coherency protocols, are used to maintain this type of shared behavior. For instance, if one processor updates a value for a particular address in its cache, caches associated with other processors that also have copies of that address are actively notified by the shared memory system and the notified caches remove or invalidate that address in their storage, thereby preventing the other processors from using out-of-date values. The shared memory system keeps a directory that identifies which caches have copies of each address and uses this directory to notify the appropriate caches of an update. In another approach, the caches share a common communication channel (e.g., a memory bus) over which they communicate with the shared memory system. When one cache updates the shared memory system, the other caches “snoop” on the common channel to determine whether they should invalidate any of their cached values.
In order to guarantee a desired ordering of updates to the shared memory system and thereby permit synchronization of programs executing on different processors, many processors use instructions, generally known as “fence” instructions, to delay execution of certain memory access instructions until other previous memory access instructions have completed. The PowerPC “Sync” instruction and the Sun SPARC “Membar” instruction are examples of fence instructions in current processors. These fences are very “course grain” in that they require all previous memory access instructions (or a class of all loads or all stores) to complete before a subsequent memory instruction is issued.
Many processor instruction sets also include a “prefetch” instruction that is used to reduce the latency of Load instructions that would have required a memory transfer between the shared memory system and a cache. The prefetch instruction initiates a transfer of data from the shared memory system to the processor's cache but the transfer does not have to complete before the instruction itself completes. A subsequent Load instruction then accesses the prefetched data, unless the data has been invalidated in the interim by another processor or the data have not yet been provided to the cache.
SUMMARY
As the number of processors grows in a multiple processor system, the resources required by current coherency protocols grow as well. For example, the bandwidth of a shared communication channel used for snooping must accommodate updates from all the processors. In approaches in which a shared memory system actively notifies caches of memory updates, the directory or other data structure used to determine which caches must be notified also must grow, as must the communication resources needed to carry the notifications. Furthermore, in part to maintain high performance, coherency protocols have become very complex. This complexity has made validation of the protocols difficult and design of compilers which generate code for execution in conjunction with these memory systems complicated.
In a general aspect, the invention is a computer architecture that includes a hierarchical memory system and one or more processors. The processors execute memory access instructions whose semantics are defined in terms of the hierarchical structure of the memory system. That is, rather than attempting to maintain the illusion that the memory system is shared by all processors such that changes made by one processor are immediately visible to other processors, the memory access instructions explicitly address access to a processor-specific memory, and data transfer between the processor-specific memory and the shared memory system. Various alternative embodiments of the memory system are compatible with these instructions. These alternative embodiments do not change the semantic meaning of a computer program which uses the memory access instructions, but allow different approaches to how and when data is actually passed from one processor to another. Certain embodiments of the shared memory system do not require a directory for notifying processor-specific memories of updates to the shared memory system.
In one aspect, in general, the invention is a computer system that includes a hierarchical memory system and a first memory access unit, for example, a functional unit of a computer processor that is used to execute memory access instructions. The memory access unit is coupled to the hierarchical memory system, for example over a bus or some other communication path over which memory access messages and responses are passed. The hierarchical memory system includes a first local storage, for example a data cache, and a main storage. The first memory access unit is capable of processing a number of different memory access instructions, including, for instance, instructions that transfer data to and from the memory system and instructions, instructions that guarantee that data transferred to the memory system is accessible to other processors, and instructions that access data previously written by other processor. The first memory access unit is, in particular, capable of processing the following instructions:
A first instruction, for example, a “store local” instruction, that specifies a first address and a first value. Processing this first instruction by the first memory access unit causes the first value to be stored at a location in the first local storage that is associated with the first address. For example, if the local storage is a cache memory, the processing of the first instruction causes the first value to be stored in the cache memory, but not necessarily to be stored in the main memory and accessible to other processors prior to the processing of the first instruction completing.
A second instruction, for example, a “commit” instruction, that specifies the first address. Processing of the second instruction by the first memory access unit after processing the first instruction is such that the first memory access unit completes processing of the second instruction after the first value is stored at a location in the main storage that is associated with the first address. For example, the processing of the second instruction may cause the value to be transferred to the main storage, or alternatively the transfer of the value may have already been.initiated prior to the processing of the second instruction, in which case the second instruction completes only after that transfer is complete.
Using these instructions, the memory access unit can transfer data to the local storage without necessarily waiting for the data, or some other form of notification, being propagated to other portions of the memory system. The memory access unit can also determine when the data has indeed been transferred to the main storage and made available to other processors coupled to the memory system, for example when that data is needed for coordinated operation with other processors.
The first memory access unit can also be capable of processing the following instructions:
A third instruction, for example, a “load local” instruction, that specifies the first address. Processing of the third instruction by the first memory access unit causes a value to be retrieved by the memory access un

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Computer architecture for shared memory access does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Computer architecture for shared memory access, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Computer architecture for shared memory access will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3131829

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.