Block data mover adapted to contain faults in a partitioned...

Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories

Reexamination Certificate

C711S138000, C711S162000, C714S006130

active

06826653

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to multiprocessor computer architectures and, more specifically, to the sharing or exchanging of information among partitions of a multiprocessor computer system.
2. Background Information
Symmetrical multiprocessor (SMP) computer systems support high performance application processing. Conventional SMP systems include a plurality of interconnected nodes. Each node typically includes one or more processors as well as a portion of system memory. The nodes may be coupled together by a bus or by some other data transfer mechanism. One characteristic of a SMP computer system is that all or substantially all of the system's memory space is shared among all nodes. That is, the processors of one node can access programs and data stored in the memory portion of another node. The processors of different nodes can also use system memory to communicate with each other by leaving messages and status information in shared memory space.
When a processor accesses (loads or stores to) a shared memory block from its own home node, the reference is referred to as a “local” memory reference. When the reference is to a memory block from a node other than the requesting processor's own home node, the reference is referred to as a “remote” memory reference. Because the latency of a local memory access differs from that of a remote memory access, the SMP system is said to have a Non-Uniform Memory Access (NUMA) architecture. Furthermore, if the memory blocks of the memory system are maintained in a coherent state, the system is called a cache coherent NUMA architecture.
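The local/remote distinction above can be illustrated with a minimal sketch. The `MemoryBlock` type, the `classify_reference` function, and the integer node identifiers are hypothetical names chosen for illustration; they do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryBlock:
    address: int
    home_node: int  # node whose memory unit holds this block (assumed integer id)


def classify_reference(requesting_node: int, block: MemoryBlock) -> str:
    """Return 'local' if the block lives at the requester's own node, else 'remote'."""
    return "local" if block.home_node == requesting_node else "remote"


block = MemoryBlock(address=0x1000, home_node=2)
print(classify_reference(2, block))  # same node as the block's home
print(classify_reference(5, block))  # different node from the block's home
```

In a NUMA system the two cases differ in latency; the sketch only labels the reference and does not model timing.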
Partitions
The nodes or processors of a SMP computer system can also be divided among a plurality of partitions, increasing the operating flexibility of the SMP system. FIG. 1, for example, is a schematic, block diagram of an SMP computer system 100 comprising a plurality of interconnected nodes 102. Each node 102, moreover, includes a processor unit (P) 104 and a corresponding memory unit (MEM) 106. The nodes 102 have been divided into a plurality of, e.g., four, partitions 108a-d, each comprising four nodes 102. A separate operating system or a separate instance of the same operating system runs on each partition 108a-d. In a partitioned system it is often desirable to permit the processors 104 located in different partitions, e.g., partitions 108a and 108d, to exchange information, e.g., to communicate with each other. To this end, a portion of memory 106 at one or more nodes 102, such as memory portions 110 at each node 102, may be designated as global shared memory. Information or data stored at a global shared memory portion 110 of a first partition, e.g., partition 108a, may be accessed by the processors 104 located within a second partition, e.g., partition 108d.
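As a rough illustration of the FIG. 1 arrangement, the following sketch models sixteen nodes split into four partitions and checks whether a cross-partition access targets global shared memory. The node numbering (0 to 15, assigned to partitions in order), the function names, and the boolean `in_global_shared` flag are assumptions made for the example, not details taken from the patent.

```python
NODES_PER_PARTITION = 4   # four nodes per partition, as in the FIG. 1 example
PARTITION_NAMES = "abcd"  # partitions 108a-d


def partition_of(node_id: int) -> str:
    """Map an assumed node id (0..15) to its partition letter."""
    if not 0 <= node_id < NODES_PER_PARTITION * len(PARTITION_NAMES):
        raise ValueError(f"unknown node id: {node_id}")
    return PARTITION_NAMES[node_id // NODES_PER_PARTITION]


def may_access(requesting_node: int, home_node: int, in_global_shared: bool) -> bool:
    """Allow accesses inside a partition; across partitions, only to global shared memory."""
    if partition_of(requesting_node) == partition_of(home_node):
        return True
    return in_global_shared


# A processor in partition d (node 12) reading memory homed in partition a (node 0).
print(may_access(12, 0, in_global_shared=True))
print(may_access(12, 0, in_global_shared=False))
```

The check captures only the visibility rule described above; it says nothing about coherence or fault containment, which the following paragraphs address.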
Although the use of global shared memory in a partitioned computer system allows the processors to share information across partition boundaries, it can result in errors or faults occurring in one partition causing errors or faults in other partitions. For example, in a cache coherent system, the state, e.g., the ownership, of memory blocks changes in response to reads or writes to those memory blocks. Two processors each located in a different partition and thus each running a different operating system may nonetheless share ownership of a memory block from some portion of global shared memory. A fault or failure in one partition that affects the shared memory block may cause a corresponding fault or failure to occur in the other partition.
To prevent such faults from crossing partition boundaries, the global shared memory can be made non-coherent. However, this approach may result in a partition obtaining stale information from the global shared memory. Specifically, the processor of a first partition may obtain a copy of a memory block from some portion of global shared memory before that memory block has been updated by some other processor. Use of such stale information within the first partition can introduce errors. Another approach to prevent faults from crossing partition boundaries is to move data between partitions through one or more input/output (I/O) devices. With this approach, data from a first partition is read from system memory by an I/O device within the first partition. The I/O device then transfers that data to an I/O device coupled to a second partition, thereby making the data available to the processors of the second partition. This approach also suffers from one or more drawbacks. In particular, the busses coupled to the I/O devices nearly always run at a fraction of the speed of the processor or memory busses. Accordingly, transferring data through multiple I/O devices takes substantial time and may introduce significant latencies.
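The stale-data hazard of non-coherent global shared memory can be shown with a toy model. The `NonCoherentBlock` class and its methods are hypothetical and stand in for a memory block that hands out copies without tracking them.

```python
class NonCoherentBlock:
    """Toy memory block: readers get plain copies and writers notify nobody."""

    def __init__(self, value: str) -> None:
        self.value = value

    def read_copy(self) -> str:
        # No record is kept of who holds a copy.
        return self.value

    def write(self, new_value: str) -> None:
        # No invalidation is sent to holders of earlier copies.
        self.value = new_value


block = NonCoherentBlock("v1")
copy_in_first_partition = block.read_copy()  # first partition takes its copy
block.write("v2")                            # another processor updates the block

is_stale = copy_in_first_partition != block.read_copy()
print(is_stale)
```

Because nothing links the earlier copy to the later write, the first partition has no way to learn that its copy is out of date.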
Accordingly, a need exists for a system that efficiently transfers information between the partitions of a multiprocessor computer system while nonetheless preventing faults in one partition from affecting other partitions.
SUMMARY OF THE INVENTION
Briefly, the invention relates to a system and method for moving information between cache coherent memory subsystems of a partitioned multiprocessor computer system that prevents faults in one partition from affecting other partitions. The multiprocessor computer system includes a plurality of processors, memory subsystems and input/output (I/O) subsystems that can be segregated into a plurality of partitions. Each processor may have one or more processor caches for storing information, and each I/O subsystem includes at least one I/O bridge that interfaces between one or more I/O devices and the multiprocessor system. To maintain the coherence of information stored at the memory subsystems and the processor caches, the multiprocessor system may employ a directory based cache coherency protocol. According to the present invention, the I/O bridge has a data mover configured to retrieve information from a “source” partition and store it within the cache coherent system of its own “destination” partition.
Specifically, when an initiating processor in the source partition wishes to make information, e.g., one or more memory blocks, from a region of global shared memory available to a target processor of a destination partition, the initiating processor preferably issues a write transaction to its I/O bridge. The I/O bridge then notifies the target processor that information in the source partition's region of global shared memory is ready for copying, preferably by sending the target processor a Message Signaled Interrupt (MSI) containing an encoded message from the initiating processor. The target processor then configures or sets up the data mover in its I/O bridge to perform the transfer. In particular, the target processor provides the data mover with the memory address of the information in the source partition's global shared memory. The target processor also provides the data mover with the memory address within the destination partition to which the information is to be stored. Once the setup phase is complete, the target processor issues a start command to the data mover. In response, the data mover issues a request to the source partition for a non-coherent copy of the specified information. The home memory subsystem of the source partition preferably responds to the request by sending a “valid” but non-coherent copy of the specified information, e.g., a “snapshot” of the information as of the time of the request, to the data mover in the destination partition. By requesting a non-coherent copy of the information, the data mover in the destination partition does not cause a change of ownership of the respective information to be recorded at the source partition.
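The setup, start, and snapshot steps described above might be sketched as follows. This is a simplified software model under stated assumptions, not the patented hardware: the `HomeMemory` and `DataMover` classes, the dictionary-based directory, and all addresses are hypothetical, and the MSI notification and any destination-side coherence handling are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HomeMemory:
    """Toy home memory subsystem of the source partition."""
    blocks: Dict[int, bytes]
    # Toy directory: block address -> current owner (hypothetical representation).
    owners: Dict[int, str] = field(default_factory=dict)

    def read_noncoherent(self, addr: int) -> bytes:
        # Return a snapshot of the block as of this request.
        # The directory is deliberately left untouched: no ownership change.
        return self.blocks[addr]


@dataclass
class DataMover:
    """Toy data mover in the destination partition's I/O bridge."""
    source: HomeMemory
    dest_memory: Dict[int, bytes]
    src_addr: Optional[int] = None
    dst_addr: Optional[int] = None

    def setup(self, src_addr: int, dst_addr: int) -> None:
        # The target processor supplies the source and destination addresses.
        self.src_addr = src_addr
        self.dst_addr = dst_addr

    def start(self) -> None:
        # The start command triggers the non-coherent read and the local store.
        if self.src_addr is None or self.dst_addr is None:
            raise RuntimeError("data mover started before setup")
        snapshot = self.source.read_noncoherent(self.src_addr)
        self.dest_memory[self.dst_addr] = snapshot


source = HomeMemory(blocks={0x100: b"hello"}, owners={0x100: "P0 (source partition)"})
dest_memory: Dict[int, bytes] = {}

mover = DataMover(source=source, dest_memory=dest_memory)
mover.setup(src_addr=0x100, dst_addr=0x900)
mover.start()

# A later update in the source partition does not reach the transferred snapshot.
source.blocks[0x100] = b"later"
print(dest_memory[0x900], source.owners[0x100])
```

The point of the model is the last two lines: the destination holds the data as it stood at the time of the request, and the source directory still names the original owner, so the source partition keeps no record of the destination partition as a sharer of the block.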
The data mover in the destination partition also requests exclusive ownership over the memory block(s) within the destination partition to which the transferred information is to be written. Upon obtaining exclusive ownership, the data mover writes the inform
