Electrical computers and digital data processing systems: input/output – Intrasystem connection – Bus interface architecture
Reexamination Certificate
2000-08-31
2003-11-11
Vo, Tim (Department: 2181)
C710S100000, C710S104000, C710S300000, C710S310000, C711S141000
active
06647453
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to computer architectures and, more specifically, to distributed, shared memory multiprocessor computer systems.
2. Background Information
Distributed shared memory computer systems, such as symmetric multiprocessor (SMP) systems, support high-performance application processing. Conventional SMP systems include a plurality of processors coupled together by a bus. One characteristic of SMP systems is that memory space is typically shared among all of the processors. That is, each processor accesses programs in the shared memory, and processors communicate with each other via that memory (e.g., through messages and status information left in shared address spaces). In some SMP systems, the processors may also be able to exchange signals directly. One or more operating systems are typically stored in the shared memory. These operating systems control the distribution of processes or threads among the various processors. The operating system kernels may execute on any processor, and may even execute in parallel. By allowing many different processors to execute different processes or threads simultaneously, the execution speed of a given application may be greatly increased.
FIG. 1 is a block diagram of a conventional SMP system 100. System 100 includes a plurality of processors 102a-e, each connected to a system bus 104. A memory 106 and an input/output (I/O) bridge 108 are also connected to the system bus 104. The I/O bridge 108 is also coupled to one or more I/O busses 110a-c. The I/O bridge 108 basically provides a “bridging” function between the system bus 104 and the I/O busses 110a-c. Various I/O devices 112, such as disk drives, data collection devices, keyboards, CD-ROM drives, etc., may be attached to the I/O busses 110a-c. Each processor 102a-e can access memory 106 and/or various input/output devices 112 via the system bus 104. Each processor 102a-e has at least one level of cache memory 114a-e that is private to the respective processor 102a-e.
The cache memories 114a-e typically contain an image of data from memory 106 that is being utilized by the respective processor 102a-e. Since the cache memories of two processors (e.g., caches 114b and 114e) may contain overlapping or identical images of data from main memory 106, if one processor (e.g., processor 102b) were to alter the data in its cache (e.g., cache 114b), the data in the other cache (e.g., cache 114e) would become invalid or stale. To prevent the other processor (e.g., processor 102e) from acting on invalid or stale data, SMP systems, such as system 100, typically include some type of cache coherency protocol.
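For orientation only, the FIG. 1 topology described above can be summarized as a handful of plain C data structures. This is an illustrative sketch, not part of the patent; every type, field, and constant name below is a hypothetical placeholder.

/* Illustration only: a minimal C model of the FIG. 1 topology. Every
 * name and size here is a hypothetical placeholder, not patent text. */
#define NUM_CPUS     5     /* processors 102a-e            */
#define NUM_IO_BUSES 3     /* I/O busses 110a-c            */
#define CACHE_LINES  64    /* arbitrary per-cache capacity */

struct cache_line {
    unsigned long tag;       /* which block of memory 106 is imaged here */
    int           valid;     /* nonzero while the image is usable        */
    unsigned char data[64];
};

struct processor {                           /* one of processors 102a-e */
    struct cache_line cache[CACHE_LINES];    /* private cache 114a-e     */
};

struct io_bridge {                           /* conventional bridge 108  */
    struct cache_line cache[CACHE_LINES];    /* bridge's own cache       */
};

struct smp_system {
    struct processor  cpu[NUM_CPUS];   /* all share memory 106 over bus 104 */
    unsigned char    *memory;          /* shared memory 106                 */
    struct io_bridge  bridge;          /* couples bus 104 to busses 110a-c  */
};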
In general, cache coherency protocols cause other processors to be notified when an update (e.g., a write) is about to take place at some processor's cache. Other processors, to the extent they also have copies of this same data in their caches, may then invalidate their copies of the data. The write is typically broadcast to the processors, which then update the copies of the data in their local caches. Protocols or algorithms, some of which may be relatively complex, are also often used to determine which entries in a cache should be overwritten when more data than the cache can hold is received.
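As a rough, self-contained illustration of the write-invalidate behavior just described (one common style of coherency protocol, not necessarily the one used by any particular system), the following sketch drops every other cached copy of a block before a write is applied. All names and sizes are hypothetical.

/* Hedged sketch of a write-invalidate step; names and sizes are hypothetical. */
#define N_CACHES 5        /* one private cache per processor 102a-e */
#define N_LINES  64

struct line { unsigned long tag; int valid; };

static struct line   caches[N_CACHES][N_LINES];
static unsigned long shared_memory[1024];    /* stands in for memory 106 */

/* Before CPU 'writer' updates the block identified by 'tag', drop every
 * overlapping copy in the other caches so no one acts on stale data. */
static void write_with_invalidate(int writer, unsigned long tag,
                                  unsigned long value)
{
    for (int c = 0; c < N_CACHES; c++) {
        if (c == writer)
            continue;
        for (int i = 0; i < N_LINES; i++)
            if (caches[c][i].valid && caches[c][i].tag == tag)
                caches[c][i].valid = 0;      /* copy is now stale */
    }
    shared_memory[tag % 1024] = value;       /* apply the write   */
}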
I/O bridge 108 may also include one or more cache memories (not shown) of its own. The bridge cache is used to store data received via system bus 104 from memory 106 and/or the processor caches 114 that is intended for one or more of the I/O devices 112. That is, bridge 108 forwards the data from its cache onto one or more of the I/O busses 110. Data may also be received from an I/O device 112 and stored at the bridge cache before being driven onto system bus 104 for receipt by a processor 102 or memory 106. Generally, the data stored in the cache of I/O bridge 108 is not kept coherent with the rest of the system 100. In small computer systems, it is reasonable for an I/O bridge not to maintain cache coherence for read transactions because those transactions (fetching data from the cache coherent domain) are implicitly ordered and the data is consumed immediately by the device. However, in large computer systems with distributed memory, I/O devices, such as devices 112, are not guaranteed to receive coherent data.
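Purely to illustrate the store-and-forward role the bridge cache plays, and the coherence gap noted above, the fragment below stages a block fetched over the system bus before driving it toward an I/O bus. The function and field names are hypothetical and are not taken from any real bridge.

/* Hypothetical sketch of the bridge's store-and-forward role. */
#include <string.h>

struct bridge_buffer {
    unsigned char data[64];
    int           io_bus;      /* which of the I/O busses 110a-c to target */
    int           occupied;
};

/* Stage a block received over the system bus, then forward it downstream.
 * Nothing here keeps the staged copy coherent with the processor caches
 * or memory, which is exactly the gap described above for large systems
 * with distributed memory. */
static void bridge_forward(struct bridge_buffer *buf,
                           const unsigned char *from_system_bus,
                           int target_io_bus)
{
    memcpy(buf->data, from_system_bus, sizeof buf->data);
    buf->io_bus   = target_io_bus;
    buf->occupied = 1;
    /* ...an I/O-side engine later drains buf->data onto the chosen bus... */
}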
U.S. Pat. No. 5,884,100 to Normoyle et al. discloses a single central processing unit (CPU) chip in which an I/O system is disposed on (i.e., built right onto) the core or package of the CPU chip. That is, Normoyle discloses an I/O system that is part of the CPU chipset. Because the I/O system in the Normoyle patent is located in such close proximity to the CPU, and there is only one CPU, the Normoyle patent is purportedly able to keep the I/O system coherent with the CPU.
In symmetrical multiprocessor computer systems, however, it would be difficult to incorporate the I/O system onto the processor chipset. For example, the Normoyle patent provides no suggestion as to how its I/O system might interface with other CPUs or with other I/O systems. Thus, a need exists for providing cache coherency in the I/O domain of a symmetrical multiprocessor system.
However, imposing cache coherency on the I/O domain of a symmetrical multiprocessor computer system may introduce other problems that degrade the system's performance. For example, some cache coherency protocols, if applied to the I/O bridge, may result in two or more I/O devices that are competing for the same data becoming “livelocked”. In other words, neither I/O device is able to access the data. As a result, both devices are “starved” of data and are unable to make any progress in their respective processes or application programs. Accordingly, a need exists not just for providing cache coherency in the I/O domain, but also for ensuring continued, high-level operation of the symmetrical multiprocessor system.
SUMMARY OF THE INVENTION
Briefly, the invention relates to a system and method for avoiding “livelock” and “starvation” among two or more input/output (I/O) devices competing for the same data in a symmetrical multiprocessor (SMP) computer system. The SMP computer system includes a plurality of interconnected processors having corresponding caches, one or more memories that are shared by the processors, and a plurality of I/O bridges to which the I/O devices are coupled. Each I/O bridge includes one or more upstream buffers and one or more downstream buffers. An up engine is coupled to the upstream buffer and controls the flow of information, including requests for data, from the I/O devices to the processors and shared memory. A down engine is coupled to the downstream buffer and controls the flow of information from the processors and shared memory to the I/O devices. A cache coherency protocol is executed in the I/O bridge in order to keep the data in the downstream buffer coherent with the processor caches and shared memory. As part of the cache coherency protocol, the I/O bridge obtains “exclusive” (not shared) ownership of all data fetched from the processor caches and the shared memory, and invalidates and releases any data in the downstream buffer that is requested by a processor or by some other I/O bridge.
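The bridge organization described in this summary might be pictured, very roughly, as the skeleton below. It is an assumption-laden sketch for illustration; the names, sizes, and layout are placeholders and do not reproduce the patented design.

/* Rough, hypothetical skeleton of the bridge organization described above;
 * the names and sizes are assumptions, not the patented design. */
#define BUF_ENTRIES 16

struct buf_entry {
    unsigned long addr;
    unsigned char data[64];
    int           exclusive;   /* fetched with exclusive (not shared) ownership */
    int           valid;
};

struct io_bridge {
    /* Drained by the up engine: requests and data flowing from the
     * I/O devices toward the processors and shared memory. */
    struct buf_entry upstream[BUF_ENTRIES];
    /* Drained by the down engine: data flowing from the processors and
     * shared memory toward the I/O devices; kept coherent by the bridge. */
    struct buf_entry downstream[BUF_ENTRIES];
    /* Non-coherent staging area used by the down engine (see the next sketch). */
    unsigned char    noncoherent[BUF_ENTRIES][64];
};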
To prevent two I/O devices from becoming “livelocked” in response to competing requests for the same data, each I/O bridge further includes at least one non-coherent memory device which is also coupled to, and thus under the control of, the down engine. Before invalidating data requested by a competing device or entity, the down engine at the I/O bridge receiving the request first copies that data to the bridge's non-coherent memory device. The down engine then takes the largest amount of the copied data that it “knows” to be coherent (despite the request for that data by a processor or other I/O bridge) and releases only that amount to the I/O device which originally requested the data from the bridge. In the illustrative embodiment, this “kn
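The livelock-avoidance step just described (copy the contested data into non-coherent storage before surrendering it, then release to the original requester only the portion known to be coherent) might look roughly like the following. Because the passage is truncated before the embodiment's exact rule is stated, the amount released and all names here are placeholders.

/* Hypothetical sketch of the copy-before-invalidate, partial-release step. */
#include <string.h>

#define LINE_BYTES 64

struct downstream_entry {
    unsigned long addr;
    unsigned char data[LINE_BYTES];
    int           valid;         /* held with exclusive ownership          */
    size_t        coherent_len;  /* bytes still known to be coherent       */
};

struct noncoherent_copy {
    unsigned char data[LINE_BYTES];
    size_t        len;           /* how much may be released to the device */
};

/* A competing request arrives for 'entry': snapshot it for the original
 * requester, then invalidate and release the coherent copy so the
 * competitor can also make progress; neither device is starved. */
static void surrender_line(struct downstream_entry *entry,
                           struct noncoherent_copy *snapshot)
{
    memcpy(snapshot->data, entry->data, LINE_BYTES);
    snapshot->len = entry->coherent_len;   /* only the known-coherent portion */
    entry->valid  = 0;                     /* give up exclusive ownership     */
}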
Duncan Samuel H.
Ho Steven
Hewlett-Packard Development Company, L.P.
Vo Tim