Title: Reader-writer lock for multiprocessor systems
Patent number: 06823511
Classification: Electrical computers and digital processing systems: virtual mac – Task management or control – Process scheduling
Other classifications: C718S100000, C718S104000, C718S106000, C710S200000, C711S147000, C711S150000
Type: Reexamination Certificate
Filed: 2000-01-10
Issued: 2004-11-23
Examiner: An, Meng-Al T. (Department: 2127)
Status: active
TECHNICAL FIELD
This invention relates generally to process synchronization in multiprocessor systems. More particularly, this invention relates to a reader-writer lock and related method for multiprocessor systems having a group of processors (CPUs) with lower communication latencies than other processors in a system. Such systems include but are not limited to multiprocessor systems having a non-uniform memory access (NUMA) architecture.
BACKGROUND
Multiprocessor systems by definition contain multiple processors (also referred to herein as CPUs) that can execute multiple processes (or multiple threads within a single process) simultaneously, in a manner known as parallel computing. In general, multiprocessor systems execute multiple processes or threads faster than conventional uniprocessor systems, such as personal computers (PCs), that execute programs sequentially. The actual performance advantage is a function of a number of factors, including the degree to which parts of a multithreaded process and/or multiple distinct processes can be executed in parallel and the architecture of the particular multiprocessor system at hand. The degree to which processes can be executed in parallel depends, in part, on the extent to which they compete for exclusive access to shared memory resources.
Shared memory multiprocessor systems offer a common physical memory address space that all processors can access. Multiple processes therein (or multiple threads within a process) can communicate through shared variables in memory, which allow the processes to read from and write to the same memory locations. Message passing multiprocessor systems, in contrast to shared memory systems, have a separate memory space for each processor and require processes to communicate with each other through explicit messages.
The architecture of shared memory multiprocessor systems may be classified by how their memory is physically organized. In distributed shared memory (DSM) machines, the memory is divided into modules physically placed near one or more processors, typically on a processor node. Although all of the memory modules are globally accessible, a processor can access local memory on its node faster than remote memory on other nodes. Because the memory access time differs based on memory location, such systems are also called non-uniform memory access (NUMA) machines. In centralized shared memory machines, on the other hand, the memory is physically in one location. Centralized shared memory computers are called uniform memory access (UMA) machines because the memory is equidistant in time from each of the processors. Both forms of memory organization typically use high-speed cache in conjunction with main memory to reduce execution time.
The use of NUMA architecture to increase performance is not restricted to NUMA machines. A subset of processors in a UMA machine may share a cache. In such an arrangement, even though the memory is equidistant from all processors, data can circulate among the cache-sharing processors faster (i.e., with lower latency) than among the other processors in the machine. Algorithms that enhance the performance of NUMA machines can thus be applied to any multiprocessor system that has a subset of processors with lower latencies. These include not only the noted NUMA and shared-cache machines, but also machines where multiple processors share a set of bus-interface logic as well as machines with interconnects that “fan out” (typically in hierarchical fashion) to the processors.
A significant issue in the design of multiprocessor systems is process synchronization. As noted earlier, the degree to which processes can be executed in parallel depends in part on the extent to which they compete for exclusive access to shared memory resources. For example, if two processes A and B are executing in parallel, process B might have to wait for process A to write a value to a buffer before process B can access it. Otherwise, a race condition could occur, where process B might access the buffer before process A had a chance to write the value to the buffer.
To illustrate further, suppose two processors execute processes having instructions to add one to a counter. Specifically, the instructions could be the following:
1. Read the counter into a register.
2. Add one to the register.
3. Write the register to the counter.
If the two processors were to execute these instructions in parallel, the first processor might read the counter (e.g., “5”) and add one to it (resulting in “6”). Since the second processor is executing in parallel with the first processor, the second processor might also read the counter (still “5”) and add one to it (resulting in “6”). One of the processors would then write its register (containing a “6”) to the counter, and the other processor would do the same. Although two processors have executed instructions to add one to the counter, the counter is only one greater than its original value.
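The lost update can be reproduced directly in code. The following sketch is illustrative only and not part of the patent; the use of POSIX threads and names such as add_one are assumptions. The two threads run the three steps above with no synchronization, so one of the two increments can be lost.

/* A minimal sketch of the lost-update race described above, using POSIX
 * threads.  Both threads perform the read / add-one / write sequence on a
 * shared counter with no synchronization, so one increment can be lost. */
#include <pthread.h>
#include <stdio.h>

static int counter = 5;          /* shared counter, starts at 5 */

static void *add_one(void *arg)
{
    (void)arg;
    int reg = counter;           /* 1. read the counter into a "register" */
    reg = reg + 1;               /* 2. add one to the register            */
    counter = reg;               /* 3. write the register to the counter  */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, add_one, NULL);
    pthread_create(&t2, NULL, add_one, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* May print 6 instead of 7 if both threads read the old value. */
    printf("counter = %d\n", counter);
    return 0;
}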
To avoid this incorrect result, process synchronization mechanisms are provided to control the order of process execution. These mechanisms include mutual exclusion locks (mutex locks), condition variables, counting semaphores, and reader-writer locks. A mutual exclusion lock allows only the processor holding the lock to execute an associated action. When a processor requests a mutual exclusion lock, it is granted to that processor exclusively. Other processors desiring the lock must wait until the processor with the lock releases it. To solve the add-one-to-a-counter scenario described above, for example, both the first and the second processors would request the mutual exclusion lock before executing further. Whichever processor first acquires the lock then reads the counter, increments the register, and writes to the counter before releasing the lock. The other processor must wait until the first processor finishes and releases the lock; it then acquires the lock, performs its operations on the counter, and releases the lock. In this way, the lock guarantees the counter is incremented twice if the instructions are run twice, even if processors running in parallel execute them.
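Applied to the sketch above, a mutual exclusion lock brackets the three steps so that only one thread at a time can execute them. The following minimal illustration again assumes POSIX threads (pthread_mutex_t is the standard POSIX mutex; the variable names are hypothetical):

/* The same sketch with the critical section protected by a mutual
 * exclusion lock.  Only the thread holding the mutex can execute the
 * read-modify-write sequence, so both increments are preserved. */
#include <pthread.h>

static int counter = 5;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_one(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&counter_lock);    /* wait here until the lock is free */
    int reg = counter;                    /* 1. read the counter              */
    reg = reg + 1;                        /* 2. add one                       */
    counter = reg;                        /* 3. write it back                 */
    pthread_mutex_unlock(&counter_lock);  /* let the other thread proceed     */
    return NULL;
}

With the critical section protected this way, running the two threads always leaves the counter two greater than its starting value.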
For processes to be synchronized, instructions requiring exclusive access can be grouped into a critical section and associated with a lock. When a process is executing instructions in its critical section, a mutual exclusion lock guarantees that no other process is executing the same instructions. This is important where processes are attempting to change data (as in the example above). Such a lock has the drawback, however, that it prohibits multiple processes from simultaneously executing instructions that merely read data. A reader-writer lock, in contrast, allows multiple reading processes (“readers”) to access a shared resource such as a database simultaneously, while a writing process (“writer”) must have exclusive access to the resource before performing any updates, in order to keep it consistent. A practical example of a situation suited to a reader-writer lock is a TCP/IP routing structure with many readers and an occasional update of the information. Early implementations of reader-writer locks are described by Courtois, et al., in “Concurrent Control with ‘Readers’ and ‘Writers’,” Communications of the ACM, 14(10):667-668 (1971). More recent implementations are described by Mellor-Crummey and Scott (MCS) in “Scalable Reader-Writer Synchronization for Shared-Memory Multiprocessors,” Proceedings of the Third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 106-113 (1991) and by Hsieh and Weihl in “Scalable Reader-Writer Locks for Parallel Systems,” Technical Report MIT/LCS/TR-521 (November 1991).
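The reader-writer discipline just described can be sketched with the standard POSIX pthread_rwlock interface. The routing-table example and the function names below are illustrative assumptions, not the patent's implementation:

/* A minimal sketch of reader-writer locking with the POSIX pthread_rwlock
 * API.  Many readers may hold the lock at once; a writer gets exclusive
 * access.  The "routing table" here is just an illustrative array. */
#include <pthread.h>

#define TABLE_SIZE 256

static int route_table[TABLE_SIZE];                 /* shared, read-mostly data */
static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Reader: may run concurrently with other readers. */
int lookup_route(int dest)
{
    pthread_rwlock_rdlock(&table_lock);
    int next_hop = route_table[dest % TABLE_SIZE];
    pthread_rwlock_unlock(&table_lock);
    return next_hop;
}

/* Writer: excludes all readers and other writers while updating. */
void update_route(int dest, int next_hop)
{
    pthread_rwlock_wrlock(&table_lock);
    route_table[dest % TABLE_SIZE] = next_hop;
    pthread_rwlock_unlock(&table_lock);
}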
The basic mechanics and structure of reader-writer locks are well known. In a typical lock, multiple readers may acquire the lock, but only if there are no active writers. Conversely, a writer may acquire the lock only if there are no active readers or another active writer. When a reader releases the lock, it takes no action unless it is the last active reader, in which case it makes the lock available to any waiting writer.
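For illustration only, these rules can be expressed as a simple reader-writer lock built from a mutex and a condition variable. This is a generic textbook construction, not the hierarchical lock claimed by the patent, and the names are hypothetical:

/* A simplified reader-writer lock built from a mutex and a condition
 * variable.  Readers enter only when no writer is active; a writer enters
 * only when there are no readers and no other writer; the last reader to
 * leave wakes any waiting writer. */
#include <pthread.h>

struct rwlock {
    pthread_mutex_t mutex;      /* guards the fields below        */
    pthread_cond_t  cond;       /* waiters block here             */
    int             readers;    /* number of active readers       */
    int             writer;     /* 1 if a writer holds the lock   */
};

#define RWLOCK_INITIALIZER { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

void rw_read_lock(struct rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    while (l->writer)                       /* wait out any active writer  */
        pthread_cond_wait(&l->cond, &l->mutex);
    l->readers++;
    pthread_mutex_unlock(&l->mutex);
}

void rw_read_unlock(struct rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    if (--l->readers == 0)                  /* last reader out:            */
        pthread_cond_broadcast(&l->cond);   /* wake any waiting writer     */
    pthread_mutex_unlock(&l->mutex);
}

void rw_write_lock(struct rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    while (l->writer || l->readers > 0)     /* need the lock exclusively   */
        pthread_cond_wait(&l->cond, &l->mutex);
    l->writer = 1;
    pthread_mutex_unlock(&l->mutex);
}

void rw_write_unlock(struct rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    l->writer = 0;
    pthread_cond_broadcast(&l->cond);       /* wake readers and writers    */
    pthread_mutex_unlock(&l->mutex);
}

Note that this simple construction can starve writers under a steady stream of readers; avoiding that, and avoiding cross-node memory traffic on NUMA-style machines, is the kind of problem the patented lock addresses.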
Inventors: Brent Kingsbury, Paul E. McKenney
Primary Examiner: Meng-Al T. An
Assistant Examiner: Syed Ali
Attorney: Lieberman & Brandsdorfer LLC