Coarse grained determination of data dependence between...

Electrical computers and digital processing systems: multicomputer data transferring – Multicomputer data transferring via shared memory – Partitioned shared memory

Details

C711S150000, C711S173000

Reexamination Certificate

active

06665708

ABSTRACT:

BACKGROUND
The present invention relates to information processing system organizations, more particularly to the parallel execution of computer programs or jobs, and even more particularly to techniques for enabling the speculative execution of concurrent jobs in an information processing system.
The traditional electronic computer has a single processing unit, and operates in accordance with a model whereby program instructions are retrieved (“fetched”) one-by-one from an addressable memory, and then executed. Instructions that are to be executed in sequence are typically stored at sequential address locations within the memory. Exceptions to this sequential storage of instructions often occur, as for example when execution of a program segment is made conditional on some condition to be tested (e.g., whether two values are equal to one another), or when execution of the present program segment is to be interrupted by execution of another program segment (e.g., in response to a subroutine call or an interrupt). In such cases, program execution may take what is called a “branch” or “jump” to another location, whereby the fetching of instructions continues not with the next sequentially stored instruction, but with one stored at some other location within the memory.
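As a minimal illustration (not taken from the patent text), the following C fragment shows a conditional test causing execution to skip past the sequentially next statement rather than falling through to it:

    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 2;

        if (a == b) {               /* condition tested at run time              */
            puts("values equal");   /* skipped when the branch is taken past it  */
        }
        puts("after the branch");   /* instruction fetching resumes here either way */
        return 0;
    }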
Regardless of how the instructions are stored, it is the expectation of the programmer that the instructions that constitute a particular job will be executed in a particular order. A consequence of this expectation is that variables will be operated upon (e.g., modified or tested) in a particular sequence. Failure to comply with this expectation can result in a job that generates error-laden results.
It continues to be a goal of computer architects to design systems that can complete more work in less time. One approach for doing this has concentrated on making processing elements that are capable of operating faster. This approach has no impact on the programmer's expectation of sequential program execution.
Another approach to improving processing speed has been to devise processors that are capable of operating concurrently. For example, in a so-called "super-scalar" processor, the elements within a single processor are organized in such a way as to permit several instructions to be performed concurrently. Another way to provide concurrent execution of instructions (so-called "instruction level parallel" (ILP) processing) is to provide multiple processing units, each attached to a shared memory, and to allocate individual instructions of a single program to be run on different ones of the processing units.
In order to ensure that the programmer's expectation of sequential program execution is carried out, these architectures need to deal with two types of dependencies: "control dependency" and "data dependency". Control dependency refers to instructions whose execution depends on whether a conditional branch or jump has been taken in a preceding instruction. Data dependency refers to instructions that use data created or changed by earlier instructions. A later instruction can execute correctly only if the earlier instructions operating on the same data either do not change that common data or have already completed their changes to it.
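The following hypothetical C fragment (illustrative only, not from the patent) shows both kinds of dependency in source-level terms:

    int example(int x, int y)
    {
        int t = 0;

        if (x > 0) {     /* conditional branch                                    */
            t = x + y;   /* control dependency: executes only if the branch above */
        }                /* is resolved as taken                                  */

        return t * 2;    /* data dependency: needs the final value of t, so it    */
                         /* cannot correctly complete before t has been produced  */
    }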
Rather than holding up an instruction whose execution is in some way dependent on the results generated by another instruction, these architectures often turn to the speculative execution of an instruction. That is, an instruction is executed as if there were no control or data dependency. The results of such a speculatively executed instruction must be undone in the event that it is later discovered that the originally planned sequential execution of the instructions would have achieved different results. U.S. Pat. No. 5,781,752 describes an ILP architecture that employs a table based data speculation circuit.
In yet another approach to increasing overall processing speed, some computer systems achieve high processing performance through a computer architecture known as Symmetric Multi Processing (SMP). In contrast to the fine-grained parallelism achieved by the above-described ILP architectures, the SMP architecture exploits coarse-grained parallelism that is either explicitly specified in programs designed in accordance with concurrent programming principles, or extracted at compilation time from programs designed for sequential execution on a single-processor system.
Coarse-grained parallelism means task-level parallelism as opposed to instruction-level parallelism (although the two types of parallelism are not mutually exclusive—different tasks could be assigned to separate processors which each then employ instruction-level parallelism to carry out their respective task). In an SMP architecture, each one of several rather self-contained and complex computing tasks is carried out on a respective one of several processors. These tasks are mutually concurrent processes, threads or other similar constructs well-known in the information processing arts.
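A minimal sketch of such task-level parallelism, using POSIX threads (the task names decode_task and network_task are illustrative assumptions, not taken from the patent):

    #include <pthread.h>
    #include <stdio.h>

    static void *decode_task(void *arg)  { (void)arg; puts("decoding data");    return NULL; }
    static void *network_task(void *arg) { (void)arg; puts("handling network"); return NULL; }

    int main(void)
    {
        pthread_t t1, t2;

        /* Each self-contained task runs as its own thread, which the operating
           system may schedule on a separate processor of an SMP machine. */
        pthread_create(&t1, NULL, decode_task, NULL);
        pthread_create(&t2, NULL, network_task, NULL);

        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }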
In another computer architecture having multiple processors, further parallelism is extracted during program execution by creating different threads from a single program, and assigning several tasks to different processors for concurrent execution. Because they derive from the same program, these threads may have dependencies similar to those described above with respect to instruction level parallelism. In particular, it is important that the two or more threads maintain data consistency—that is, that a thread intended for later execution not use a data variable that has yet to be updated by a thread intended for earlier execution, and that the thread intended for later execution not modify a data variable that will subsequently be accessed by a thread intended for earlier execution. The occurrence of either of these events is called a “collision”.
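The two collision cases can be sketched as follows (a hypothetical fragment; the variable and function names are not from the patent). Thread A is intended for earlier execution and thread B for later execution:

    int shared_balance = 100;                 /* variable in the shared memory space       */

    /* Thread A: logically earlier in the original sequential program. */
    void thread_a_body(void)
    {
        shared_balance = shared_balance + 50; /* update that B is expected to observe      */
    }

    /* Thread B: logically later in the original sequential program. */
    void thread_b_body(void)
    {
        int seen = shared_balance;            /* collision case 1: B reads before A has    */
                                              /* written the update B should have seen     */
        shared_balance = seen - 30;           /* collision case 2: B modifies a variable   */
                                              /* that A, the earlier thread, may still access */
    }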
Because of the possibility of collisions, it is common to insert locks (semaphores) into the code in order to maintain data consistency. This prevents any collisions from happening. However, algorithms that extract parallelism and insert locks for this purpose must employ a very conservative strategy because they must guarantee that a collision never occurs. This has the drawback of limiting the amount of parallelism that can be extracted.
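A lock-based version of the same update, sketched with a POSIX mutex, prevents the collision by serializing every access to the shared variable, which is exactly what limits the amount of extractable parallelism:

    #include <pthread.h>

    static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;
    static int shared_balance = 100;

    void deposit(int amount)
    {
        pthread_mutex_lock(&balance_lock);    /* only one thread may update at a time */
        shared_balance += amount;
        pthread_mutex_unlock(&balance_lock);
    }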
As another solution to the problem presented when threads that share a data memory space are concurrently executed, one may employ speculative execution. In speculative execution, a collision between threads is detected and the erroneous results of executed threads are undone or purged and threads are restarted in such a way as to guarantee progress (i.e., to guarantee that at least one of the restarted jobs will complete without a collision).
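In outline (a simplified sketch only; the helper functions below are assumed placeholders and do not describe the patented mechanism), speculative execution of two jobs that share memory proceeds roughly as follows:

    #include <stdbool.h>

    typedef struct job job_t;                     /* opaque job handle (assumed)         */

    extern void run_speculatively(job_t *j);      /* execute against buffered state      */
    extern bool collided(job_t *early, job_t *late);
    extern void commit(job_t *j);                 /* make buffered results visible       */
    extern void purge(job_t *j);                  /* discard erroneous buffered results  */

    void execute_pair(job_t *early, job_t *late)
    {
        run_speculatively(early);
        run_speculatively(late);

        commit(early);                   /* the logically earlier job is always correct  */

        while (collided(early, late)) {
            purge(late);                 /* undo the erroneous results                   */
            run_speculatively(late);     /* restart; progress is guaranteed because the  */
        }                                /* earlier job has already completed            */
        commit(late);
    }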
In one architecture, one of a number of parallel threads is designated as a "committed thread". All other concurrently executed threads are referred to as "speculative threads". The committed thread is the thread that would be executed earliest if execution were sequential. The committed thread stores its state directly in a main memory. (As used herein, the term "state" refers to the execution results of a thread or job, such as memory updates, heap, stack, signaling and so forth.) Speculative threads, however, temporarily store their states not in the shared memory, but in a memory (or memory area) distinct from the shared memory.
Since the committed thread is the thread intended for the earliest execution if execution were sequential, and since the results of the execution of the speculative threads do not affect the shared memory, there is no question concerning accuracy of the result of the committed thread. When execution of the committed thread is complete, it is simply retired. No particular action is taken with regard to the memory because an accurate state of the committed thread is already part of the shared memory.
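The distinction between committed and speculative state can be sketched in C as follows (an assumed data layout for illustration, not the patented design): the committed thread stores directly into the shared memory, while a speculative thread buffers its writes until they are either committed or purged:

    #include <string.h>

    #define MEM_WORDS 16

    static int shared_memory[MEM_WORDS];       /* state written by the committed thread   */

    typedef struct {
        int value[MEM_WORDS];                  /* privately buffered speculative writes   */
        int written[MEM_WORDS];                /* which addresses the thread has touched  */
    } spec_state_t;

    /* Committed thread: its state goes straight into the shared memory. */
    static void committed_store(int addr, int val)
    {
        shared_memory[addr] = val;
    }

    /* Speculative thread: its state is kept apart from the shared memory. */
    static void speculative_store(spec_state_t *s, int addr, int val)
    {
        s->value[addr]   = val;
        s->written[addr] = 1;
    }

    /* When a speculative thread becomes the committed thread, its buffered
       state is made part of the shared memory; after a collision the buffer
       would instead simply be discarded. */
    static void commit_speculative_state(spec_state_t *s)
    {
        for (int i = 0; i < MEM_WORDS; i++)
            if (s->written[i])
                shared_memory[i] = s->value[i];
        memset(s->written, 0, sizeof s->written);
    }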
After retirement of the committed thread, another thread is designated as a new committed thread. Designating a
