Electrical computers and digital processing systems: multicomput – Computer-to-computer data routing – Least weight routing
Reexamination Certificate
1997-06-30
2002-05-14
Courtenay, III, St. John (Department: 2151)
Electrical computers and digital processing systems: multicomput
Computer-to-computer data routing
Least weight routing
Reexamination Certificate
active
06389446
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates to a multi-processor system executing a plurality of threads simultaneously and an execution method therefor.
Many multi-thread execution methods have been proposed for exploiting a parallelism as a program execution method, in which a program is divided into several instruction streams called “threads”.
According to the aforementioned multi-thread execution method, a thread is generated by “a fork operation”. The thread which executes the fork operation is referred to as “a parent thread”, and a newly generated thread is referred to as “a child thread”. The thread is terminated after it completes a defined operation. That is, lots of threads are forked and terminated in repeated manner in a program running process.
A thread is assigned to a thread processor. In a system with several physical thread processors, several threads can be executed at the same time, and thus the execution performance is enhanced relative to sequential processing. Moreover, by assigning several threads to each thread processor, it is possible to force a currently running thread into standby state and start another thread, in cases of a synchronization failure, a resource conflict, or a cash miss-hit occurs. Thus, possible delays caused by those events can be apparently hidden behind, and efficiency of resource uses can be enhanced.
The following typical references describe a method of multi-thread execution:
1. R. S. Nikhil et al: “A Multithreaded Massively Parallel Architecture” (Proceedings of the 19th Annual International Symposium on Computer Architecture, pp.156-167, May 1992); and
2. D. E. Culler et al: “Fine-grain Parallelism with Minimal Hardware Support: A Compiler-controlled Threaded Abstract Machine” (Proceeding of Fourth International Conference on Architectural Support for Programming Languages and Operating System, pp.164-175, April 1991).
However, when the frequency of the operation of thread generation/termination and change/restoration into/from a standby-state is higher than that of the other processes, the efficiency of entire system depends on that in thread scheduling of thread generation/termination, change/restoration into/from a standby-state and so on. Therefore, it is important for efficient parallel processing to decrease the overhead relevant to the thread scheduling. In a program with less parallelism, processing amount for one thread may be small. In this case, thread execution cost and thread scheduling cost become severe.
On the other hand, thread executing methods, in which a fine grain of thread is dealt small processing amount, and apparatus thereof have been proposed. For example, one of the parallel processors using fine grains of thread has been introduced in ‘Gurinder S. Sohi et al: “Multiscalar Processor” (The 22nd International Symposium on Computer Architecture, IEEE Computer Society Press, 1995, pp.414-425)’. In the multiscalar processor given in the reference, one program is divided into tasks each as a set of elementary blocks, and a processor, which has capability of parallel-processing, processes them.
FIG. 23
gives a configuration example of the multiscalar processor. The multiscalar processor has sequencer
73
, processing unit
74
, arbitration network
75
and data banks
76
. Several processing units
74
are included in the system. Each of the processing units
74
has command cache
77
, execution unit
78
and register file
79
. Several data banks
76
exist, according to each of processing units
74
. Each of data banks
76
is made up of ARB (Address Resolution Buffer)
80
, data cache
81
and so on. The parallel operation control of several tasks is dynamically or statically made by sequencer
73
using task descriptors, which include control flow information, and each processing unit
74
is assigned to a task. When sequencer
73
assigns tasks to processing units
74
, it examines the task descriptors so that next task will be assigned.
Further, a similar method of thread execution is given in the following reference: ‘Pradeep K. Dubey et al: “Single Speculative Multi-threading (SPSM) Architecture: Compiler-assisted Fine-Grained Multi-threading” (Parallel Architecture and Compilation Techniques, IFIP 1995).
FIG. 24
shows a general idea on thread execution in SPSM architecture. With reference to
FIG. 24
, thread generation command
84
and thread queuing command
85
are both embedded in a single program
82
. When thread generation command
84
is executed, future thread
83
, which will execute in a forestalled manner, is generated. Future thread
83
executes until thread queuing command
85
, and queues until an original process execution catches up with the future thread execution, and then merges the result from future thread execution with that from the original process execution. Future thread
83
, which executes in parallel to the original process execution, is generated for the speed-up of the program execution.
However, in the aforementioned thread execution method, when a fork command is executed or a thread execution is started, and when no thread processor in standby-state exists, a thread can not be generated, and proper program execution may not be encountered. Moreover, even though thread generation is stopped until a thread processor becomes ready, new thread generations continue to be made from running threads. Thus, there are shortcomings such as an increase in the number of threads. Furthermore, information on corresponding relation among threads needs to be stored, and thread management must be performed in one place in a parallel system. Thus, there are shortcomings in that the number of thread processors is limited by constraint in the thread manager, and it is difficult to develop the thread manager by hardware because the thread management method itself is complex.
SUMMARY OF THE INVENTION
In view of the foregoing problem of the conventional system, an object of the present invention is to provide a thread execution method which lowers thread scheduling cost.
In a multi-processor system according to a first aspect of the present invention, the multi-processor system executes a plurality of threads of instruction streams. The threads are provided by dividing a program into the instruction streams. Each of the threads is controlled to generate at most one thread.
REFERENCES:
patent: 5469570 (1995-11-01), Shibata
patent: 5621910 (1997-04-01), Nagatmatsu
patent: 5678016 (1997-10-01), Eisn et al.
patent: 5729757 (1998-03-01), Shekls
patent: 5787272 (1998-07-01), Gupta et al.
patent: 5812811 (1998-09-01), Dubey et al.
patent: 5867704 (1999-02-01), Tanaka et al.
patent: 08055092 (1996-02-01), None
Nikhil et al, “A Multithreaded Massively Parallel Architecture”, Proc. of 19th Ann. Intl. Symp. on Computer Architecture, pp. 156-167, May 1992.
Culler et al, “Fine-grain Parallelism with Minimal Hardware Support: A Compiler-controlled Threaded Abstract Machine”, Proc. of 4th Intl. Conf. on Architectural Sup. for Prog. Langs. & Op. Sys., pp. 164-175, Apr. 1991.
Sohi et al, “Multiscalar Processor”, The 22nd Int'l Symp. on Computer Architecture, IEEE Computer Society Press, 1995, pp. 414-425.
Dubey et al, “Single Speculative Multi-Threading (SPSM) Architecture: Compiler-assisted Fine-Grained Multi-threading”, Parallel Architecture and Compilation Techniques, IFIP, 1995 p. 109-121.
Culler. Culler, David E., et al. “Fine-grain Parallelism with Minimal Hardware Support: A Compiler-Controlled Threaded Abstract Machine”, Sep. 1991.*
Cooper. Cooper, Eric C., et al.. “C Threads”, Sep. 1990.*
Bohm. Bohm, A.P.W. et al. “An Evaluation of Bottom-Up and Top-Down Thread Generation Techniques”, 1993.*
Kleiman, Steve. “Programming with Threads”, Nov. 1996.*
Northcutt, J. Duane. “The Alpha Operating System: Programming Model”, Feb. 15, 1988.*
Philbin, James. “An Overview of the Sting Operating System”, Oct. 1992.
Bullock, Jr. Lewis A.
Foley & Lardner
LandOfFree
Multi-processor system executing a plurality of threads... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Multi-processor system executing a plurality of threads..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Multi-processor system executing a plurality of threads... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2885769