Reducing inherited logical to physical register mapping...

Reexamination Certificate


: 1999-04-26
: 2001-12-11
: Kim, Kenneth S. (Department: 2183)
: Electrical computers and digital processing systems: processing
: Processing control
: Context preserving (e.g., context swapping, checkpointing,...

: C712S023000, C709S241000
: Reexamination Certificate
: active
: 06330661
: ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a register content inheriting system in a multi-processor. More particularly, the invention relates to a multithread microprocessor executing a plurality of instructions simultaneously.
2. Description of the Related Art
As a technique for speeding up a program, systems have been proposed that divide the program into a plurality of threads and process those threads in parallel. Research on processors suited to such thread level parallel processing has progressed. A thread level parallel processing system improves processing speed by raising the utilization of the arithmetic units through simultaneous execution of a plurality of threads, rather than by relying on parallelism at the instruction level.
Depending on the problem to be solved, such thread level parallel processing can be classified into three cases: one having no dependency between the threads at all; one having low dependency, in which performance suffers little even when the dependency is resolved by software; and one having high dependency, which requires hardware support for thread level parallel execution.
When there is no dependency between the threads, or when dependency between the threads is low and the threads are large, the gain from parallel processing may be higher than the overhead of thread management by software. Therefore, hardware support can be kept to a minimum.
However, for certain problems, dependency becomes high or the threads themselves become small, so that some hardware support becomes necessary.
To speed up fine threads, efficient thread generation and efficient data transfer between the threads are essential. One example of a multi-processor for parallel processing of fine threads is disclosed in “Multiscalar Processor” (Gurinder S. Sohi, Scott E. Breach and T. N. Vijaykumar, The 22nd International Symposium on Computer Architecture, IEEE Computer Society Press, 1995, pp. 414-425).
In the Multiscalar Processor, a single program is divided into “tasks”, each an aggregate of basic blocks, and the “tasks” are processed by a processor which can execute those tasks in parallel. Transfer of register contents between “tasks” is designated by a task descriptor generated by a task compiler.
In the task descriptor, the registers whose values may be generated by the task are explicitly designated. This designation is referred to as the create mask. In addition, a forward bit is added to the instruction that performs the final update of a register designated by the create mask. Thus, the multiscalar processor performs parallel execution using code that depends on the capability of the compiler.
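The create mask and forward bit described above can be illustrated with a minimal sketch. It is not taken from the cited paper: the names `Instruction`, `Task` and `run_task`, and the choice of registers 1 and 4, are assumptions made only for illustration. The sketch models a task whose descriptor lists the registers it may produce, and forwards a register at the instruction that carries the forward bit.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    dest: int              # destination logical register number
    forward: bool = False  # forward bit: this is the last update of dest in the task


@dataclass
class Task:
    create_mask: set       # registers the task may produce (hypothetical encoding)
    instructions: list


def run_task(task):
    """Return (registers forwarded in order, registers never forwarded)."""
    forwarded = []
    for insn in task.instructions:
        # Only registers named in the create mask are forwarded, and only
        # at the instruction the compiler marked with the forward bit.
        if insn.forward and insn.dest in task.create_mask:
            forwarded.append(insn.dest)
    pending = task.create_mask - set(forwarded)
    return forwarded, pending


task = Task(
    create_mask={1, 4},
    instructions=[
        Instruction(dest=1),                # r1 updated, but updated again later
        Instruction(dest=4, forward=True),  # last update of r4
        Instruction(dest=1, forward=True),  # last update of r1
    ],
)
forwarded, pending = run_task(task)
```

In this sketch a later task waiting on r4 could proceed as soon as the second instruction completes, which is the point of marking the final update rather than waiting for the whole task to finish.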
One example of a construction of the multiscalar processor is shown in FIG. 24. In FIG. 24, the multiscalar processor is constructed with a sequencer 6, processing units 7-1 to 7-3, an associative network 8 and data banks 9-1 to 9-3.
Each of the plurality of processing units 7-1 to 7-3 in the system is constructed with a cache 71, an execution unit 72 and a register file. On the other hand, corresponding to the processing units 7-1 to 7-3, a plurality of data banks 9-1 to 9-3 are provided. Each of the data banks 9-1 to 9-3 is constructed with an address resolution buffer (ARB) and a data cache 91.
Management of simultaneous execution of a plurality of tasks is performed by the sequencer 6, which assigns tasks to the processing units 7-1 to 7-3. The content of each register of the register file is forwarded at the time the data is generated, as designated by the task descriptor.
On the other hand, “Proposal for Directivity Control Parallel Architecture of On-chip Multiprocessor (MUSCAT)” (Torii, Kondo, Motomura, Konagaya, Nishi, JSPP 97, pp. 229-236, May 1997) proposes a fork one time model, which limits a thread to generating a new thread by a fork instruction only once during its life period, together with a thread execution model in which all registers of the register file are inherited in a lump upon thread generation.
An image of the fork one time model is shown in FIG. 23. In the fork one time model, each of the threads #1 to #3 generates a new thread only once during its life period. Introducing this model simplifies thread management.
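The fork one time restriction can be sketched as follows. This is only an illustration under stated assumptions: the class `Thread`, the exception `ForkOneTimeError` and the 32-entry register list are hypothetical names and sizes, not part of the cited proposal. The sketch enforces one fork per thread and passes all registers to the child in a lump, here by a simple copy.

```python
class ForkOneTimeError(RuntimeError):
    """Raised when a thread tries to fork a second time (hypothetical name)."""


class Thread:
    def __init__(self, name, registers):
        self.name = name
        self.registers = list(registers)  # private copy of the register contents
        self.has_forked = False

    def fork(self, child_name):
        # The model allows each thread to generate a new thread only once.
        if self.has_forked:
            raise ForkOneTimeError(f"{self.name} has already forked")
        self.has_forked = True
        # Lump inheritance: the child receives all registers at once.
        return Thread(child_name, self.registers)


t1 = Thread("thread #1", [0] * 32)
t1.registers[3] = 7
t2 = t1.fork("thread #2")  # thread #1 uses its single fork
t3 = t2.fork("thread #3")  # thread #2 uses its single fork

try:
    t1.fork("thread #4")
    second_fork_allowed = True
except ForkOneTimeError:
    second_fork_allowed = False
```

Because each thread has at most one child, the threads form a simple chain, which is one way to read the claim that thread management becomes simpler.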
Furthermore, Japanese Unexamined Patent Publication No. 10-078880 discloses several methods for realizing register inheriting under the fork one time model. Most of these methods ultimately copy the register content, although at different timings. However, copying the register content increases the amount of hardware and hinders speeding up.
Therefore, the above-identified Japanese Unexamined Patent Publication No. 10-078880 also proposes an example for an out-of-order issuing system, in which instructions are issued irrespective of the program order. In this example, the registers are separated into logical registers and physical registers, the physical registers are provided in common, and inheriting of the register content is realized by copying only the mapping information indicating the relationship between the logical registers and the physical registers.
An example of the construction of the processor of this type is shown in FIG. 25. FIG. 25 shows the construction of a two thread parallel execution type processor, which is constructed with thread execution units 121a and 121b, a physical register file 126 common to the thread execution units 121a and 121b, a register busy table 129, a register free table 130 and a thread management unit 131.
The thread execution units 121a and 121b are respectively constructed with instruction caches 122a and 122b, instruction decoders 123a and 123b, register mapping tables 124a and 124b, instruction queues 125a and 125b, arithmetic units 127a and 127b and effective instruction order buffers 128a and 128b.
In the shown processor, the registers are separated into logical registers accessed from the software and physical registers holding the register contents in hardware, and the mapping relationship is held in the register mapping tables 124a and 124b.
The detailed construction of the register mapping tables 124a and 124b is shown in FIG. 26. In FIG. 26, each of the register mapping tables 124a and 124b has a physical register number entry for each of the logical registers 0 to 31, converting the logical register numbers into physical register numbers “45”, “13”, “04”, “21”, ..., “53”.
Upon generation of a thread, register inheriting is realized by copying the mapping information between the register mapping tables 124a and 124b, without copying the register content.
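The inheritance by mapping copy described above can be sketched as follows. The sketch is an assumption-laden illustration, not the circuit of the publication: the class names, the 64-entry physical register file and the stored values are hypothetical, and only the table entries “45”, “13”, “04”, “21” and “53” come from the description of FIG. 26 (the entries between logical registers 3 and 31 are not given and are filled with placeholders).

```python
NUM_LOGICAL = 32   # logical registers 0 to 31, as in FIG. 26
NUM_PHYSICAL = 64  # assumed size of the common physical register file


class PhysicalRegisterFile:
    """Register contents shared by both thread execution units."""

    def __init__(self, size):
        self.values = [0] * size


class ThreadExecutionUnit:
    def __init__(self, prf):
        self.prf = prf
        # Register mapping table: index = logical number, entry = physical number.
        # Identity mapping is a placeholder for entries the text does not give.
        self.mapping = list(range(NUM_LOGICAL))

    def read(self, logical):
        return self.prf.values[self.mapping[logical]]

    def inherit_from(self, parent):
        # Only the mapping information is copied; register contents stay
        # where they are in the common physical register file.
        self.mapping = list(parent.mapping)


prf = PhysicalRegisterFile(NUM_PHYSICAL)
parent = ThreadExecutionUnit(prf)
child = ThreadExecutionUnit(prf)

# Entries named in the description of FIG. 26.
parent.mapping[0:4] = [45, 13, 4, 21]
parent.mapping[31] = 53
prf.values[45] = 100  # illustrative content of logical register 0
prf.values[53] = 200  # illustrative content of logical register 31

child.inherit_from(parent)
```

After `inherit_from`, `child.read(0)` and `parent.read(0)` refer to the same physical register, so the cost of a fork is 32 small table entries rather than 32 full register values.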
In the conventional multithread microprocessor described above, the in-order issuing type of the register inheriting system in the above-mentioned publication requires the register content to be copied upon initiation of a thread and upon termination of a thread.
On the other hand, in the case of the out-of-order issuing type, copying of the register content becomes unnecessary. However, a register free table common to the thread execution units, indicating use or non-use of each register, becomes necessary, which complicates the logic and data paths and increases the amount of data. Moreover, the register renaming per instruction that this type requires is too wasteful when applied to the in-order issuing type.
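Why a common register free table is needed can be seen in a minimal sketch. It is an illustration under assumptions, not the design of the publication: `RegisterFreeTable`, `rename_write`, the 8-entry physical register file and the 2-entry mapping tables are hypothetical. Each write is renamed to a newly allocated physical register, and both threads must draw from the same table so that they never pick the same register.

```python
class RegisterFreeTable:
    """Use / non-use flag per physical register, shared by all threads."""

    def __init__(self, num_physical, in_use):
        self.free = [p not in in_use for p in range(num_physical)]

    def allocate(self):
        # Linear search for the lowest-numbered free physical register.
        for p, is_free in enumerate(self.free):
            if is_free:
                self.free[p] = False
                return p
        raise RuntimeError("no free physical register")


def rename_write(mapping, free_table, values, logical, value):
    """Rename one instruction's destination and perform the write."""
    p = free_table.allocate()
    values[p] = value
    mapping[logical] = p
    return p


values = [0] * 8                  # common physical register file (assumed size)
values[0], values[1] = 10, 20
parent_map = [0, 1]               # logical 0 -> physical 0, logical 1 -> physical 1
free_table = RegisterFreeTable(8, in_use={0, 1})

child_map = list(parent_map)      # fork: only the mapping is copied

p_parent = rename_write(parent_map, free_table, values, 0, 11)
p_child = rename_write(child_map, free_table, values, 0, 12)
```

Every write performs an allocation against the shared table, which hints at the per-instruction renaming cost the text calls wasteful for an in-order machine. The sketch never releases registers; reclaiming them would need further bookkeeping that is outside what the text describes.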
SUMMARY OF THE INVENTION
Therefore, the present invention has been worked out for solving the problems set forth above. It is an object of the present invention to provide a register content inheriting system in a multi-processor which can achieve high efficiency both for in-order issuing type and out-of-order issuing type and high performance for fine threads. In order to accomplish the above-mentioned and other objects, according to one aspect of the present invent

Inventor: Torii Sunao
Examiner: Kim Kenneth S.
Law Firm: McGinn & Gibb PLLC
Corporate Assignee: NEC Corporation
