Method and apparatus for controlling an instruction pipeline...

Electrical computers and digital processing systems: processing – Instruction issuing – Simultaneous issuance of multiple instructions

Reexamination Certificate


Details

C712S219000, C711S203000


active

06282635

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to computing systems and, more particularly, to a method and apparatus for controlling multiple instruction pipelines.
Conventional sequential (non-pipelined, or flow-through) computing systems issue program instructions one at a time and wait for each instruction to complete before issuing the next. This ensures that the result value generated by each instruction is available for use by later instructions in the program. It also facilitates error recovery when an instruction fails to complete successfully and the program terminates abnormally: since memory and register values are altered predictably, in accordance with the sequence of program instructions, the problem may be corrected by restoring (backing up) the register values to the state that existed just prior to issuance of the faulty instruction, fixing the cause of the abnormal termination, and then restarting the program from the faulty instruction. Unfortunately, such computing systems are inefficient, since many clock cycles are wasted between the issuance of one instruction and the issuance of the instruction that follows it.
Many modern computing systems depart from the sequential architectural model. A pipelined architecture allows the next instruction to be issued without waiting for the previous instruction to complete, so that several instructions execute in parallel, with different stages of the required processing performed on different instructions at the same time. For example, while one instruction is being decoded, the following instruction is being fetched and the previous instruction is being executed. Even in a pipelined architecture, however, instructions still issue and complete in order, so error recovery remains straightforward.
Even more advanced machines employ multiple pipelines that operate in parallel. For example, a three-pipeline machine may fetch, decode, and execute three instructions every clock cycle. Such computing systems are very efficient. However, not all instructions take the same amount of time to complete, and some later-issued instructions may complete before instructions issued ahead of them. Thus, when a program terminates abnormally, it must be determined which instructions completed before the faulty instruction terminated, and the memory and register values must be restored accordingly. That is a very complicated task and, if not handled properly, may eliminate many of the benefits of parallel processing.
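The stage overlap described above can be sketched with a toy model (an illustrative aid, not part of the patent): in a stall-free three-stage pipeline, instruction i occupies stage s in cycle i + s, so one new instruction issues every cycle and up to three are in flight at once.

```python
# Toy model of a three-stage instruction pipeline. In cycle c, the stage
# occupied by instruction i (0-indexed) is c - i, so the fetch, decode, and
# execute of three consecutive instructions overlap in the same cycle.
STAGES = ["fetch", "decode", "execute"]

def pipeline_schedule(num_instructions):
    """Return {cycle: [(instruction, stage), ...]} for a stall-free pipeline."""
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(STAGES):
            schedule.setdefault(i + s, []).append((i, stage))
    return schedule

# In cycle 2, instruction 0 is executing, instruction 1 is being decoded,
# and instruction 2 is being fetched: three instructions in flight at once.
```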
One reason for instruction failure is the existence of logic or data errors which make it impossible for the program to proceed (e.g., an attempt to divide by zero). Another reason for instruction failure is an attempt to access data that is temporarily unavailable. This may occur if the computing system employs virtual addressing of data. As explained below, problems caused by virtual addressing are more difficult to overcome.
FIG. 1 is a block diagram of a typical computing system which employs virtual addressing. Computing system 10 includes an instruction issuing unit 14 which communicates instructions to a plurality of (e.g., eight) instruction pipelines 18A-H over a communication path 22. The data referred to by the instructions in a program are stored in a mass storage device 30, which may be, for example, a disk or tape drive. Since mass storage devices operate very slowly (e.g., a million or more clock cycles per access) compared to instruction issuing unit 14 and instruction pipelines 18A-H, data currently being worked on by the program is stored in a main memory 34, which may be a random access memory (RAM) capable of providing data to the program at a much faster rate (e.g., 30 or so clock cycles). Data stored in main memory 34 is transferred to and from mass storage device 30 over a communication path 42. The communication of data between main memory 34 and mass storage device 30 is controlled by a data transfer unit 46, which communicates with main memory 34 over a communication path 50 and with mass storage device 30 over a communication path 54.
Although main memory 34 operates much faster than mass storage device 30, it still does not operate as quickly as instruction issuing unit 14 or instruction pipelines 18A-H. Consequently, computing system 10 includes a high speed cache memory 60 for storing a subset of data from main memory 34, and a very high speed register file 64 for storing a subset of data from cache memory 60. Cache memory 60 communicates with main memory 34 over a communication path 68 and with register file 64 over a communication path 72. Register file 64 communicates with instruction pipelines 18A-H over a communication path 76. Register file 64 operates at approximately the same speed as instruction issuing unit 14 and instruction pipelines 18A-H (e.g., a fraction of a clock cycle), whereas cache memory 60 operates at a speed somewhere between register file 64 and main memory 34 (e.g., approximately two or three clock cycles).
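The hierarchy of FIG. 1 can be modeled numerically (a hypothetical sketch using the latency figures from the text, with "a fraction of a clock cycle" taken as 0.5; the simplified model charges each access only the latency of the level at which it hits):

```python
# Approximate access latencies, in clock cycles, for the hierarchy of FIG. 1.
LATENCY_CYCLES = {
    "register_file_64": 0.5,
    "cache_memory_60": 3,
    "main_memory_34": 30,
    "mass_storage_30": 1_000_000,
}

def average_access_cycles(hit_rates):
    """Expected cycles per access, checking each level in order until a hit.

    hit_rates maps level name -> probability of a hit at that level given
    that the access reached it (the last level listed should be 1.0).
    """
    expected, p_reach = 0.0, 1.0
    for level, hit in hit_rates.items():
        expected += p_reach * hit * LATENCY_CYCLES[level]
        p_reach *= 1.0 - hit
    return expected
```

Even a small miss probability at the fast levels shifts the average sharply toward the slow levels, which is why each level caches a subset of the one below it.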
FIGS. 2A-B are block diagrams illustrating the concept of virtual addressing. Assume computing system 10 has 32 bits available to address data. The addressable memory space is then 2^32 bytes, or four gigabytes (4 GB), as shown in FIG. 2A. However, the physical (real) memory available in main memory 34 typically is much less than that, e.g., 1-256 megabytes. Assuming a 16 megabyte (16 MB) real memory, as shown in FIG. 2B, only 24 address bits are needed to address the memory. Thus, multiple virtual addresses inevitably will be translated to the same real address used to address main memory 34. The same is true for cache memory 60, which typically stores only 1-36 kilobytes of data. Register file 64 typically comprises, e.g., 32 32-bit registers, and it stores data from cache memory 60 as needed. The registers are addressed by instruction pipelines 18A-H using a different addressing scheme.
To accommodate the difference between virtual addresses and real addresses and the mapping between them, the physical memory available in computing system 10 is divided into a set of uniform-size blocks, called pages. If a page contains 2^12 bytes, or 4 kilobytes (4 KB), then the full 32-bit address space contains 2^20, or 1 million (1 M), pages (4 KB x 1 M = 4 GB). Of course, if main memory 34 has 16 megabytes of memory, only 2^12, or 4 K, of the 1 million potential pages actually could be in memory at the same time (4 K x 4 KB = 16 MB).
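The arithmetic above can be checked directly (an illustrative sketch of the example's numbers, not part of the patent):

```python
# Worked arithmetic for the paging example in the text: 32-bit virtual
# addresses, 4 KB pages, and a 16 MB physical main memory.
PAGE_SIZE = 1 << 12          # 4 KB per page
ADDRESS_SPACE = 1 << 32      # 4 GB of virtual address space
MAIN_MEMORY = 16 << 20       # 16 MB of physical memory

virtual_pages = ADDRESS_SPACE // PAGE_SIZE   # pages in the 4 GB space
resident_pages = MAIN_MEMORY // PAGE_SIZE    # pages that fit in 16 MB

assert virtual_pages == 1 << 20   # 1 M potential pages
assert resident_pages == 1 << 12  # only 4 K can be resident at once
```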
Computing system 10 keeps track of which pages of data from the 4 GB address space currently reside in main memory 34 (and exactly where each page of data is physically located in main memory 34) by means of a set of page tables 100 (FIG. 3) typically stored in main memory 34. Assume computing system 10 specifies 4 KB pages and each page table 100 contains 1 K entries for providing the location of 1 K separate pages. Thus, each page table maps 4 MB of memory (1 K x 4 KB = 4 MB), and 4 page tables suffice for a machine with 16 megabytes of physical main memory (16 MB / 4 MB = 4).
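The sizing of the page tables follows the same arithmetic (again a sketch of the example's numbers, not claimed structure):

```python
# Each page table holds 1 K entries, each locating one 4 KB page, so one
# table maps 1 K x 4 KB = 4 MB; a 16 MB machine therefore needs 4 tables.
ENTRIES_PER_TABLE = 1 << 10   # 1 K entries per page table
PAGE_SIZE = 1 << 12           # 4 KB per page
MAIN_MEMORY = 16 << 20        # 16 MB of physical memory

memory_per_table = ENTRIES_PER_TABLE * PAGE_SIZE   # 4 MB mapped per table
tables_needed = MAIN_MEMORY // memory_per_table    # 4 tables for 16 MB
```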
The set of potential page tables is tracked by a page directory 104, which may contain, for example, 1 K entries (not all of which need to be used). The starting location of this directory (its origin) is stored in a page directory origin (PDO) register 108.
To locate a page in main memory 34, the input virtual address is conceptually split into a 12-bit displacement address (VA<11:0>), a 10-bit page table address (VA<21:12>) for accessing page table 100, and a 10-bit directory address (VA<31:22>) for accessing page directory 104. The address stored in PDO register 108 is added to the directory address VA<31:22> of the input virtual address in a page directory entry address accumulator 112. The address in page directory entry address accumulator 112