Processor controller for accelerating instruction issuing rate

Electrical computers and digital processing systems: processing – Processing architecture – Array processor

Reexamination Certificate

Details

C712S010000, C712S011000, C712S022000, C712S214000, C712S215000, C712S241000, C712S208000

active

06625722

ABSTRACT:

This invention broadly relates to parallel processing in the field of computer technology, and more particularly concerns systems, devices and methods for generating instructions for a parallel computer such as a Single Instruction Multiple Data (SIMD) data processor.
Parallel processing is increasingly used to meet the computing demands of the most challenging scientific and engineering problems, since the computing performance required by such problems is usually several orders of magnitude higher than that delivered by general-purpose serial computers. Growth in parallel processing has opened up a broad spectrum of application areas including image processing, artificial neural networks, weather forecasting, and nuclear reactor calculations.
Whilst different parallel computer architectures support differing modes of operation, in very general terms, the core elements of a parallel processor include a network of processing elements (PEs) each having one or more data memories and operand registers, with each of the PEs being interconnected through an interconnection network (IN).
One of the most extensively researched approaches to parallel processing concerns Array Processors, which are commonly embodied in single instruction stream operating on multiple data stream processors (known as Single Instruction Multiple Data or SIMD processors). The basic processing units of an SIMD processor are an array of processing elements (PEs), memory elements (M), a control unit (CU), and an interconnection network (IN). In operation, the CU fetches and decodes a sequence of instructions from a program, then synchronises all the PEs by broadcasting control signals to them. In turn, the PEs, operating under the control of a common instruction stream, simultaneously execute the same instructions but on the different data that each fetches from its own memory. The interconnection network facilitates data communication among processing units and memory. Thus the key to parallelism in SIMD processors is that one instruction operates on several operands simultaneously rather than on a single one.
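The broadcast model described above can be illustrated with a minimal sketch. It is a software analogy only, not the patent's hardware: the class names `ProcessingElement` and `ControlUnit` are hypothetical, each PE holds a single value as its local memory, and the loop over PEs stands in for what the hardware does simultaneously.

```python
from typing import Callable, List


class ProcessingElement:
    """One PE with its own local data memory (a single value, for brevity)."""

    def __init__(self, value: int) -> None:
        self.memory = value

    def execute(self, op: Callable[[int], int]) -> None:
        # Apply the broadcast instruction to this PE's own data.
        self.memory = op(self.memory)


class ControlUnit:
    """Steps through a program and broadcasts each instruction to every PE."""

    def __init__(self, pes: List[ProcessingElement]) -> None:
        self.pes = pes

    def run(self, program: List[Callable[[int], int]]) -> None:
        for instruction in program:
            # In hardware all PEs act at once; this loop only models that.
            for pe in self.pes:
                pe.execute(instruction)


pes = [ProcessingElement(v) for v in [1, 2, 3, 4]]
cu = ControlUnit(pes)
cu.run([lambda x: x + 1, lambda x: x * 2])  # one instruction stream
result = [pe.memory for pe in pes]          # different data per PE
print(result)  # [4, 6, 8, 10]
```

Two instructions are issued once each, yet four data items are updated, which is the sense in which one instruction operates on several operands.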
In a standard set-up, an SIMD processor is attached to a host computer, which, from the user's point of view, is a front-end system. The role of the host computer is to perform compilation, load programs, perform input/output (I/O) operations, and execute other operating system functions.
An example of an SIMD processor which is made and sold by the Applicant, the Aspex™ ASP™ (Associative String Processor) data processor, can in typical configurations operate on 1000 to 100,000 data items in parallel. The major features of current implementations of the ASP are:
From 256 processing elements on a single device 8.1 mm×9.3 mm in size up to 1152 processing elements on a single device 14.5 mm×13.5 mm in size.
DPC interface 80-82 bits wide operating at 20M-50M instructions per second (20-50 MIPS).
40-100 MHz clock speed.
An ASP that has been implemented is controlled by a 76-bit wide instruction consisting of 32-bit Control, 32-bit Data and 12-bit Activity fields. The ASP performs two (sequentially executed) operations for every instruction received. To support this, the control field is further subdivided into the sub-instruction fields A and B. Data I/O to the ASP uses high-speed channels, but it also can return a 32-bit wide value to a control unit, and has four status lines that can be monitored by a control unit.
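As a rough illustration of the 76-bit instruction word, the sketch below packs the three fields into one integer and unpacks them again. The field widths come from the text; the bit ordering (Control in the most significant bits, then Data, then Activity) and the function names are assumptions made for the example, since the text does not specify a layout. The A and B sub-instruction fields are not modelled.

```python
CONTROL_BITS = 32
DATA_BITS = 32
ACTIVITY_BITS = 12
INSTRUCTION_BITS = CONTROL_BITS + DATA_BITS + ACTIVITY_BITS  # 76


def pack_instruction(control: int, data: int, activity: int) -> int:
    """Pack the three fields into one 76-bit word (assumed field order)."""
    fields = (
        ("control", control, CONTROL_BITS),
        ("data", data, DATA_BITS),
        ("activity", activity, ACTIVITY_BITS),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Assumed layout: [control | data | activity], most significant first.
    return (control << (DATA_BITS + ACTIVITY_BITS)) | (data << ACTIVITY_BITS) | activity


def unpack_instruction(word: int) -> tuple:
    """Recover (control, data, activity) from a packed 76-bit word."""
    activity = word & ((1 << ACTIVITY_BITS) - 1)
    data = (word >> ACTIVITY_BITS) & ((1 << DATA_BITS) - 1)
    control = word >> (DATA_BITS + ACTIVITY_BITS)
    return control, data, activity


word = pack_instruction(0xDEADBEEF, 0x12345678, 0xABC)
fields_back = unpack_instruction(word)
```

The round trip returns the original three values, and the packed word never exceeds 76 bits.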
The ASP can perform operations on data in the APE in bit serial (one bit at a time) or bit parallel (many bits at a time). Operations are classed as Scalar-Vector when one operand is the same value on all APEs or Vector-Vector for all other cases. Vector-Vector operations require the control unit to supply the operand addresses in the instruction and are normally performed bit serial. Scalar-Vector operations require the control unit to supply the common, i.e. scalar, operand's value and the address of the second operand in the instruction and are performed bit serial or parallel. Both cases require that the address of the result is also included in the instruction.
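The difference between the two operation classes lies in what the control unit must place in the instruction. The sketch below models only that difference; it does not model bit-serial versus bit-parallel execution. Each APE is represented as a dictionary mapping an address to a value, and the function names and string addresses are hypothetical.

```python
import operator
from typing import Callable, Dict, List

ApeMemory = Dict[str, int]


def vector_vector(apes: List[ApeMemory], op: Callable[[int, int], int],
                  addr_a: str, addr_b: str, addr_result: str) -> None:
    """Instruction carries two operand addresses plus the result address."""
    for mem in apes:
        mem[addr_result] = op(mem[addr_a], mem[addr_b])


def scalar_vector(apes: List[ApeMemory], op: Callable[[int, int], int],
                  scalar: int, addr_b: str, addr_result: str) -> None:
    """Instruction carries the scalar value, one address, and the result address."""
    for mem in apes:
        mem[addr_result] = op(scalar, mem[addr_b])


apes = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
vector_vector(apes, operator.add, "a", "b", "sum")   # per-APE a + b
scalar_vector(apes, operator.mul, 5, "a", "scaled")  # 5 * a on every APE
sums = [mem["sum"] for mem in apes]
scaled = [mem["scaled"] for mem in apes]
```

In the Vector-Vector call both operands differ from APE to APE, whereas in the Scalar-Vector call the value 5 is the same on all of them.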
For the purposes of controlling an SIMD processor, the range of architectures can be considered to be bounded by two basic cases: standalone and co-processor. Other architectures are either variations, a blend or multiple instances of the two basic cases. A control unit common to standalone, co-processor and intermediary architectures is a Data Processor Controller (DPC). As will become apparent, a DPC executes the control statements of a program and issues instructions to the SIMD processor.
The standalone arrangement, which is shown in FIG. 1 of the accompanying drawings, consists of two blocks: the SIMD processor which manipulates data, and the DPC which issues instructions to the SIMD processor and thereby controls the operation of the SIMD processor. A characteristic of the standalone case is that data I/O is direct to the SIMD processor. Optional external commands and status checks go to and from the DPC.
The co-processor arrangement, which is shown in FIG. 2, consists of an SIMD processor coupled via a DPC to a more conventional processor embodied in a single instruction stream operating on single data stream processor (also known as a Single Instruction Single Data or SISD processor). The combination of the DPC and the SIMD processor can be regarded as a co-processor to the SISD processor.
SISD processors can range in complexity from a processor core like the ARM, through microprocessors like the Intel Pentium or the Sun SPARC, up to complete machines like an IBM/Apple PC or a Sun/DEC workstation (all trade marks acknowledged).
During the execution by an SISD processor of a given program, the organisation of the system is such that the SISD processor delegates certain tasks along with their parameters to the co-processor. The division of each such task between the DPC and the SIMD processor is the same as for the standalone case. While the co-processor is performing its assigned task, the SISD processor continues executing the program, the overall result being that the program steps are completed faster than if the SISD processor alone had been relied upon to execute the program. For example, if a program in an image processing application contains a statement which divides all the pixels in an image by the value X, the SISD processor will assign this statement and the value X to the co-processor for execution. Similarly if, say, another part of the program performs a two-dimensional convolution on the image, this task would also be assigned to the co-processor for execution.
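The delegation pattern in the pixel-division example can be sketched as follows. This is a software analogy under stated assumptions: a single worker thread stands in for the DPC plus SIMD processor, `coprocessor_divide` is a hypothetical name, and integer division is chosen arbitrarily because the text does not say how the division is performed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def coprocessor_divide(pixels: List[int], x: int) -> List[int]:
    """Stand-in for the co-processor task: divide every pixel by x."""
    return [p // x for p in pixels]  # integer division is an assumption


image = [10, 20, 30, 40]

with ThreadPoolExecutor(max_workers=1) as coprocessor:
    # The host (SISD role) hands over the statement and its parameter X...
    future = coprocessor.submit(coprocessor_divide, image, 2)
    # ...and carries on with other program steps in the meantime.
    host_work = sum(range(10))
    # The result is collected only when the host actually needs it.
    divided = future.result()
```

The host's own work and the delegated task overlap in time, which is where the speed-up described above comes from.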
Notably, the major attributes of a DPC are:
Supply instructions to the SIMD processor at a very high rate, typically 20-100M instructions per second.
Generate wide instructions, typically a couple of hundred bits.
Process status information from the data processor.
At present, known DPCs fall into one of two general categories: (i) direct microprocessor drive and (ii) custom micro-code sequencer.
Direct microprocessor drive provides a versatile and simple DPC solution employing software running on a stored-program microprocessor or digital signal processor (DSP) device to generate and assemble the data processor instructions.
FIG. 3 shows such a solution. The SIMD processor's M-bit wide instruction and N-bit wide status/result interfaces are connected via registers to a P-bit wide interface to the address/data bus or I/O channel of the microprocessor/DSP, and in general, M and/or N will be larger than P. In use, the software program builds each data processor instruction by writing it P bits at a time and once all M bits have been written, the instruction is issued. Similarly, the N-bit status/result data is read in segments.
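The segmented write scheme can be sketched as below, to show why an M-bit instruction costs several P-bit bus writes. The function names, the low-order-first segment order, and the example figures (M = 76, P = 32, 30 million bus writes per second) are assumptions chosen for illustration and are not taken from the text.

```python
from typing import List


def writes_per_instruction(m_bits: int, p_bits: int) -> int:
    """Number of P-bit writes needed to deliver one M-bit instruction."""
    return -(-m_bits // p_bits)  # ceiling division


def split_instruction(instruction: int, m_bits: int, p_bits: int) -> List[int]:
    """Split an M-bit instruction into P-bit words, low-order word first."""
    mask = (1 << p_bits) - 1
    count = writes_per_instruction(m_bits, p_bits)
    return [(instruction >> (i * p_bits)) & mask for i in range(count)]


def issue_rate(bus_writes_per_second: float, m_bits: int, p_bits: int) -> float:
    """Upper bound on instructions per second if every write is a bus cycle."""
    return bus_writes_per_second / writes_per_instruction(m_bits, p_bits)


instruction = 0x123456789ABCDEF0123  # fits in 76 bits
words = split_instruction(instruction, 76, 32)
reassembled = sum(w << (i * 32) for i, w in enumerate(words))
rate = issue_rate(30e6, 76, 32)
print(len(words), rate)  # 3 10000000.0
```

With these hypothetical figures, three writes are needed per instruction, so the issue rate is at best one third of the bus write rate, before any software overhead is counted.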
The versatility of the direct microprocessor drive approach comes from the direct generation of the data processor instructions by the microprocessor/DSP. However, its main disadvantage is the poor instruction generation speed caused by the need to write a number of P-bit words to generate each M-bit instruction, and the relatively poor writ
