Reconfigurable processor devices

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Details

active

06553395

ABSTRACT:

A conventional processor (such as, for example, the Pentium II produced by Intel Corp.; Pentium is a trademark of Intel Corp.) is a general-purpose device. It is not optimised for any specific task, but can be programmed to perform a very wide range of functions.
The consequence of the general purpose architecture of the conventional processor is that for specific tasks, the performance of the processor will be much worse than for hardware designed to perform the specific tasks. This is because the architecture of the general purpose processor does not follow the structure of the task, but instead relies on a complex ALU (arithmetic logic unit) which is very heavily used during the task and which makes very frequent calls to its necessarily large memory resources. Where such tasks are computationally intensive, this approach is particularly inappropriate.
If there is a task which will need to be performed on a regular basis, then an appropriate approach will be to provide circuitry optimised specifically for that task. A typical approach is to provide such circuitry in the form of a co-processor or ASIC (application specific integrated circuit) together with the general-purpose processor, so that the tasks for which the co-processor or ASIC is optimised can be routed to the co-processor or ASIC by the general-purpose processor.
Although an ASIC may be optimal for a specific task, as it has been built for one specific task it will generally be poor or entirely non-functional for any other computational task. An advantageous possibility exists between the two extremes: on the one hand, a fixed configuration ASIC, and on the other hand, a conventional processor (for which a configuration in silicon can only be considered to exist for a single cycle). This intermediate possibility is a reconfigurable device: these have a determined configuration but allow for reconfiguration to a different determined configuration when required. Reconfigurable devices thus offer the possibility of a computer which can alter its hardware resources to service its current computational needs by appropriate reconfiguration.
A commercially successful form of reconfigurable device is the field-programmable gate array (FPGA). These devices consist of a collection of configurable processing elements embedded in a configurable interconnect network. Configuration memory is provided to describe the interconnect configuration—often SRAM is used. These devices have a very fine-grained structure: typically each processing element of an FPGA is a configurable gate. Rather than being concentrated in a central ALU, processing is thus distributed across the device and the silicon area of the device is used more effectively. An example of a commercially available FPGA series is the Xilinx 4000 series.
Such reconfigurable devices can in principle be used for any computing application for which a processor or an ASIC is used. However, a particularly suitable use for such devices is as a coprocessor to handle tasks which are computationally intensive, but which are not so common as to merit a purpose-built ASIC. A reconfigurable coprocessor could thus be programmed at different times with different configurations, each adapted for execution of a different computationally intensive task, providing greater efficiency than a general-purpose processor alone without a huge increase in overall cost. In recent FPGA devices, scope is provided for dynamic reconfiguration, wherein partial or total reconfiguration can be provided during the execution of code so that time-multiplexing can be used to provide configurations optimised for different subtasks at different stages of execution of a piece of code.
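The time-multiplexing idea described above can be illustrated with a minimal software sketch. It is not taken from the patent: the class name `ReconfigurableCoprocessor`, the configuration names and the two example subtasks are all hypothetical, and a real device would load configuration bits into hardware rather than select a Python function.

```python
from typing import Callable, Dict, List, Optional

Task = Callable[[List[int]], List[int]]


class ReconfigurableCoprocessor:
    """Toy model (an assumption, not the patented device): one configuration
    is active at a time and can be swapped between subtasks."""

    def __init__(self, configurations: Dict[str, Task]) -> None:
        self._configurations = dict(configurations)
        self._active: Optional[str] = None
        self.reconfigurations = 0  # counts how often the "hardware" was reloaded

    def load(self, name: str) -> None:
        # Reconfigure only when a different configuration is requested.
        if name not in self._configurations:
            raise KeyError(f"unknown configuration: {name}")
        if name != self._active:
            self._active = name
            self.reconfigurations += 1

    def run(self, data: List[int]) -> List[int]:
        if self._active is None:
            raise RuntimeError("no configuration loaded")
        return self._configurations[self._active](data)


# Hypothetical subtasks, each standing in for a hardware configuration.
configs: Dict[str, Task] = {
    "scale": lambda xs: [2 * x for x in xs],
    "accumulate": lambda xs: [sum(xs[: i + 1]) for i in range(len(xs))],
}

cop = ReconfigurableCoprocessor(configs)
cop.load("scale")          # configuration for the first stage of the code
scaled = cop.run([1, 2, 3])
cop.load("accumulate")     # reconfigure for the next stage
totals = cop.run(scaled)
```

The same coprocessor object serves both stages, which mirrors the cost argument in the text: one reconfigurable resource is reused in time rather than duplicated in silicon.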
FPGA devices are not especially suitable for certain kinds of computational task. As the individual computational elements are very small, the datapaths are extremely narrow and many of them are required, so a large number of operations are required in the configuration process. Although these structures are relatively efficient for tasks which operate on small data elements and are regular from cycle to cycle, they are less satisfactory for irregular tasks with large data elements. Such tasks are also often not well handled by a general purpose processor, yet may be of considerable importance (such as in, for example, image processing).
Alternative reconfigurable architectures have been proposed. One example is the PADDI architecture developed by the University of California at Berkeley, described in D. Chen and J. Rabaey, “A Reconfigurable Multiprocessor IC for Rapid Prototyping of Real Time Data Paths”, ISSCC, February 1992 and A. Yeung and J. Rabaey, “A Data-Driven Architecture for Rapid Prototyping of High Throughput DSP Algorithms”, IEEE VLSI Signal Processing Workshop, October 1992. This architecture was directed to the prototyping of high speed real-time DSP systems, DSP algorithms providing an example of computation not well handled either by conventional processors or FPGAs. The architecture comprises a plurality of relatively simple processing execution units connected by a reconfigurable network. Each execution unit operates at 16 bit width, has register files for the input operands, and has its own instruction memory. A 53 bit instruction word is necessary to specify the operation of an instruction unit.
In PADDI, instructions are distributed both at configuration and at run time. At configuration time, the memories, which act as control stores, are loaded with a set of instructions. At run time the addresses for all of the control stores are broadcast globally, and each of these local instruction memories retrieves its own local instruction for use by the local execution unit. In operation, communication between processing elements is data driven, and the processing elements act on data according to their local instructions.
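The two-phase instruction distribution described above can be sketched roughly as follows. This is an illustrative model only: the opcode names, the `ExecutionUnit` class and the `broadcast` helper are hypothetical, and the real PADDI instruction word is 53 bits wide rather than a short string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

WIDTH_MASK = 0xFFFF  # 16-bit execution units, as stated in the text

# Hypothetical opcode table standing in for the real instruction encoding.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: (a + b) & WIDTH_MASK,
    "sub": lambda a, b: (a - b) & WIDTH_MASK,
    "and": lambda a, b: a & b,
    "pass": lambda a, b: a,
}


@dataclass
class ExecutionUnit:
    # Local instruction memory acting as a control store.
    control_store: List[str] = field(default_factory=list)

    def configure(self, instructions: List[str]) -> None:
        # Configuration time: load this unit's own set of instructions.
        self.control_store = list(instructions)

    def step(self, address: int, a: int, b: int) -> int:
        # Run time: the shared address selects a *local* instruction.
        return OPS[self.control_store[address]](a, b)


def broadcast(units: List[ExecutionUnit], address: int,
              operands: List[Tuple[int, int]]) -> List[int]:
    """Send one global address to every unit; each applies its own instruction."""
    return [u.step(address, a, b) for u, (a, b) in zip(units, operands)]


units = [ExecutionUnit(), ExecutionUnit()]
units[0].configure(["add", "sub"])
units[1].configure(["and", "pass"])

operands = [(3, 4), (0b1100, 0b1010)]
results_addr0 = broadcast(units, 0, operands)  # unit 0 adds, unit 1 ANDs
results_addr1 = broadcast(units, 1, operands)  # unit 0 subtracts, unit 1 passes
```

Only the small address is distributed at run time; the wide instruction words stay in the local stores, which is the point of splitting distribution between configuration time and run time.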
Another alternative architecture is MATRIX, developed at the Massachusetts Institute of Technology and described in Ethan Mirsky and André deHon, “MATRIX: A Reconfigurable Computing Architecture with Configurable Instruction Distribution and Deployable Resources”, FCCM '96, IEEE Symposium on FPGAs for Custom Computing Machines, Apr. 17-19, 1996, Napa, Calif., USA, and in more detail in André deHon, “Reconfigurable Architectures for General-Purpose Computing”, pages 257 to 296, Technical Report 1586, MIT Artificial Intelligence Laboratory. MATRIX is a coarse-grained structure, in which an array of identical 8-bit functional units is interconnected with a configurable network. Each functional unit contains a 256×8-bit memory, an 8-bit ALU with addressable input registers, an output register and a multiplier, and control logic. This architecture is relatively versatile, as it provides the decentralisation of processing of an FPGA while providing a broader datapath and the scope to adjust the instruction stream to what is required for a given application.
The MATRIX structure has advantageous aspects, but the coarse grain size means that it consumes more silicon than a conventional FPGA structure and is likely to be less efficient for tasks which are regular from cycle to cycle. It would therefore be desirable to develop further reconfigurable structures which combine as far as possible the advantages of both MATRIX and conventional FPGAs.
Accordingly, the invention provides a reconfigurable device comprising: a plurality of processing devices; a connection matrix providing an interconnect between the processing devices; and means to define the configuration of the connection matrix; wherein each of the processing devices comprises an arithmetic logic unit adapted to perform a function on input operands and produce an output, wherein said input operands are provided as inputs to the arithmetic logic unit from the interconnect on the same route in each cycle, and wherein means are provided to route the output of a first one of the processing devices to a second one of the processing devices to determine the function performed by the second one of the processing devices.
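The key feature of this arrangement, one processing device's output selecting the function of another, can be modelled with a small behavioural sketch. Everything specific here is an assumption made for illustration: the 4-bit width, the four function codes, and the names `ProcessingDevice` and `run_cycle` do not come from the text, and the connection matrix is reduced to a direct function call.

```python
import operator
from typing import Callable, Dict

# Hypothetical function codes; the passage does not specify an encoding.
FUNCTIONS: Dict[int, Callable[[int, int], int]] = {
    0: operator.add,
    1: operator.sub,
    2: operator.and_,
    3: operator.or_,
}


class ProcessingDevice:
    """Toy ALU. The datapath width is an assumption (4 bits chosen arbitrarily)."""

    def __init__(self, width: int = 4) -> None:
        self.mask = (1 << width) - 1

    def alu(self, function_code: int, a: int, b: int) -> int:
        # Operands a and b arrive on fixed inputs; only the function varies.
        return FUNCTIONS[function_code](a, b) & self.mask


def run_cycle(first: ProcessingDevice, second: ProcessingDevice,
              first_code: int, first_a: int, first_b: int,
              a: int, b: int) -> int:
    """The first device's output is routed to the second device as its function code."""
    code = first.alu(first_code, first_a, first_b) % len(FUNCTIONS)
    return second.alu(code, a, b)


first, second = ProcessingDevice(), ProcessingDevice()

# First device computes 0 + 1 = 1, so the second device subtracts: 9 - 3.
difference = run_cycle(first, second, 0, 0, 1, 9, 3)

# First device computes 3 & 2 = 2, so the second device ANDs: 9 & 3.
masked = run_cycle(first, second, 2, 3, 2, 9, 3)
```

In this sketch the second device's operands are the same in both cycles while its behaviour changes, which is one way to read the requirement that operands follow the same route in each cycle while the function is supplied dynamically.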
Unlike MATRIX, this
