Electrical computers and digital processing systems: processing – Processing control – Processing control for data transfer
Reexamination Certificate
2000-06-16
2002-05-28
Kim, Kenneth S. (Department: 2183)
C711S149000, C711S220000
active
06397324
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates generally to improvements in array and indirect very long instruction word (iVLIW) processing, and more particularly to an advantageous data address generation architecture for a VLIW processor with separate compute and address register files that makes possible efficient variable length, run-length, and zigzag decoding in a programmable VLIW processor.
BACKGROUND OF THE INVENTION
A typical register-based processor architecture utilizes a general purpose register file (GPRF) to contain all the arithmetic operands used in performing computations, all computed results, and the various components, such as base, index, modulo values, and the like, used in resolving effective data or instruction addresses. More complex processors, VLIW processors in particular, may contain multiple arithmetic functional units as well as separate load and store units, thus increasing the number of ports required on the GPRF to provide simultaneous access to all the necessary operands. The GPRF grows increasingly difficult and expensive to implement as the number of ports rises, so it may be advantageous to split the GPRF into two or more separate register files and designate that the separate files serve specific purposes such as a compute register file and an address register file.
A complication arises with this approach, though, for high-performance data-dependent memory addressing operations. The problem is that the data-dependent values used for certain types of addressing are produced in the compute register file, separate from the address register file and the address generation functions. For example, look-up table (LUT) operations use a data value as an offset into a table of values stored in memory to transform the data value into the looked-up value. This would seem to require another read port from the compute register file to provide an efficient table look-up operation. Since efficient handling of LUTs is of crucial importance for many applications, an efficient solution to the look-up table problem is needed in processors where the compute and address registers are in separate files. A related problem is how to efficiently accomplish sequential variable length code (VLC) decoding and other front-end sequential video compression processing on an indirect VLIW (iVLIW) processor. The present invention, when operating on an iVLIW processor, advantageously provides a solution to these and other problems.
SUMMARY OF THE INVENTION
Table look-up and store operations are used in many digital signal processor (DSP) applications. They typically require an addressing mode such that a “base” register is used to point to the beginning of a table in memory and a data element stored in a separate register provides the offset into the table. The data type to be accessed (byte, half-word, word, double-word, etc.) determines the scaling of the offset as well as the size of the transfer. A data element may then be loaded or stored to or from the table in memory. These operations may be generally represented in the following way:
R_t ← Memory[A_b + R_i]; for table load
R_s → Memory[A_b + R_i]; for table store
Where R_t is a target compute register, R_s is a source compute register, A_b is a base (address) register, and R_i is a compute register that contains a computed value used as an offset. For a load operation, Memory[address] represents the value stored in memory at the address within the brackets; for a store operation, Memory[address] represents the location in memory at which the data R_s is to be stored.
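As a minimal sketch of the two operations above, the following Python model treats memory as a byte array and scales the offset by the size of the data type being accessed. The function names, the little-endian byte order, and the table of type sizes are illustrative assumptions and are not taken from the ManArray instruction set.

```python
# Illustrative model only: the names, the SIZES table, and the
# little-endian byte order are assumptions made for this sketch.
SIZES = {"byte": 1, "halfword": 2, "word": 4, "doubleword": 8}

def table_load(memory, a_b, r_i, dtype="word"):
    """R_t <- Memory[A_b + R_i], with R_i scaled by the element size."""
    size = SIZES[dtype]
    addr = a_b + r_i * size
    return int.from_bytes(memory[addr:addr + size], "little")

def table_store(memory, a_b, r_i, r_s, dtype="word"):
    """R_s -> Memory[A_b + R_i], with R_i scaled by the element size."""
    size = SIZES[dtype]
    addr = a_b + r_i * size
    mask = (1 << (8 * size)) - 1          # keep only the bits that fit the transfer size
    memory[addr:addr + size] = (r_s & mask).to_bytes(size, "little")

memory = bytearray(64)
base = 16                                  # A_b: byte address where the table begins
for i in range(4):
    table_store(memory, base, i, i * i, "halfword")   # table[i] = i squared
looked_up = table_load(memory, base, 3, "halfword")   # reads table[3]
```

Here the same offset value selects a different byte address depending on the data type, which is one straightforward reading of how the data type determines both the offset scaling and the transfer size.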
In the ManArray iVLIW architecture, the address and compute registers, A_b and R_i respectively, are in separate register files. Further, the array processor executes in pipeline fashion having at least a fetch, decode, and execute cycle to process instructions. An important question then is how to perform an efficient table-lookup or table store operation that uses registers from both files without increasing the number of read/write ports to the compute register file? With minimal programming conventions or restrictions, it is possible to share the compute register file's store unit's read port during the decode pipeline stage to allow a data-dependent address calculation to occur. The resultant address can then be used during execute to load from or store to a table in the processor's local memory. Utilizing a ManArray compute register file that uses two smaller register files, for example two 16×32-bit files, provides a cycle-by-cycle reconfigurable register file with the capability of doing dual independent table look-ups and table stores.
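The split between the decode and execute stages described above can be sketched as a small behavioural model. This is only an illustration under stated assumptions: the class and method names are hypothetical, memory is modelled as a word-addressed list, and the register file sizes are chosen for the example rather than taken from the patent.

```python
# Illustrative two-stage model; class, method, and field names are hypothetical.
class TableLoadPipeline:
    def __init__(self, compute_regs, address_regs, memory):
        self.compute_regs = compute_regs   # compute register file (holds R_i and R_t)
        self.address_regs = address_regs   # address register file (holds A_b)
        self.memory = memory               # local memory, modelled as a list of words
        self.store_port_reads = 0          # counts uses of the store unit's read port
        self.latched = None                # (address, target register) held between stages

    def decode(self, t, b, i):
        # Decode stage: borrow the store unit's read port to fetch R_i,
        # then form the effective address A_b + R_i.
        offset = self.compute_regs[i]
        self.store_port_reads += 1
        self.latched = (self.address_regs[b] + offset, t)

    def execute(self):
        # Execute stage: use the latched address to load the table entry into R_t.
        address, t = self.latched
        self.compute_regs[t] = self.memory[address]
        self.latched = None

memory = [0] * 32
memory[8:12] = [100, 101, 102, 103]        # a four-entry table starting at word address 8
pipe = TableLoadPipeline(compute_regs=[0] * 32, address_regs=[8] + [0] * 7, memory=memory)
pipe.compute_regs[5] = 2                   # R_5 holds the data-dependent offset
pipe.decode(t=1, b=0, i=5)                 # R_1 <- Memory[A_0 + R_5]
pipe.execute()
```

In this model the only access to the compute register file made on behalf of address generation goes through the existing store-unit read port, so no extra port is added. The model does not capture the programming conventions or restrictions mentioned above, which the text does not spell out.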
The ability to efficiently process compressed video data is an important capability that future digital signal processors need to provide. For example, the Moving Picture Experts Group (MPEG) MPEG-1 and MPEG-2 standards specify video compression processes that encode a video image into a compressed serial bitstream for efficient storage and transmission. Rather than utilize special purpose hardware logic, which adds to the complexity of a design and cannot be used for any other purposes, general instruction capability is available in the ManArray processor to efficiently process the sequential codes. A number of architectural features are used including bit-operations, table look-up, table store, conditional execution, and iVLIWs. When these sequential routines are translated into assembler code in a typical general purpose processor or DSP, the routine for decoding the non-zero frequency values or AC coefficients becomes branch intensive, representing a time consuming expense for the application. Because of this time consuming sequential processing, typical prior art systems have used hardware assist approaches to implement the VLC decode function. In one aspect of the present invention, the instruction set capabilities of the ManArray processor are used, including iVLIWs, to provide efficient processing of sequential MPEG variable length codes, as discussed in greater detail below.
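To illustrate why table look-up helps with variable length codes, the following Python sketch decodes a toy prefix code with one table access per symbol instead of one branch per bit. The code table is invented for the example and is not an MPEG-1 or MPEG-2 table; the function names and the string representation of the bitstream are likewise assumptions, and this is a generic table-driven technique rather than the specific method of the patent.

```python
# Toy prefix code for illustration only; this is NOT an MPEG-1/MPEG-2 table.
CODES = {"1": "A", "01": "B", "001": "C", "000": "D"}
MAX_LEN = max(len(code) for code in CODES)

def build_lut(codes, max_len):
    """Expand each code to every max_len-bit pattern that starts with it."""
    lut = [None] * (1 << max_len)
    for code, symbol in codes.items():
        pad = max_len - len(code)
        first = int(code, 2) << pad
        for index in range(first, first + (1 << pad)):
            lut[index] = (symbol, len(code))
    return lut

def decode(bits, lut, max_len):
    """Decode a string of '0'/'1' characters with one table look-up per symbol."""
    symbols = []
    pos = 0
    while pos < len(bits):
        # Peek at the next max_len bits, zero-padded at the end of the stream.
        window = bits[pos:pos + max_len].ljust(max_len, "0")
        symbol, length = lut[int(window, 2)]
        symbols.append(symbol)
        pos += length                      # advance by the length of the matched code
    return symbols

LUT = build_lut(CODES, MAX_LEN)
decoded = decode("1" + "01" + "000" + "001" + "1", LUT, MAX_LEN)
```

Each table entry returns both the decoded symbol and the number of bits consumed, so the per-bit conditional branches of a tree-walking decoder are replaced by an indexed load, which is the kind of data-dependent memory access that the table look-up operations described earlier are meant to support.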
These and other features, aspects and advantages of the invention will be apparent to those skilled in the art from the following detailed description taken together with the accompanying drawings.
REFERENCES:
patent: 4583165 (1986-04-01), Rosenfeld
patent: 5333118 (1994-07-01), Rossmere et al.
patent: 5924117 (1999-07-01), Luick
patent: 5974528 (1999-10-01), Tsai et al.
patent: 6041387 (2000-03-01), Fleck et al.
Barry Edwin Frank
Kurak, Jr. Charles W.
Larsen Larry D.
Pechanek Gerald G.
BOPS, Inc.
Kim Kenneth S.
Priest & Goldstein PLLC