Electrical computers and digital processing systems: processing – Processing architecture – Microprocessor or multichip or multimodule processor having...
Reexamination Certificate
1998-03-18
2002-07-23
Pan, Daniel H. (Department: 2183)
C712S210000, C712S221000, C712S223000, C712S033000, C711S201000, C711S212000, C710S120000
active
06425070
BACKGROUND OF THE INVENTION
I. Field of the Invention
The present invention relates to digital signal processors. More specifically, the present invention relates to digital signal processing using highly parallel, highly pipelined, processing techniques.
II. Description of the Related Art
Digital Signal Processors (DSPs) are generally used for real time processing of digital signals. A digital signal is typically a series of numbers, or digital values, used to represent a corresponding analog signal. DSPs are used in a wide variety of applications including audio systems such as compact disk players, and wireless communication systems such as cellular telephones.
A DSP is often considered to be a specialized form of microprocessor. Like a microprocessor, a DSP is typically implemented on a silicon-based semiconductor integrated circuit. Additionally, as with microprocessors, the computing power of DSPs is enhanced by using reduced instruction set computing (RISC) techniques. RISC techniques include using a smaller number of like-sized instructions to control the operation of the DSP, where each instruction is executed in the same amount of time. The use of RISC techniques increases the rate at which instructions are performed, or the clock rate, as well as the amount of instruction pipelining within the DSP. This increases the overall computing power of the DSP.
Configuring a DSP using RISC techniques also creates undesirable characteristics. In particular, RISC-based DSPs execute a greater number of instructions to perform a given task. Executing additional instructions increases the power consumption of the DSP, even though the time to execute those instructions decreases due to the improved clocking speed of a RISC-based DSP. Additionally, using a greater number of instructions increases the size of the on-chip instruction memory within the DSP. Memory structures require substantial circuit area within a DSP (often more than 50% of the total), which increases the size and cost of the DSP. Thus, the use of RISC-based DSPs is less than ideal for low-cost, low-power applications such as digital cellular telephony or other types of battery-operated wireless communication systems.
FIG. 1 is a highly simplified block diagram of a digital signal processor configured in accordance with the prior art. Arithmetic logic unit (ALU) 16 is coupled to ALU register bank 17 and multiply accumulate (MAC) circuit 26 is coupled to MAC register bank 27. Data bus 20 couples MAC register bank 27, ALU register bank 17 and (on-chip) data memory 10. Instruction bus 22 couples (on-chip) instruction memory 12, MAC register bank 27 and ALU register bank 17. Instruction decode 18 is coupled to MAC 26 and ALU 16, and in some prior art systems instruction decode 18 is coupled directly to instruction memory 12. Data memory 10 is also coupled to data interface 11 and instruction memory 12 is also coupled to instruction interface 13. Data interface 11 and instruction interface 13 exchange data and instructions with off-chip memory 6.
During operation, the instructions in instruction memory 12 are decoded by instruction decode 18. In response, instruction decode 18 generates internal control signals that are applied to ALU 16 and MAC 26. The control signals typically cause ALU 16 to have data exchanged between ALU register bank 17 and data memory 10 or instruction memory 12. Also, the control signals cause MAC 26 to have instruction data exchanged between MAC register bank 27 and instruction memory 12 or data memory 10. Additionally, the control signals cause ALU 16 and MAC 26 to perform various operations in response to, and on, the data stored in ALU register bank 17 and MAC register bank 27 respectively.
In an exemplary operation, instruction memory 12 may contain coefficient data for use by ALU 16 and MAC 26, and data memory 10 may contain data to be processed (signal data). The coefficient data may be for implementing a frequency filter using the DSP, which is a common practice. As the filtering is performed, both the signal data from data memory 10 and the coefficient data from instruction memory 12 are read into MAC register bank 27. Additional instruction data within instruction memory 12 is also applied to instruction decode 18, either through instruction bus 22 or through a direct connection. The additional instruction data specifies the operation to be performed by MAC 26. The results generated by MAC 26 are typically read back into data memory 10.
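The filtering operation described above amounts to a repeated multiply-accumulate. The following Python sketch is illustrative only and is not taken from the patent; the names fir_filter, coefficients, and signal are hypothetical. It models each output sample as an accumulated sum of products of coefficient data and signal data, which is the kind of operation a MAC circuit performs.

```python
def fir_filter(coefficients, signal):
    """Direct-form FIR filter built from repeated multiply-accumulate steps.

    Illustrative model only: 'coefficients' stands in for the coefficient
    data and 'signal' for the signal data described in the text.
    """
    outputs = []
    for n in range(len(signal)):
        accumulator = 0  # plays the role of a MAC accumulator register
        for k, coeff in enumerate(coefficients):
            if n - k >= 0:  # samples before the start are treated as zero
                accumulator += coeff * signal[n - k]  # one MAC step
        outputs.append(accumulator)  # result is stored, as if to data memory
    return outputs


# Example: a 3-tap moving-sum filter applied to a short signal.
result = fir_filter([1, 1, 1], [1, 2, 3, 4])
```

Every pass through the inner loop needs one coefficient and one signal sample, which is why the two memories and the buses that serve them are accessed so heavily during filtering.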
Many processing inefficiencies result from this prior art processing. These inefficiencies include, e.g., bus or access contention for instruction memory 12, which must supply instruction data to both MAC register bank 27 and instruction decode 18, as well as bus or access contention for data memory 10, which must both read out signal data and write in the output data. Additionally, in many instances, additional processing on the output data must be performed by ALU 16. This further aggravates access to data memory 10, and therefore creates contention for data bus 20, because the output data must be written from MAC register bank 27 into data memory 10, and then read out to ALU register bank 17. These read and write operations are performed over data bus 20 and therefore consume additional bus cycles. Such inefficiencies reduce the processing performance of the DSP.
The present invention seeks to improve the performance and usefulness of a DSP by addressing the problems and inefficiencies listed above, as well as by providing other features and improvements described throughout the application.
SUMMARY OF THE INVENTION
The present invention is a novel and improved method and circuit for digital signal processing. One aspect of the invention calls for the use of a variable length instruction set. A portion of the variable length instructions may be stored in adjacent locations within memory space with the beginning and ending of instructions occurring across memory word boundaries. Furthermore, additional aspects of the invention are realized by having instructions contain variable numbers of instruction fragments. Each instruction fragment causes a particular operation, or operations, to be performed, so that multiple operations can be performed during each clock cycle. This reduces the total number of clock cycles necessary to perform a task.
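As a rough illustration of storing variable length instructions back to back so that they cross memory word boundaries, consider the Python sketch below. It is not the encoding used by the patent, which this section does not specify. The 16-bit word width, the 2-bit length header, the 8-bit length granularity, and the names pack, make_instruction, and fetch_all are all assumptions made for the example.

```python
WORD_BITS = 16      # assumed memory word width
HEADER_BITS = 2     # assumed length field at the start of every instruction
UNIT_BITS = 8       # assumed granularity: lengths of 8, 16, 24 or 32 bits


def make_instruction(payload):
    """Build one instruction: a length header followed by payload bits."""
    total = HEADER_BITS + len(payload)
    if total % UNIT_BITS or not UNIT_BITS <= total <= 4 * UNIT_BITS:
        raise ValueError("unsupported instruction length")
    header = format(total // UNIT_BITS - 1, "02b")
    return header + payload


def pack(instructions):
    """Store instruction bit strings back to back in fixed-width words."""
    stream = "".join(instructions)
    stream += "0" * ((-len(stream)) % WORD_BITS)  # pad the final word
    return [stream[i:i + WORD_BITS] for i in range(0, len(stream), WORD_BITS)]


def fetch_all(words, count):
    """Recover 'count' instructions by walking a bit offset through memory."""
    stream = "".join(words)
    offset, found = 0, []
    for _ in range(count):
        units = int(stream[offset:offset + HEADER_BITS], 2) + 1
        length = units * UNIT_BITS
        found.append(stream[offset:offset + length])
        offset += length  # the next instruction starts immediately after
    return found


# An 8-bit, a 24-bit and a 16-bit instruction; the 24-bit one begins in the
# middle of word 0 and ends at the end of word 1.
program = [
    make_instruction("0" * 6),
    make_instruction("1" * 22),
    make_instruction("10" * 7),
]
words = pack(program)
decoded = fetch_all(words, len(program))
```

Because no instruction is padded out to a word boundary, the three instructions occupy exactly three 16-bit words, whereas aligning each one to a word boundary would take four.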
The exemplary DSP includes a set of three data buses over which data may be exchanged with a register bank and three data memories. The use of more than two data buses, and especially three data buses, realizes another aspect of the invention, which is significantly reduced bus contention. One embodiment of the invention calls for the data buses to include one wide bus and two narrow buses. The wide bus is coupled to a wide data memory and the two narrow buses are coupled to two narrow data memories.
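The effect of additional buses on contention can be sketched with a simple counting model. The Python example below is a hypothetical illustration, not a description of the patented circuit. It assumes that each bus carries one transfer per cycle and that each memory is served by exactly one bus; the memory and bus names are invented for the example.

```python
from collections import Counter


def cycles_needed(accesses, bus_of):
    """Return the cycles needed when each bus carries one transfer per cycle.

    accesses: list of memory names touched by one instruction
    bus_of:   mapping from memory name to the bus that serves it
    """
    load = Counter(bus_of[memory] for memory in accesses)
    return max(load.values(), default=0)  # the busiest bus sets the pace


# Three operand transfers issued together (memory names are hypothetical).
accesses = ["wide", "narrow_1", "narrow_2"]

single_bus = {"wide": "bus", "narrow_1": "bus", "narrow_2": "bus"}
three_buses = {"wide": "bus_w", "narrow_1": "bus_n1", "narrow_2": "bus_n2"}

shared = cycles_needed(accesses, single_bus)      # all transfers serialize
parallel = cycles_needed(accesses, three_buses)   # one transfer per bus
```

Under these assumptions the three transfers take three cycles on a single shared bus and one cycle when each memory has its own bus.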
Another aspect of the invention is realized by the use of a register bank that has registers accessible by at least two processing units. This allows multiple operations to be performed on a particular set of data by the multiple processing units, without reading and writing the data to and from a memory. The processing units in the exemplary embodiment of the invention include an arithmetic logic unit (ALU) and a multiply-accumulate (MAC) unit. When combined with the use of the multiple bus architecture, highly parallel instructions, or both, an additional aspect of the invention is realized where highly pipelined, multi-operation processing is performed.
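The benefit of a register bank shared by two processing units can be illustrated by counting memory accesses. The Python sketch below is an assumption-laden model rather than the patented design. The class CountingMemory, the functions mac and alu_shift_right, and the register name r0 are hypothetical. It compares passing a MAC result to the ALU through memory with passing it through a register that both units can access.

```python
class CountingMemory:
    """Toy data memory that counts every read and write."""

    def __init__(self):
        self.cells = {}
        self.accesses = 0

    def write(self, address, value):
        self.accesses += 1
        self.cells[address] = value

    def read(self, address):
        self.accesses += 1
        return self.cells[address]


def mac(a, b, accumulator):
    """Multiply-accumulate step."""
    return accumulator + a * b


def alu_shift_right(value, bits):
    """Example ALU post-processing step (an arbitrary choice)."""
    return value >> bits


def via_memory(memory, a, b, bits):
    """Separate register banks: the result must pass through memory."""
    product = mac(a, b, 0)
    memory.write(0, product)       # MAC register bank -> data memory
    operand = memory.read(0)       # data memory -> ALU register bank
    return alu_shift_right(operand, bits)


def via_shared_registers(registers, a, b, bits):
    """Shared register bank: the ALU reads what the MAC just wrote."""
    registers["r0"] = mac(a, b, 0)
    return alu_shift_right(registers["r0"], bits)


memory = CountingMemory()
slow = via_memory(memory, 6, 4, 1)
fast = via_shared_registers({}, 6, 4, 1)
```

Both paths compute the same value, but in this model the first costs two memory accesses per result and the second costs none.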
Other aspects of the invention are realized by including an instruction fetch unit that receives instructions of variable length stored in an instruction memory. Still another aspect of the invention is realized by an instruction memory that is separate from the set of three data memories. An instruction decoder decodes the instructions from the instr
John Deepu
Kang Inyup
Lee Way-Shing
Motiwala Quaeed
Sih Gilbert C.
Brown Charles D.
Greenhaus Bruce W.
Pan Daniel H.
Qualcomm Inc.
Wadsworth Philip