Electrical computers and digital processing systems: processing – Processing architecture – Microprocessor or multichip or multimodule processor having...
Reexamination Certificate
1998-03-18
2002-07-23
Pan, Daniel H. (Department: 2183)
C712S210000, C712S221000, C712S223000, C712S033000, C711S201000, C711S212000, C710S120000, C710S120000
active
06425070
ABSTRACT:
BACKGROUND OF THE INVENTION
I. Field of the Invention
The present invention relates to digital signal processors. More specifically, the present invention relates to digital signal processing using highly parallel, highly pipelined, processing techniques.
II. Description of the Related Art
Digital Signal Processors (DSPs) are generally used for real time processing of digital signals. A digital signal is typically a series of numbers, or digital values, used to represent a corresponding analog signal. DSPs are used in a wide variety of applications including audio systems such as compact disk players, and wireless communication systems such as cellular telephones.
A DSP is often considered to be a specialized form of microprocessor. Like a microprocessor, a DSP is typically implemented on a silicon-based semiconductor integrated circuit. Additionally, as with microprocessors, the computing power of DSPs is enhanced by using reduced instruction set (RISC) computing techniques. RISC computing techniques include using smaller numbers of like-sized instructions to control the operation of the DSP, where each instruction is executed in the same amount of time. The use of RISC computing techniques increases the rate at which instructions are performed, or the clock rate, as well as the amount of instruction pipelining within the DSP. This increases the overall computing power of the DSP.
Configuring a DSP using RISC computing techniques also creates undesirable characteristics. In particular, RISC-based DSPs execute a greater number of instructions to perform a given task. Executing additional instructions increases the power consumption of the DSP, even though the time to execute those instructions decreases due to the improved clocking speed of a RISC-based DSP. Additionally, using a greater number of instructions increases the size of the on-chip instruction memory within the DSP. Memory structures require substantial circuit area within a DSP (often more than 50% of the total), which increases the size and cost of the DSP. Thus, the use of RISC-based DSPs is less than ideal for low-cost, low-power applications such as digital cellular telephony or other types of battery-operated wireless communication systems.
FIG. 1 is a highly simplified block diagram of a digital signal processor configured in accordance with the prior art. Arithmetic logic unit (ALU) 16 is coupled to ALU register bank 17, and multiply accumulate (MAC) circuit 26 is coupled to MAC register bank 27. Data bus 20 couples MAC register bank 27, ALU register bank 17 and (on-chip) data memory 10. Instruction bus 22 couples MAC register bank 27, (on-chip) instruction memory 12 and ALU register bank 17. Instruction decode 18 is coupled to MAC 26 and ALU 16, and in some prior art systems instruction decode 18 is coupled directly to instruction memory 12. Data memory 10 is also coupled to data interface 11, and instruction memory 12 is also coupled to instruction interface 13. Data interface 11 and instruction interface 13 exchange data and instructions with off-chip memory 6.
During operation, the instructions in instruction memory 12 are decoded by instruction decode 18. In response, instruction decode 18 generates internal control signals that are applied to ALU 16 and MAC 26. The control signals typically cause ALU 16 to have data exchanged between ALU register bank 17 and data memory 10 or instruction memory 12. Also, the control signals cause MAC 26 to have instruction data exchanged between MAC register bank 27 and instruction memory 12 or data memory 10. Additionally, the control signals cause ALU 16 and MAC 26 to perform various operations in response to, and on, the data stored in ALU register bank 17 and MAC register bank 27, respectively.
In an exemplary operation, instruction memory 12 may contain coefficient data for use by ALU 16 and MAC 26, and data memory 10 may contain data to be processed (signal data). The coefficient data may be for implementing a frequency filter using the DSP, which is a common practice. As the filtering is performed, both the signal data from data memory 10 and the coefficient data from instruction memory 12 are read into MAC register bank 27. Additional instruction data within instruction memory 12 is also applied to instruction decode 18, either through instruction bus 22 or through a direct connection. The additional instruction data specifies the operation to be performed by MAC 26. The results generated by MAC 26 are typically read back into data memory 10.
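The filtering operation described above amounts to a sequence of multiply-accumulate steps. The following is a minimal sketch in Python, assuming a direct-form FIR filter (the text says only "frequency filter"). The function name `fir_mac` and the sample values are illustrative and are not taken from the patent.

```python
def fir_mac(signal, coeffs):
    """Direct-form FIR filter built from multiply-accumulate steps.

    `signal` stands in for the signal data held in data memory and
    `coeffs` for the coefficient data held in instruction memory.
    Both names are hypothetical.
    """
    out = []
    for n in range(len(signal)):
        acc = 0  # accumulator, as in a MAC unit
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]  # one multiply-accumulate step
        out.append(acc)  # result that would be written back to data memory
    return out


result = fir_mac([1, 2, 3, 4], [1, 1])  # two-tap filter over four samples
```

Each pass through the inner loop needs one signal value and one coefficient, which is why both memories are read on every step of the prior art flow.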
Many processing inefficiencies result from this prior art processing. These include bus, or access, contention for instruction memory 12, which must supply instruction data to both MAC register bank 27 and instruction decode 18, as well as bus, or access, contention for data memory 10, which must both read out signal data and write in the output data. Additionally, in many instances, further processing of the output data must be performed by ALU 16. This further aggravates contention for data memory 10, and therefore for data bus 20, because the output data must be written from MAC register bank 27 into data memory 10 and then read out to ALU register bank 17. These read and write operations are performed over data bus 20 and therefore consume additional bus cycles. Such inefficiencies reduce the processing performance of the DSP.
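The extra bus traffic described above can be tallied with a toy model. The sketch below assumes, purely for illustration, that every transfer over data bus 20 takes one bus cycle and that the filter has a fixed number of taps. The function name and the counts are hypothetical and are not figures from the patent.

```python
def prior_art_data_bus_cycles(num_taps, alu_post_processing):
    """Count data-bus transfers for one filter output in the prior art flow.

    Assumes one bus cycle per transfer (an illustrative simplification).
    """
    cycles = num_taps  # read each signal sample from data memory
    cycles += 1        # write the MAC result back to data memory
    if alu_post_processing:
        cycles += 1    # read the result out again to the ALU register bank
    return cycles


without_alu = prior_art_data_bus_cycles(8, False)
with_alu = prior_art_data_bus_cycles(8, True)
```

Under these assumptions, the round trip through data memory costs one additional bus cycle per result whenever the ALU must post-process the MAC output, on top of the write-back that is already required.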
The present invention seeks to improve the performance and usefulness of a DSP by addressing the problems and inefficiencies listed above, as well as by providing other features and improvements described throughout the application.
SUMMARY OF THE INVENTION
The present invention is a novel and improved method and circuit for digital signal processing. One aspect of the invention calls for the use of a variable length instruction set. A portion of the variable length instructions may be stored in adjacent locations within memory space, with the beginning and ending of instructions occurring across memory word boundaries. Furthermore, additional aspects of the invention are realized by having instructions contain variable numbers of instruction fragments. Each instruction fragment causes a particular operation, or operations, to be performed, so that multiple operations are performed during each clock cycle. This reduces the total number of clock cycles necessary to perform a task.
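The following is a minimal sketch of how variable length instructions might be packed back to back and fetched across memory word boundaries. The 32-bit memory word, the 16-, 32- and 48-bit instruction lengths, and the 2-bit length header are assumptions made for illustration only, since the text above does not specify an encoding. The names `pack` and `fetch` are hypothetical.

```python
WORD_BITS = 32  # assumed memory word width
UNIT_BITS = 16  # assumed granularity of instruction lengths


def pack(instructions):
    """Pack (value, length_in_bits) pairs back to back into 32-bit words."""
    bits = 0
    total = 0
    for value, length in instructions:
        bits = (bits << length) | value
        total += length
    pad = (-total) % WORD_BITS  # zero-fill the last word if needed
    bits <<= pad
    total += pad
    return [(bits >> (total - WORD_BITS * (i + 1))) & 0xFFFFFFFF
            for i in range(total // WORD_BITS)]


def fetch(words, count):
    """Read `count` instructions, taking each length from its 2-bit header."""
    stream = 0
    for w in words:
        stream = (stream << WORD_BITS) | w
    total = len(words) * WORD_BITS
    pos = 0  # bit offset of the next instruction from the start of memory
    out = []
    for _ in range(count):
        header = (stream >> (total - pos - 2)) & 0b11  # length in 16-bit units
        if header == 0:
            raise ValueError("invalid length header")
        length = header * UNIT_BITS
        value = (stream >> (total - pos - length)) & ((1 << length) - 1)
        out.append((value, length))
        pos += length
    return out


# A 16-bit, a 32-bit and a 48-bit instruction; the top two bits of each
# value encode its length (01, 10, 11) under the assumed header scheme.
program = [(0x4001, 16), (0x80000003, 32), (0xC00000000002, 48)]
words = pack(program)
decoded = fetch(words, 3)
```

In this example the 32-bit instruction begins in the lower half of the first word and ends in the upper half of the second, and the 48-bit instruction spans the second and third words, so neither is aligned to a word boundary.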
The exemplary DSP includes a set of three data buses over which data may be exchanged with a register bank and three data memories. The use of more than two data buses, and especially three data buses, realizes another aspect of the invention, which is significantly reduced bus contention. One embodiment of the invention calls for the data buses to include one wide bus and two narrow buses. The wide bus is coupled to a wide data memory and the two narrow buses are coupled to two narrow data memories.
Another aspect of the invention is realized by the use of a register bank that has registers accessible by at least two processing units. This allows multiple operations to be performed on a particular set of data by the multiple processing units, without reading and writing the data to and from a memory. The processing units in the exemplary embodiment of the invention include an arithmetic logic unit (ALU) and a multiply-accumulate (MAC) unit. When combined with the use of the multiple bus architecture, highly parallel instructions, or both, an additional aspect of the invention is realized where highly pipelined, multi-operation processing is performed.
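The benefit of a register bank shared by two processing units can be sketched as follows. This is an illustrative model only: the class `RegisterBank`, the functions `mac` and `alu_shift_right`, and the choice of a right shift as the ALU operation are all hypothetical and are not taken from the patent.

```python
class RegisterBank:
    """Registers that both processing units can read and write."""

    def __init__(self, size):
        self.regs = [0] * size


def mac(bank, dst, a, b):
    """MAC unit: accumulate the product of two registers into dst."""
    bank.regs[dst] += bank.regs[a] * bank.regs[b]


def alu_shift_right(bank, dst, src, amount):
    """ALU: post-process a register directly, with no memory round trip."""
    bank.regs[dst] = bank.regs[src] >> amount


bank = RegisterBank(4)
bank.regs[0] = 3
bank.regs[1] = 4
mac(bank, 2, 0, 1)              # MAC writes its result to register 2
alu_shift_right(bank, 3, 2, 2)  # ALU reads register 2 from the same bank
```

Because the ALU reads the MAC result from the same register bank, the write to data memory and the subsequent read back are not needed in this model.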
Other aspects of the invention are realized by including an instruction fetch unit that receives instructions of variable length stored in an instruction memory. Still another aspect of the invention is realized by an instruction memory that is separate from the set of three data memories. An instruction decoder decodes the instructions from the instr
John Deepu
Kang Inyup
Lee Way-Shing
Motiwala Quaeed
Sih Gilbert C.
Brown Charles D.
Greenhaus Bruce W.
Pan Daniel H.
Qualcomm Inc.
Wadsworth Philip