Pulse or digital communications – Equalizers
Reexamination Certificate
2000-04-13
2004-09-07
Chin, Stephen (Department: 2634)
Pulse or digital communications
Equalizers
C375S219000, C375S232000, C708S313000
active
06788738
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to methods and apparatus for accelerating complex, processor-intensive signal-processing algorithms, in particular algorithms in which the evaluation depends upon a single final data point.
BACKGROUND
Some complex signal processing algorithms depend upon a single final data point to produce a processed result. One such algorithm is the finite-impulse-response (FIR) filter, which is commonly found among the algorithms evaluated by a digital signal processor (DSP).
FIG. 1 is a flowchart 10 of a direct form of a conventional FIR filter. A series of N input data samples is shifted into shift registers 11_1 through 11_N. Thus, register 11_1 contains a current data sample D_N and registers 11_2 through 11_N contain a set of previous data samples D_4, D_3, D_2, and D_1.
Registers 11_1 through 11_N present their corresponding data samples D_1 through D_N on like-named register output lines. Data samples D_1 through D_N are then multiplied in a set of multiply steps 12_1 through 12_N by a respective set of weighting coefficients C_1 through C_N. Finally, an adder 13 sums the resulting weighted samples to provide a filtered output sample D_F, where D_F = D_1 C_1 + D_2 C_2 + ... + D_N C_N. Output sample D_F is then loaded into an output register 14.
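The direct-form sum described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic D_F = D_1 C_1 + ... + D_N C_N; the function name `fir_direct` and the sample and coefficient values are hypothetical and do not come from the patent.

```python
def fir_direct(samples, coeffs):
    """Return D_F = D_1*C_1 + D_2*C_2 + ... + D_N*C_N for one output sample."""
    if len(samples) != len(coeffs):
        raise ValueError("need exactly one coefficient per stored sample")
    # One multiply per register (multiply steps 12_1 .. 12_N),
    # followed by a single summation (adder 13).
    return sum(d * c for d, c in zip(samples, coeffs))


# Hypothetical values: four stored samples D_1..D_4 and coefficients C_1..C_4.
samples = [1.0, 2.0, 3.0, 4.0]
coeffs = [0.5, 0.25, 0.125, 0.125]
d_f = fir_direct(samples, coeffs)  # value loaded into output register 14
```

Every output sample costs N multiplies and N-1 additions in this form, which is the workload the remainder of the background is concerned with.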
FIG. 2 depicts a typical hardware implementation 20 of the flowchart of FIG. 1, like-numbered elements being the same in both Figures. For ease of illustration, FIG. 2 illustrates a four-tap filter employing weighting coefficients C_1-C_4. The depicted example is limited to five input samples D_1-D_5, sample D_5 being the newest and sample D_1 being the eldest. A register 11, including five individual registers 11_1 through 11_5, connects to a multiplier 22 via a multiplexer 24. A register block 26 stores weighting coefficients C_1 through C_4 in a series of registers 26_1 through 26_4 and presents the coefficients to multiplier 22 via a second multiplexer 28.
As depicted below in Table 1, the example begins with the first (eldest) data sample D_1 stored in register 11_5, the second data sample D_2 stored in register 11_2, the third data sample D_3 stored in register 11_3, and the fourth and most recent data sample D_4 stored in registers 11_1 and 11_4. A new data sample D_5 is then received and latched into input register 11_1 during the first machine cycle (Cycle 1). Multiplexers 24 and 28 then provide the respective contents of registers 11_1 and 26_1 (i.e., D_5 and C_1) to multiplier 22. Multiplier 22 outputs the product D_5 C_1 to an adder 25, which stores the product D_5 C_1 in an accumulation register 29.
TABLE 1

Register   Start   Cycle 1   Cycle 2   Cycle 3   Cycle 4
11_1       D_4     D_5       D_5       D_5       D_5
11_2       D_2     D_2       D_5       D_4       D_3
11_3       D_3     D_3       D_2       D_5       D_4
11_4       D_4     D_4       D_3       D_2       D_5
11_5       D_1     D_1       D_4       D_3       D_2
Registers 11_2 to 11_5 operate as shift registers. Data sample D_5 is shifted into register 11_2 during the time that data sample D_5 is presented to multiplier 22. Thus, for the second machine cycle (Cycle 2), each data sample in shift register 11 is similarly shifted, so that data sample D_1 is replaced with data sample D_4, data sample D_4 is replaced with data sample D_3, data sample D_3 is replaced with data sample D_2, and data sample D_2 is replaced with data sample D_5 (see Table 1).
Multiplexer 24 selects the D output D_OUT of register 11 while multiplexer 28 selects coefficient C_2 following the foregoing multiply and shift sequence. Multiplier 22 thus supplies the product D_4 C_2 to adder 25, which sums the product D_4 C_2 with the product D_5 C_1 already in accumulation register 29 and stores the sum (i.e., D_4 C_2 + D_5 C_1) in accumulation register 29. As with data sample D_5, data sample D_4 is shifted into register 11_2 while data sample D_4 is presented to multiplier 22. Each remaining register 11_3-11_5 is similarly updated, so that the contents of registers 11_1-11_5 are as depicted above for cycle three of Table 1.
The foregoing multiply, accumulate, and shift process continues until each data/coefficient pair is presented to multiplier 22 and the resulting products are summed in accumulation register 29 and then stored in an output register 14. Upon completion of the filtering of data sample D_5, the contents of registers 11_1-11_5 are as depicted above for cycle four of Table 1. The filter is then prepared to receive the next data sample D_6.
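The multiply, accumulate, and shift sequence can be checked against Table 1 with a small simulation. The sketch below is an interpretation of the text rather than the patent's own implementation: it assumes that multiplexer 24 selects register 11_1 on the first cycle and the output of register 11_5 (D_OUT) afterwards, that the selected operand recirculates into register 11_2, and that no shift follows the final multiply. The function name and the integer stand-ins for D_1..D_5 and C_1..C_4 are hypothetical.

```python
def filter_one_sample(regs, coeffs, new_sample):
    """Simulate the five-register, four-tap datapath for one input sample.

    regs is [11_1, 11_2, 11_3, 11_4, 11_5]. Returns (output, trace, regs),
    where trace lists the register contents at the start of each cycle.
    """
    regs = list(regs)
    regs[0] = new_sample              # Cycle 1: latch the new sample into 11_1
    acc = 0                           # accumulation register 29
    trace = []
    for cycle, c in enumerate(coeffs):
        trace.append(list(regs))
        # Multiplexer 24: 11_1 on the first cycle, D_OUT (11_5) afterwards.
        operand = regs[0] if cycle == 0 else regs[4]
        acc += operand * c            # multiplier 22 feeding adder 25
        if cycle < len(coeffs) - 1:
            # Shift 11_2..11_5 by one place; the operand re-enters at 11_2.
            regs[1:5] = [operand] + regs[1:4]
    return acc, trace, regs


# Integers 1..5 stand in for D_1..D_5; the coefficients are hypothetical.
start = [4, 2, 3, 4, 1]               # "Start" column of Table 1
output, trace, final = filter_one_sample(start, [10, 20, 30, 40], 5)
```

Under these assumptions the four entries of `trace` match the Cycle 1 through Cycle 4 columns of Table 1, and `output` equals D_5 C_1 + D_4 C_2 + D_3 C_3 + D_2 C_4. Passing `final` back in with a sixth sample continues the sequence.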
Filter implementation 20 requires N clock cycles to filter each data sample, or one clock cycle for each multiply-accumulate operation performed by multiplier 22 and adder 25. Since many DSP-optimized microprocessors can produce the same result in N clock cycles, such an embodiment cannot be used to accelerate the microprocessor.
Some conventional systems employ multiple multiplier/adder pairs operating in parallel to reduce the requisite number of clock cycles and therefore improve speed performance. Unfortunately, such parallel systems are larger, more expensive, and require more power than their sequential counterparts. There is therefore a need for a means of reducing the time required to complete the evaluation of the FIR-filter algorithm without incurring significant increases in power usage, size, and cost.
SUMMARY
The present invention is directed to methods and apparatus for accelerating complex signal-processing tasks, such as FIR filtering. In one embodiment, an FIR-filter accelerator is connected in parallel with a data path in a conventional DSP. The accelerator calculates and maintains a number of partial results based on a selected number of prior data samples. Each time the DSP receives a new data sample for filtering, the DSP makes use of one or more partial results from the accelerator to speed the calculation of the filtered result. The accelerator then recalculates the partial results using the new data sample in preparation for a subsequent data sample.
The filter accelerator can improve the performance of the DSP even if the accelerator hardware operates at a rate slower than that of the DSP. The accelerator can therefore be produced inexpensively by exploiting proven, mass-produced, economical technologies and materials. Moreover, the accelerator can be made relatively small, as the accelerator does not require massively parallel processing means.
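The partial-result idea in the summary can be illustrated with a short Python model. This is a sketch under strong assumptions, not the patented design: it assumes a single partial result covering the N-1 most recent prior samples, so that the DSP needs only one multiply-accumulate when a new sample arrives. The summary does not say how many partial results are kept or how the work is divided. The class name `PartialResultFir` and its methods are hypothetical.

```python
from collections import deque


class PartialResultFir:
    """Hypothetical model: one partial sum kept over the prior N-1 samples."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)    # C_1..C_N; C_1 weights the newest sample
        n_prior = len(self.coeffs) - 1
        self.history = deque([0.0] * n_prior, maxlen=n_prior)  # newest first
        self.partial = 0.0            # partial result waiting for the DSP

    def _recompute_partial(self):
        # Accelerator's role in this sketch: weight the stored samples as
        # they will line up with C_2..C_N when the NEXT sample arrives.
        self.partial = sum(c * d for c, d in zip(self.coeffs[1:], self.history))

    def push(self, sample):
        # DSP's role in this sketch: a single multiply-accumulate that
        # combines the new sample with the precomputed partial result.
        out = self.coeffs[0] * sample + self.partial
        self.history.appendleft(sample)   # the oldest sample drops off
        self._recompute_partial()         # prepare for the following sample
        return out


fir = PartialResultFir([10, 20, 30, 40])  # hypothetical coefficients
outputs = [fir.push(d) for d in [1, 2, 3, 4, 5]]
```

In this model the recomputation happens between samples, off the DSP's critical path, which is consistent with the statement that the accelerator may run slower than the DSP and still shorten the per-sample latency.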
REFERENCES:
patent: 6260053 (2001-07-01), Maulik et al.
Behiel Arthur Joseph
Cartier Lois D.
Chang Edith
Chin Stephen
Xilinx , Inc.