Reduction of execution times for convolution operations

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06295545

ABSTRACT:

BACKGROUND
1. Field of the Invention
This invention relates to methods and systems that efficiently use a processor to perform convolution operations, and particularly to communication systems such as modems that perform such operations in interpreting communication signals.
2. Description of Related Art
Many digital signal processing systems execute routines that require repeated summing of products. For example, communication systems including modems often convert communication signals to sequences or digital samples and then filter and transform the sequences to extract data. Encoding data for transmission requires similar digital processing. The specific processing required depends on the particular protocols implemented in a communication signal, but such processing often includes finite impulse response (FIR) filtering, fast Fourier transforms (FFTs), and spectral analysis. The term “convolution operation” as used herein refers to FIR filters, FFTs, and similar processing operations that determine of a series of sums. Equation 1 illustrates a sum basic for a convolution operation.
Equation 1:

s

(
k
)
=

i
=
-



s
in

(
i
)
*
s
k

(
i
)
In Equation 1, s
in
is an input sequence, s
k
is a sequence that depends on index k, and s is a sequence that is the convolution of sequences s
in
and s
k
. Although Equation 1 shows the range of the summation as infinite, the actual range is limited to a range of interest or a range where both sequences s
k
(i) and s
k
(i) are non-zero. An example application is a Fourier transform where sequence s
in
is time-domain sequence to be transformed, sequence s
k
represents a function associated with frequency k, and sequence s is the Fourier transform of s
in
. Complete generation of the convolution s requires repeating the sum of Equation 1 for every value of k that is of interest.
A finite impulse response (FIR) filter is a type of convolution commonly used for filtering sequences. Equation 2 shows a summation for an M-tap FIR filter.
Equation 2:

F

(
k
)
=

i
=
0
M
-
1

C

(
i
)
*
S

(
k
-
i
)
In Equation 2, values C(0) to C(M−1) are filter coefficients of the FIR filter, values S(k−i) are from a sequence being filter, and value F(k) is the filtered value for index value k. Generally, a FIR filter operation repeats the summation of Equation 2 N times to generate as a sequence of filtered values F(k) for index k equal to 0 to N−1.
Table 1 contains a C programming language routine that generates N filtered values using an M-tap FIR filter operation according to Equation 2.
TABLE 1
for (k=0; k<N; k++)
, {
F[k] = 0;
for (i = 0; i<M; i++)
F[k] += C[i]*S[k−i]; /*sum instruction*/
}
The routine of Table 1 executes a sum instruction N*M times. In a typical processing system, repeated execution of the sum instruction requires 2*M*N memory loads for filter coefficients C[i] and input values S[k=i], M*N multiplications, M*N additions, and M*N memory stores for accumulate filtered values F[k]. The routine of Table 1 is memory I/O bound in that the number of memory accesses (3M*N) is greater than the number of arithmetic operations (2M*N).
Memory access times are critical to the performance of a memory I/O bound operations such as the FIR filter routine of Table 1. However, for most systems, memory access times are slow because a typical memory system for a processor is much slower than the processor's operating frequency. Fast memory caches improve access times, but when N or M is large, the cache may be insufficient to hold the filter coefficients and the data being filtered. Thus, cache misses result, and the filter operations require frequent access to slower memory. Processes that require fewer memory accesses and improve performance of convolution operations are sought.
SUMMARY
In accordance with the invention, efficient use of a processor's registers reduces the number of memory access required for a series of sums during a convolution operation. For example, for an FIR filter operation, a processor loads filter coefficients into separate registers in a register file and a portion of the sequence to be filtered into other registers in the register file. The filter coefficients remain intact in the register file for a series of summations that determine a series of filtered values. For each filtered value, as few as a single new value from input sequence is loaded into the register file. The new value replaces the eldest input value currently in the register file. Thus after the first summation, each summation requires only one new value loaded.
In accordance with one embodiment of the invention, a method for performing a convolution operation, includes loading first and second sequences of values into first and second sets of registers in a register file of a processor. The processor then determines a first sum by summing a set of products, wherein each product results from a multiplication of a value from the first set of registers and a value from the second set of registers. After determining the first sum, the processor loads a new value into the second set of registers to replace one of the values that was previously loaded into the second set of registers. After loading the new value, the processor sums a new set of products, wherein each product results from a multiplication of a value from the first set of registers and a value from the second set of registers. One of the products in the new set includes a product of a value from the first sequence and the new value. A series of sums can be determined in this fashion where each sum is determined from values loaded for previous sums and one or more new values loaded for the sum. When the convolution is an FIR operation, the first set of values includes filter coefficients, and the second set of values includes part of an input data sequence to be filter. Each new sum retains the filter coefficients in the register file and requires only one new value from the input data sequence.
When the register file of a processor is too small to contain all the values required for each sum, the convolution operation can be broken into smaller convolution operations. The processor performs each of the smaller convolution operations in the manner described above and generates partial sums. Combining the partial sums provides the results of the original convolution operation.


REFERENCES:
patent: 5297069 (1994-03-01), Asato et al.
patent: 5513223 (1996-04-01), Shoji
patent: 5648922 (1997-07-01), Han
patent: 5886913 (1999-03-01), Marguinaud et al.
patent: 5904731 (1999-05-01), Matsui
patent: 5966314 (1999-10-01), Lee

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Reduction of execution times for convolution operations does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Reduction of execution times for convolution operations, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Reduction of execution times for convolution operations will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2446257

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.