System and computer-implemented method for performing...

Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Reexamination Certificate

active

06505224

ABSTRACT:

FIELD OF THE INVENTION
The invention relates generally to the field of systems and computer-implemented methods for generating transforms of vectors, and more specifically to systems and computer-implemented methods for efficiently generating Walsh transforms.
BACKGROUND OF THE INVENTION
The Walsh transform is used in a number of areas such as image processing, communications and the like as a fast way to generate an approximation to the fast Fourier transform (“FFT”). Recently, the Walsh transform has also been used in cryptography and in testing the randomness of sequences of pseudo-random numbers, for which it is often necessary to generate the Walsh transform of large data sets, sometimes on the order of a billion or more data items. Accordingly, it is desirable to be able to generate a Walsh transform as efficiently as possible.
Generally, the Walsh transform of a data set f(x) containing “N” data items is defined as:
W

(
u
)
=
1
N


x
=
0
N
-
1

f

(
x
)


i
=
0
n
-
1

-
1
b
i

(
x
)

b
n
-
i
-
1

(
u
)
,
(
1
)
where b
i
(x) gives the i
th
bit of “x.” The Walsh transform can be generated for data sets for which “N” is a power of two. The usual practice is to view the data set as a vector comprising N elements, and to generate the transform using N/2 radix-two butterflies organized in Log
2
(N) stages, with each radix-2 butterfly being a pair of add- and subtract operations, as follows:
(1) temp1=f(i
1
)+f(i
2
)
(2) temp2=f(i
1
)−f(i
2
)
(3) W(i
1
)=temp1
(4) W(i
2
)=temp2
where, at any stage, f(i
1
) and f(i
2
) are i
1
-th and i
2
-th components of the input vector or output of the previous stage, and W(i
1
) and W(i
2
) are the i
1
-th and i
2
-th components of the output of the current stage. Thus, in a computer in which the processor is constructed according to the “load-store” architecture, each butterfly requires two loads from memory (retrieving f(i
1
) and f(i
2
) for use in lines (1) and (2)), two arithmetic operations (the addition and subtraction operations in lines (1) and (2)) and two memory storage operations (lines (3) and (4)), or six operations in total. Since there are N/2 butterflies in each stage, the total number of operations per stage is 3N. Further, since there are Log
2
N stages, to generate a Walsh transform for a vector of length “N” components using radix-two butterflies, the processor would need to perform 3N Log
2
N operations. On a computer system capable of performing one memory access operation concurrently with an arithmetic operation during each processing cycle, during processing of each butterfly, two memory load operations can be performed in parallel with two arithmetic operations, and therefore the total number of processing a cycles required to perform a radix-two Walsh transform is 2N Log
2
N It will be appreciated that, in a computer the processor can over-write the input vector with the output Walsh transform vector in memory, thereby reducing the amount of storage space required for the Walsh transform operation.
The number of operations required to be performed to generate a Walsh transform can be reduced significantly if higher-radix butterflies are used. If, for example, a radix-4 butterfly
(1) x
1
=f(i
1
)+f(i
2
)
(2) x
2
=f(i
1
)−f(i
2
)
(3) x
3
=f(i
3
)+f(i
4
)
(4) x
4
=f(i
3
)−f(i
4
)
(5) y
4
=x
1
+x
3
(6) y
2
=x
1
−x
3
(7) y
3
=x
2
+x
4
(8) y
4
=x
2
−x
4
(9) W(i
3
)=y
1
(10) W(i
2
)=Y
2
(11) W(i
3
)=Y
3
(12) W(i
4
)=y
4
is used, the Walsh transform would be generated using Log
4
N stages, with each stage containing N/4 butterflies. In that case, each butterfly would require eight memory accesses (that is, load and store operations, reflected in lines (1) through (4) and (9) through (12)) and eight arithmetic operations (reflected in lines (1) through (8)), requiring 4N Log4N (which corresponds to 2N Log
2
N) operations for all of the butterflies to generate the entire transform. On a computer system capable of performing one memory operation concurrently with an arithmetic operation, the total number of processing cycles required to perform the radix-four Walsh transform is 2N Log
4
N.
Similarly, if a radix-8 butterfly
(1) x
1
=f(i
1
)+f(i
2
)
(2) x
2
=f(i
1
)−f(i
2
)
(3) x
3
=f(i
3
)+f(i
4
)
(4) x
4
=f(i
3
)−f(i
4
)
(5) x
5
=f(i
5
)+f(i
6
)
(6) x
6
=f(i
5
)−f(i
6
)
(7) x
7
=f(i
7
)+f(i
8
)
(8) x
8
=f(i
7
)−f(i
8
)
(9) y
1
=x
1
+x
3
(10) y
2
=x
1
−X
3
(11) y
3
=x
5
+x
7
(12) y
4
=x
5
−x
7
(13) y
5
=x
2
+x
4
(14) y
6
=x
2
−x
4
(15) y
7
=x
6
+x
8
(16) y
8
=x
6
−x
8
(17) W(i
1
)=y
1
+y
3
(18) W(i
2
)=y
5
+y
7
(19) W(i
3
)=y
2
+y
4
(20) W(i
4
)=y
2
+y
5
(21) W(i
5
)=y
1
−y
3
(22) W(i
6
)=y
5
−y
7
(23) W(i
7
)=y
2
−y
4
(24) W(i
8
)=y
6
−y
8
is used, the number of operations is (Log
8
N)(N/8)(24 arithmetic operations+16 memory accesses), or 5N Log
8
N operations. Similarly to the case with a radix-four butterfly, as described above, on a computer system capable of performing one memory access concurrently with an arithmetic operation the total number of processing cycles required to perform the radix-eight Walsh transform is 3N Log
8
N. This corresponds to the number of processing cycles required for the radix-four Walsh transform, but in the radix-eight Walsh transform the difference in time between the time the data are loaded and the time they are used in processing is larger than in the case of the radix-four Walsh transform, and so the radix-eight Walsh transform can generally be implemented more efficiently.
Generally, use of higher-radix butterflies can further reduce the number of operations required to be performed to generate a Walsh transform. In addition, depending on the architecture and internal resources of the particular processor, such as the number of registers and the size of its cache, typically the processor will be able to reduce the number of operations for higher-radix butterflies. It will be appreciated, however, that beyond a radix, the number of results that would need to be stored internally (generally, the y
n
values in the descriptions above) in order to take advantage of the reduced number of operations would be greater than the internal resources available. When that occurs, those results would need to be stored externally of the processor, resulting in a leveling off of the advantage that might come from higher-radix butterflies.
SUMMARY OF THE INVENTION
The invention provides a new and improved system and computer-implemented method for efficiently generating Walsh transforms of input vectors.
In brief summary, the invention provides a system for generating a Walsh transform output vector from an “N”-component input vector includes a vector store, a plurality of Walsh transform kernels and a control module. The vector store is configured to store the input vector The Walsh transform kernels are configured to generate a Walsh transform of a predetermined radix, with at least two of the Walsh transform kernels generating respective Walsh transforms of different radices A and B, B<A. The control module is configured to determine a factorization N=A
a
B
b
, and, in each of “a” stages associated with the radix-A Walsh transform kernel, and “b” stages associated with the radix-B Walsh transform kernel, determine a stride value for the stage, and in each of several iterations, use the stride value to select from the vector store ones of the vector components to be processed during the iteration, use the one of the radix-A or radix-B Walsh transform kernel associated with the stage in connection with the selected vector components, and store the res

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

System and computer-implemented method for performing... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with System and computer-implemented method for performing..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and System and computer-implemented method for performing... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3048224

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.