High speed and efficient matrix multiplication hardware module

Filing date: 2007-07-19
Issue date: 2011-11-01
Examiner: Bullock, Jr., Lewis (Department: 2193)
Classification: Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Classification code: C708S520000
Type: Reexamination Certificate
Status: active
Patent number: 08051124
ABSTRACT:
A matrix multiplication module and matrix multiplication method are provided that use a variable number of multiplier-accumulator (MAC) units, based on the number of data elements of the matrices that are available or needed for processing at a particular point or stage in the computation process. As more data elements become available or are needed, more MAC units are used to perform the necessary multiplication and addition operations. To multiply an N×M matrix by an M×N matrix, the total (maximum) number of MAC units used is “2*N−1”. The number of MAC units used starts with one (1) and increases by two at each computation stage, that is, at the beginning of reading of data elements for each new row of the first matrix. The sequence of the number of MAC units is {1, 3, 5, . . . , 2*N−1} for the computation stages, each of which corresponds to reading the data elements for a new row of the left-hand matrix, also called the first matrix. For the multiplication of two 8×8 matrices, the performance is 16 floating point operations per clock cycle. For an FPGA running at 100 MHz, the performance is 1.6 Giga floating point operations per second. The performance increases with the clock frequency and with the use of larger matrices when FPGA resources permit. Very large matrices are partitioned into smaller blocks to fit in the FPGA resources. Results from the multiplication of sub-matrices are combined to form the final result for the large matrices.
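
The figures in the abstract can be checked with a small software sketch. The Python below is only an illustration under stated assumptions, not the patented hardware design: the function names `mac_schedule`, `peak_flops`, and `blocked_matmul` and the `block` parameter are hypothetical, and the blocked multiply simply shows how sub-matrix products can be accumulated into the full result without modeling the module's actual dataflow.

```python
def mac_schedule(n):
    """MAC units in use at each of the n computation stages: 1, 3, ..., 2n-1."""
    return [2 * stage - 1 for stage in range(1, n + 1)]


def peak_flops(flops_per_cycle, clock_hz):
    """Floating point operations per second for a given per-cycle rate."""
    return flops_per_cycle * clock_hz


def blocked_matmul(a, b, block):
    """Multiply a (n x m) by b (m x p), given as lists of lists, block by block.

    Each (i0, j0, k0) iteration multiplies one pair of sub-matrices and adds
    the partial product into the matching block of the result.
    'block' is a hypothetical tile size standing in for what fits on the FPGA.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += a[i][k] * b[k][j]  # one multiply-accumulate
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    print(mac_schedule(8))              # stage-by-stage MAC counts for N = 8
    print(peak_flops(16, 100_000_000))  # 16 operations per cycle at 100 MHz
    print(blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 1))
```

For N = 8 the schedule ends at 2*8−1 = 15 MAC units, and 16 operations per cycle at 100 MHz gives the 1.6 Giga floating point operations per second quoted in the abstract.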

REFERENCES:
patent: 3001710 (1961-09-01), Haynes
patent: 3055586 (1962-09-01), Davis
patent: 3157779 (1964-11-01), Cochrane
patent: 3535694 (1970-10-01), Anacker et al.
patent: 3621219 (1971-11-01), Washizuka et al.
patent: 4588255 (1986-05-01), Tur et al.
patent: 5226171 (1993-07-01), Hall et al.
patent: 5818532 (1998-10-01), Malladi et al.
patent: 5903312 (1999-05-01), Malladi et al.
patent: 5978895 (1999-11-01), Ogletree
patent: 6014144 (2000-01-01), Nelson et al.
patent: 6061749 (2000-05-01), Webb et al.
patent: 6141013 (2000-10-01), Nelson et al.
patent: 6195674 (2001-02-01), Elbourne et al.
patent: 6349379 (2002-02-01), Gibson et al.
patent: 6421695 (2002-07-01), Bae et al.
patent: 6640239 (2003-10-01), Gidwani
patent: 6681052 (2004-01-01), Luna et al.
patent: 6877043 (2005-04-01), Mallory et al.
patent: 6882634 (2005-04-01), Bagchi et al.
patent: 6888844 (2005-05-01), Mallory et al.
patent: 6891881 (2005-05-01), Trachewsky et al.
patent: 6898204 (2005-05-01), Trachewsky et al.
patent: 6912638 (2005-06-01), Hellman et al.
patent: 6954800 (2005-10-01), Mallory
patent: 6965816 (2005-11-01), Walker
patent: 6968454 (2005-11-01), Master et al.
patent: 6975655 (2005-12-01), Fischer et al.
patent: 6986021 (2006-01-01), Master et al.
patent: 6988236 (2006-01-01), Ptasinski et al.
patent: 6993101 (2006-01-01), Trachewsky et al.
patent: 7000031 (2006-02-01), Fischer et al.
patent: 7027055 (2006-04-01), Anderson et al.
patent: 7035285 (2006-04-01), Holloway et al.
patent: 7044911 (2006-05-01), Drinan et al.
patent: 7085683 (2006-08-01), Anderson et al.
patent: 7107464 (2006-09-01), Shapira et al.
patent: 7155613 (2006-12-01), Master et al.
Ju-Wook Jang et al., “Energy-and Time-Efficient Matrix Multiplication on FPGAs,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems IEEE USA, vol. 13, No. 11, Nov. 2005, pp. 1305-1319.
Campbell S.J. et al., “Resource and Delay Efficient Matrix Multiplication Using Newer FPGA Devices,” GLSVLSI, Apr. 30-May 2, 2006, Philadelphia, PA, pp. 308-311.
Dou Y. et al., “64-Bit Floating-Point FPGA Matrix Multiplication,” FPGA, Feb. 20-22, 2005, Monterey, CA, pp. 86-95.
Ling Zhuo et al., “Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems,” IEEE Transactions on Parallel and Distributed Systems, IEEE Service Center, Los Alamitos, CA, US, vol. 18, No. 4, Apr. 1, 2007, pp. 433-448.
European Search Report dated Jun. 24, 2009, cited in European Patent Application No. 08159750.2.
Masato Nagamatsu et al., “A 15-ns 32×32-b CMOS Multiplier with an Improved Parallel Structure,” IEEE Journal of Solid-State Circuits, vol. 25, No. 2, Apr. 1990.
F. Bensaali et al., “Accelerating Matrix Product on Reconfigurable Hardware for Image Processing Applications,” IEE Proc.-Circuits Devices Syst., vol. 152, No. 3, Jun. 2005.
Grazia Lotti et al., “Application of Approximating Algorithms to Boolean Matrix Multiplication,” IEEE Transactions on Computers, vol. C-29, No. 10, Oct. 1980.
Manojkumar Krishnan et al., “SRUMMA: A Matrix Multiplication Algorithm Suitable for Clusters and Scalable Share Memory Systems,” Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS'04), 2004 IEEE.
Keqin Li, “Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers,” 2000 IEEE.
Keqin Li et al., “Fast and Processor Efficient Parallel Matrix Multiplication Algorithms on a Linear Array with a Reconfigurable Pipelined Bus System,” IEEE Transactions on Parallel and Distributed Systems, vol. 9, No. 8, Aug. 1998.
Yun Yang et al., “High-Performance Systolic Arrays for Band Matrix Multiplication,” 2005 IEEE.

Affiliated with

Fitzgerald Dennis (Inventor)
Salama Assem (Inventor)
Salama Yassir (Inventor)

Also associated with

Bullock, Jr. Lewis (Examiner)
Edell Shapiro & Finnan LLC (Law Firm)
ITT Manufacturing Enterprises Inc. (Corporate Assignee)
Sandifer Matthew (Examiner)
