Memory efficient array transposition via multi pass tiling

Computer-aided design and analysis of circuits and semiconductor – Nanotechnology related integrated circuit design

Reexamination Certificate

Rate now

[ 0.00 ] – not rated yet Voters 0 Comments 0

Details Memory efficient array transposition via multi pass tiling Memory efficient array transposition via multi pass tiling

: 2008-07-08
: 2008-07-08
: Chiang, Jack (Department: 2825)
: Computer-aided design and analysis of circuits and semiconductor
: Nanotechnology related integrated circuit design

: C716S030000, C711S101000, C711S165000
: Reexamination Certificate
: active
: 07398484
: ABSTRACT:
A schedule can be generated for physically transposing an array such that when the array is transferred from a first memory type to a second memory type, the number of block transfers performed is minimized. The array can be rearranged to ensure that most or all data elements in any block read into internal memory are used before that block's internal storage is reused. The algorithm can compute an offline schedule and then execute that schedule. The method can assemble the schedule during one or more passes with an algorithm. Scheduling passes can apply a permutation to a representation of the array's structure and then tile the representation to ensure efficient use of internal memory. Tiling may alter the permutation, so the algorithm can be reinvoked and run on the tiled representation. The algorithm can be run on successively retiled representations until no additional tiling is required. Optimizations for schedule generation and/or execution include: simplification of the data structure representing the array; alteration of the order in which tiles are accessed; inversion of permutations; processing non-canonical input; and the production of stream output. The method can also be modified for specialized architectures, including: block I/O architecture; fully associative caches; set-associative caches; and multi-level memory hierarchies.

REFERENCES:
patent: 6438747 (2002-08-01), Schreiber et al.
patent: 6507947 (2003-01-01), Schreiber et al.
patent: 2005/0102572 (2005-05-01), Oberlaender
patent: 2006/0026154 (2006-02-01), Altinel et al.
patent: 2006/0031652 (2006-02-01), Richter et al.
patent: 2006/0069874 (2006-03-01), Desai
patent: 2006/0143390 (2006-06-01), Kottapalli
patent: 2006/0224834 (2006-10-01), O'Connor et al.
Wolfe “High Performance Compilers for Parallel Computing,” pp. 137-165, 307-366, 367-384 and 514-527, Addison-Wesley, Redwood City, CA, 1996.
Aggarwal and Vitter, “The input/output complexity of sorting and related problems,”Communications of the ACM31(9):1116-1127, Sep. 1998.
Chatterjee et al., “Recursive array layouts and fast parallel matrix multiplication,” InACM Symposium on Parallel Algorithms and Architectures, pp. 222-231, 1999.
Cormen “Virtual Memory for Data-Parallel Computing,” PhD Thesis, Massachusetts Institute of Technology, 1992, mIT/LCS/TR-559.
Cormen “Fast permuting on disk arrays,”Journal of Parallel and Distributed Computing17(1-2):41-57, 1993.
Chatterjee and Sen, “Cache-efficient matrix transposition.” InHPCA, pp. 195-205, 2000.
Frigo et al., “Cache-oblivious algorithms,” InProceedings of the 40thAnnual Symposium on Foundations of Computer Science, p. 285, 1999.
Fraser “Array permutation by index-digit permutation,”Journal of the ACM, 23(2):298-309, 1976.
Jouppi “Cache write policies and performance,” Technical Report 91/12, DEC WRL, 1991.
Kaushik et al., “Efficient transposition algorithms for large matrices.” InProceedings of the 1993 ACM/IEEE Conference on Supercomputing, pp. 656-665, 1993.
Lin et al. “Efficient representation scheme for multidimensional array operations,”IEEE Transactions on Computers, 51(3):327-345, 2001.
Lam et al., “The cache performand and optimization of blocked algorithms,” InProceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 63-74, 1991.
Muchnick “Advanced Complier Design and Implementation,” Morgan Kaufmann, San Francisco, 1997.
Sen et al. “Towards a theory of cache-efficient algorithms,”Journal of the ACM, 9(6):828-858, 2002.
Swarztrauber “Transposing large arrays in extended memory,” pp. 283-288, 1988.
Wolf and Lam, “A data locality optimizing algorithm,” InProceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, pp. 30-44, 1991.
Cormen et al., “Asymptotically tight bounds for performing BMMC permutations on parallel disk systems,”SIAM Journal on Computing, 28(1):105-136, 1998.

Affiliated with

Ruf Erik S.

Inventor

[ 0.00 ] – not rated yet Voters 0 Comments 0

Also associated with

Chiang Jack

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

Klarquist & Sparkman, LLP

Law Firm

[ 0.00 ] – not rated yet Voters 0 Comments 0

Memula Suresh

Examiner

[ 0.00 ] – not rated yet Voters 0 Comments 0

Microsoft Corporation

Corporate Assignee

[ 0.00 ] – not rated yet Voters 0 Comments 0

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Memory efficient array transposition via multi pass tiling does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Memory efficient array transposition via multi pass tiling, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Memory efficient array transposition via multi pass tiling will most certainly appreciate the feedback.

Rate now

Comments { 0 }

Profile ID: LFUS-PAI-O-2806207

All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.

Canada

Charities
Companies
MP Candidates
Patents
Employee Salary Disclosure

World

Places of the World
Scientific Papers

United States

Banks
Companies
Counties
Patents
Employee Salary Disclosure