Electrical computers and digital processing systems: processing – Processing architecture – Vector processor
Reexamination Certificate
2011-01-25
2011-01-25
Vy, Hung T (Department: 2163)
C707S764000
active
07877573
ABSTRACT:
One embodiment of the present invention sets forth a technique for computing a parallel prefix sum using one or more cooperative thread arrays (CTAs) within a graphics processing unit. The prefix sum input list is partitioned and distributed to each CTA. Within each CTA, the input list is further partitioned for processing by individual threads in a way that avoids access conflicts to memory. Each list partition within the CTA is assigned to one of a plurality of concurrent threads, which executes a prefix sum operation on the partition. The final values of the prefix sum operations form a list that is then subjected to a second prefix sum operation. Each element of the second prefix sum operation is added to each element of the subsequent partition, completing the prefix sum operation within the CTA. This technique may be extended to prefix sum operations that span two or more CTAs.
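The three stages described in the abstract (a per-partition prefix sum, a second prefix sum over the partitions' final values, and the addition of those results to the following partitions) can be illustrated with a small sequential sketch. This is not the patented GPU implementation: it simulates the threads one after another in Python, it does not model the memory-conflict-avoiding layout, and it assumes an inclusive prefix sum over contiguous, near-equal partitions. The function name `partitioned_prefix_sum` and the parameter `num_partitions` are hypothetical names chosen for illustration.

```python
from itertools import accumulate


def partitioned_prefix_sum(values, num_partitions):
    """Inclusive prefix sum computed partition by partition.

    Sequential simulation only: each partition stands in for the work
    that one thread of a CTA would perform concurrently.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    n = len(values)
    if n == 0:
        return []

    # Assumption: contiguous chunks of near-equal size (ceiling division).
    size = -(-n // num_partitions)
    partitions = [values[i:i + size] for i in range(0, n, size)]

    # Stage 1: every "thread" scans its own partition independently.
    local_scans = [list(accumulate(p)) for p in partitions]

    # Stage 2: the final value of each local scan forms a new list,
    # which is itself subjected to a prefix sum.
    totals = [scan[-1] for scan in local_scans]
    scanned_totals = list(accumulate(totals))

    # Stage 3: element k of the second scan is added to every element
    # of the subsequent partition (partition k + 1).
    result = list(local_scans[0])
    for k in range(1, len(local_scans)):
        offset = scanned_totals[k - 1]
        result.extend(x + offset for x in local_scans[k])
    return result


print(partitioned_prefix_sum([3, 1, 7, 0, 4, 1, 6, 3], 4))
```

With four partitions of two elements each, the local scans are [3, 4], [7, 7], [4, 5] and [6, 9], the scanned totals are [4, 11, 16, 25], and the combined result equals the ordinary running sum of the input.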
REFERENCES:
patent: 5799300 (1998-08-01), Agrawal et al.
patent: 5890151 (1999-03-01), Agrawal et al.
patent: 7496921 (2009-02-01), Mehta
patent: 7506136 (2009-03-01), Stuttard et al.
patent: 7584342 (2009-09-01), Nordquist et al.
patent: 2002/0062435 (2002-05-01), Nemirovsky et al.
patent: 2007/0260663 (2007-11-01), Frigo et al.
patent: 2008/0052689 (2008-02-01), Archambault et al.
patent: 2008/0140994 (2008-06-01), Khailany et al.
patent: 2008/0184017 (2008-07-01), Stuttard et al.
patent: 2008/0184211 (2008-07-01), Nickolls et al.
patent: WO 2008/127610 (2008-10-01), None
patent: WO 2008/127622 (2008-10-01), None
patent: WO 2008/127623 (2008-10-01), None
Eggers, et al., “Simultaneous Multithreading: A Platform for Next-Generation Processors,” IEEE Micro, vol. 17, No. 5, pp. 12-19, Sep./Oct. 1997.
Office Action. U.S. Appl. No. 11/836,027 dated Sep. 25, 2009.
Notice of Allowance U.S. Appl. No. 11/836,027, dated Jan. 11, 2010.
Nvidia Corporation
Patterson & Sheridan LLP
Vy Hung T
Profile ID: LFUS-PAI-O-2720059