Programmatic iteration scheduling for parallel processors

Data processing: software development, installation, and management – Software program development tool – Translation of code

Reexamination Certificate


Status: active

Patent number: 06438747

ABSTRACT:

TECHNICAL FIELD
The invention relates to parallel compiler technology, and more specifically to iteration scheduling, in which the iterations of a nested loop are assigned to processor elements in a processor array and scheduled for execution.
BACKGROUND
Parallel compilers are used to transform a computer program into parallel code that runs on multi-processor systems. Traditionally, software developers design the compiler to optimize code for a fixed type of hardware. A principal objective of the compiler is to organize the computations in the program so that sets of computational tasks in the program may be executed concurrently across multiple processors in the specified hardware architecture.
Parallel compiler technology extends across a broad range of parallel computer architectures. For example, the multi-processor architecture may employ shared memory in which each processor element shares the same memory space, or distributed memory in which each processor has a local memory.
One area of compiler and computer architecture research focuses on optimizing the processing of computer programs with loop nests. Many computational tasks in software applications are expressed as a nested loop: two or more loops around a block of code called the loop body. The loop body contains a series of program statements, typically including operations on arrays whose elements are indexed by functions of the loop indices. Such loop nests are often written in a high-level programming language in which the iterations are ordered sequentially. The processing of the loop nest may be optimized by converting the loop nest code into parallel processes that can execute concurrently.
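As an illustration (hypothetical, not drawn from the patent), the following Python sketch shows the kind of loop nest described above: two loops around a loop body whose array accesses are indexed by functions of the loop indices, with the iterations ordered sequentially in the source form.

```python
# Illustrative two-deep loop nest (hypothetical example, not from the patent).
# The loop body reads and writes array elements indexed by the loop indices,
# and the source form orders the iterations sequentially.
N, M = 8, 8
a = [[0.0] * M for _ in range(N)]
b = [[1.0] * M for _ in range(N)]
c = [[2.0] * M for _ in range(N)]

for i in range(N):              # outer loop
    for j in range(M):          # inner loop
        # Loop body: iteration (i, j) reads a[i][j - 1], a value produced by
        # iteration (i, j - 1), i.e. a loop-carried data flow dependence.
        prev = a[i][j - 1] if j > 0 else 0.0
        a[i][j] = b[i][j] + prev * c[i][j]
```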
One way to optimize loop nest code is to transform it into a parallel form for execution on an array of processor elements. The objective of this transformation is to assign each iteration in the loop nest to a processor element and to schedule a start time for that iteration. Assigning and scheduling iterations is a challenging task: preferably, each iteration should receive a processor and a start time such that every processor is kept busy without being overloaded.
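The sketch below illustrates what such an assignment and schedule look like, assuming a hypothetical row-cyclic mapping of iterations to processor elements and a simple linear start-time function; it illustrates the problem being solved, not the patent's scheduling method.

```python
# Hypothetical iteration-to-processor assignment and start-time schedule
# for an N x M loop nest on P processor elements (illustrative only).
N, M, P = 8, 8, 4

def processor(i, j):
    """Assign iteration (i, j) to a processor element, cyclically by row."""
    return i % P

def start_time(i, j, ii=1):
    """Start time of iteration (i, j) under a simple linear schedule that
    begins a new iteration on each processor every `ii` cycles."""
    return ((i // P) * M + j) * ii

# Map every iteration of the nest to (processor element, start time).
schedule = {(i, j): (processor(i, j), start_time(i, j))
            for i in range(N) for j in range(M)}
```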
SUMMARY OF THE INVENTION
The invention provides an efficient method for programmatically scheduling iterations in a sequential loop nest for execution on a parallel array of processor elements. This scheduling method optimizes the use of each processor element without overloading it.
One aspect of the invention is a programmatic method for determining an iteration schedule for a parallel processor array such that the schedule satisfies an initiation interval constraint. The initiation interval is a measure of throughput that represents the shortest time interval between the initiation of successive iterations of the nested loop on a processor element. The scheduling method accepts a mapping of iterations of the nested loop to processor elements in a processor array. Based on this mapping and a specified initiation interval, the method programmatically determines a definition of iteration schedules. Each schedule satisfies the constraint that no more than one iteration is started on a processor element for each initiation interval. This method is implemented so as to keep each processor element as busy as possible. The schedules start a new iteration on each processor element nearly every initiation interval.
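A minimal sketch of the constraint just described: given a candidate schedule, check that no processor element starts more than one iteration within any one initiation interval. The dictionary layout (iteration mapped to a (processor, start time) pair) is an assumption carried over from the earlier sketch, not the patent's internal representation.

```python
from collections import defaultdict

def satisfies_initiation_interval(schedule, ii):
    """Return True if the schedule (iteration -> (processor, start time))
    never starts two iterations on the same processor element within one
    initiation interval of length `ii` cycles."""
    starts = defaultdict(list)
    for proc, t in schedule.values():
        starts[proc].append(t)
    for times in starts.values():
        times.sort()
        if any(later - earlier < ii for earlier, later in zip(times, times[1:])):
            return False  # two iterations begin within one interval
    return True

# Example: two iterations on processor 0 starting 1 cycle apart violate ii=2.
print(satisfies_initiation_interval({(0, 0): (0, 0), (0, 1): (0, 1)}, ii=2))  # False
```

Under the row-cyclic linear schedule sketched earlier, consecutive start times on each processor element differ by exactly one initiation interval, so the check passes and every processor begins a new iteration each interval.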
Another aspect of the invention is a programmatic method for determining a set of iteration schedules that satisfy a specified resource constraint and data flow dependencies among operations in a nested loop. The implementation of this method uses linear programming to guide the selection of iteration schedules such that the selected schedules are likely to satisfy data flow dependencies.
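As one hedged illustration of using linear programming in this setting, the sketch below searches for a linear schedule vector that respects a set of loop-carried dependence distances. The specific formulation, the dependence-distance representation, and the use of the scipy solver are assumptions for illustration, not the patent's implementation.

```python
import numpy as np
from scipy.optimize import linprog

def find_linear_schedule(dependence_distances):
    """Find a non-negative schedule vector tau (one component per loop in the
    nest) such that tau . d >= 1 for every dependence distance vector d,
    i.e. every dependent iteration starts at least one cycle later.
    Returns None if no such schedule exists."""
    D = np.asarray(dependence_distances, dtype=float)
    num_loops = D.shape[1]
    # linprog minimizes c . tau subject to A_ub @ tau <= b_ub; the constraint
    # tau . d >= 1 is encoded as -d . tau <= -1 for each row d of D.
    result = linprog(c=np.ones(num_loops),
                     A_ub=-D, b_ub=-np.ones(D.shape[0]),
                     bounds=[(0, None)] * num_loops,
                     method="highs")
    return result.x if result.success else None

# Example: dependences with distances (1, 0) and (0, 1) in a doubly nested loop.
print(find_linear_schedule([(1, 0), (0, 1)]))  # e.g. [1. 1.]
```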
The invention provides an efficient and direct method for generating a set of schedules. This method enables the iteration scheduler to select one or more schedules that satisfy desired constraints. Since the scheduler quickly and efficiently provides iteration schedules, it can evaluate each schedule and select one or more that yield a parallel processor array with optimized performance or cost.
Further advantages and features of the invention will become apparent in the following detailed description and accompanying drawings.


REFERENCES:
patent: 5230053 (1993-07-01), Zaiki
patent: 5442790 (1995-08-01), Nosenchuck
patent: 5579494 (1996-11-01), Zaiki
patent: 5787272 (1998-07-01), Gupta et al.
patent: 5802375 (1998-09-01), Ngo et al.
patent: 5832272 (1998-11-01), Kalantery
patent: 5852734 (1998-12-01), Komatsu et al.
patent: 5890134 (1999-03-01), Fox
patent: 6016399 (2000-01-01), Chang
patent: 6041181 (2000-03-01), Ju et al.
patent: 6058266 (2000-05-01), Megiddo et al.
patent: 6059841 (2000-05-01), Caracuzzo
patent: 6070011 (2000-05-01), Liu et al.
patent: 6226776 (2001-05-01), Panchul
Darte et al., “Partitioning for Array Processors,” Tech. Rep. 90-23, LIP, ENS Lyon, 1990.
A. Darte, “Regular Partitioning for Synthesizing Fixed-Size Systolic Arrays,” Integration, the VLSI Journal, vol. 12, 1991, pp. 293-304.
X. Chen and G. M. Megson, “A General Methodology of Partitioning and Mapping for Given Regular Arrays,” IEEE Transactions on Parallel and Distributed Systems, vol. 6, No. 10, 1995, pp. 1100-1107.
Rainer Leupers, Peter Marwedel, “Retargetable Generation of Code Selectors from HDL Processor Models,” IEEE, 1997, pp. 140-144.
George Hadjiyiannis, Silvina Hanono, Srinivas Devadas, “ISDL: An Instruction Set Description Language for Retargetability,” ACM, 1997, pp. 299-302.
Gyllenhaal et al., “HMDES Version 2.0 Specification,” Hewlett Packard Laboratories Technical Report Impact-96-3, (Published before Aug. 20, 1999).
Hadjiyiannis et al., “A Methodology for Accurate Performance Evaluation in Architecture Exploration.” (Published before Aug. 20, 1999).
Hoogerbrugge et al., “Automatic Synthesis of Transport Triggered Processors.” (Published before Aug. 20, 1999).
Corporaal et al., “Move: A Framework for High-Performance Processor Design,” ACM, 1991, pp. 692-701.
Corporaal et al., “Cosynthesis with the Move Framework.” (Published before Aug. 20, 1999).
Lanneer et al., “Chapter 5—Chess: Retargetable Code Generation for Embedded DSP Processors,” Code Generation for Embedded Processors, Kluwer Academic Publishers, pp. 85-102. (Published before Aug. 20, 1999).
Fauth, “Chapter 8—Beyond Tool-Specific Machine Descriptions,” Code Generation for Embedded Processors, Kluwer Academic Publishers, pp. 138-152.
Quinton et al., Systolic Algorithms & Architectures, Prentice Hall, 1991, pp. 283-339.
Park et al., “Sehwa: A Software Package for Synthesis of Pipelines from Behavioral Specifications,” IEEE, Mar. 1988, vol. 7, No. 3, pp. 356-370.
Megson et al., “A Synthesis Method of LSGP Partitioning of Given-Shape Regular Arrays,” IEEE, 1995, pp. 234-238.
Chen et al., “A General Methodology of Partitioning and Mapping for Given Regular Arrays,” IEEE Transactions on Parallel and Distributed Systems, vol. 6, No. 10, 1995, pp. 1100-1107.
Babb et al., “Parallelizing Applications into Silicon,” MIT. (Published before Aug. 20, 1999).
Rim et al., “Optimal and Heuristic Algorithms for Solving the Binding Problem,” Madison, WI, Sep. 10, 1993.
Weinhardt et al., “Memory Access Optimization and RAM Inference for Pipeline Vectorization,” Proceedings of the 9th International Workshop, FPL '99, pp. 61-70.
Shackleford et al., “Satsuki: An Integrated Processor Synthesis and Compiler Generation System,” IEICE Special Issue on Synthesis and Verification of Hardware Design, 10-96.
Devadas et al., “Algorithms for Hardware Allocation in Data Path Synthesis,” IEEE, Jul. 1989, vol. 8, No. 7, pp. 768-781.
Cloutier et al., “The Combination of Scheduling, Allocation, and Mapping in a Single Algorithm,” 27th ACM/IEEE Design Automation Conference, 1990, pp. 71-76.
Wilson et al., “An ILP Solution for Simultaneous Scheduling, Allocation, and Binding in Multiple Block Synthesis,” IEEE, 1994, pp. 581-586.
Wilson et al., “An ILP Solution for Optimum Scheduling,
