Method for determining an optimized memory organization of a...

Computer-aided design and analysis of circuits and semiconductor – Nanotechnology related integrated circuit design

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C717S151000

Reexamination Certificate

active

06449747

ABSTRACT:

FIELD OF THE INVENTION
The invention relates to methods for designing essentially digital devices.
BACKGROUND OF THE INVENTION
An essentially digital device comprises at least of a memory organization (an amount of memories with their sizes and an interconnection pattern) and registers. Such a memory organization is determined during the design process of said digital device. The operation of an essentially digital system can essentially be described as a set of data access operations or instructions on data structures or variables, being stored in said memories.
In [L.Stok, Data path synthesis, integration, the VLSI journal, Vol.18, pp.1-71, June 1994.] register allocation, starting from a fully scheduled flow graph (thus ordered data access operations or instructions are used as input), are presented. Said allocation techniques are scalar oriented. Many of these techniques construct a scalar conflict or compatibility graph and solve the problem using graph coloring or clique partitioning. This conflict graph is fully determined by the schedule which is fixed before. This means that no effort is spent to come up with an optimal conflict graph and thus the potential optimization by reconsidering the schedule is not exploited. Moreover only register allocation is addressed and not memories.
In the less explored domain of memory allocation and assignment for hardware systems, the current techniques start from a given schedule [L.Ramachandran, D.Gajski, V.Chaiyakul, An algorithm for array variable clustering, Proceedings European Design and Test Conference, pp.262-266, Paris, Mar. 1994.], [P.Lippens, J.van Meerbergen, W.Verhaegh, A.van der Werf, Allocation of multiport memories for hierarchical data streams, Proceedings IEEE International Conference on Computer-Aided Design, pp.728-735, Santa Clara, Nov. 1993.], [O.Sentieys, D. Chillet, J. P.Diguet, J.Philippe, Memory module selection for high-level synthesis, Proceedings IEEE workshop on VLSI signal processing, Monterey Calif., October 1996.] or perform first a bandwidth estimation step [F.Balasa, F.Catthoor, H.DeMan, “Dataflow-driven memory allocation for multi-dimensional processing systems,” Proceedings IEEE International Conference on Computer Aided Design}, San Jose Calif., Nov. 1994.] which is a kind of crude ordering that does not really optimize the conflict graph either. These techniques have to operate on groups of signals instead of on scalars to keep the complexity acceptable.
In the parallel compiler domain [M.Al-Mouhamed, S.Seiden, A heuristic storage for minimizing access time of arbitrary data patterns, IEEE Transactions on Parallel and Distributed Systems, Vol.8, No.4, pp.441-447, Apr. 1997.] proposes a technique to partition arrays into groups of data that have to be assigned to different memories such that they can be accessed simultaneously for an SIMD architecture. They combine the constraints of a number of given access patterns into a single linear address transformation that calculates for every data element the memory in which it should be stored to minimize the total access time. This technique allows to avoid the allocation of multi-port memories for storing data with self-conflicts, by explicitly splitting arrays into smaller arrays that can be assigned to single port memories. However said method does not exploit all optimization opportunities for instance by rescheduling data access instructions.
[S.Pinter, Register allocation with instruction scheduling: a new approach, ACM SIGPLAN Notices, Vol.28, pp.248-257, Jun. 1993.] optimizes a conflict graph in the context of scalar register allocation by removing weighted edges in a coloring problem prior to scheduling. However, the conflicts in their initial conflict graph are determined by the sequential ordering of the input code. Also this idea was not applied to groups of scalars.
The Improved Force Directed Scheduling (IFDS) [W.Verhaegh, P.Lippens, E.Aarts, J.Korst, J.van Meerbergen, A.van der Werf, Improved force-directed scheduling in high-throughput digital signal processing, IEEE Transactions on CAD and Systems, Vol.14, No.8, Aug. 1995.] shows a method wherein scheduling intervals are gradually reduced until the desired result is obtained. The cost function used to determine which scheduling interval has to be reduced at each iteration only takes the number of parallel data accesses to reduce the required memory bandwidth into account. (1)FDS does not take into account which data is being accessed. Balancing the number of simultaneous data accesses is a local optimization which can be very bad globally. In IFDS all data is treated equally, although in practice some simultaneous data accesses are more expensive in terms of memory cost than other. Also the required number of memories cannot be estimated accurately by looking locally only, as is done in IFDS, because all conflicts have to be considered for this.
SUMMARY OF THE INVENTION
One aspect of the invention includes a method of determining optimized scheduling intervals and optimized access conflicts useful for determining an optimized memory organization of an essentially digital device, the method comprising: determining an initial scheduling of the data access instructions for a plurality of disjunct blocks, wherein each of the blocks include part of the data access instructions, and wherein at least one of the blocks is executed a plurality of times that is defined by an iteration count, deriving from the initial scheduling an initial block cycle budget for each block, while the overall cycle budget for performance of the digital device is larger than a predetermined overall cycle budget, repeating the method comprising: (a) for substantially all of the blocks, performing the method comprising: temporarily reducing a block cycle budget for a selected block by a predetermined amount; determining optimized scheduling intervals of the data access instructions such that the performance of the digital device is guaranteed to be within the block cycle budgets, wherein determining the optimized scheduling intervals comprises optimizing access conflicts with respect to an evaluation criterion related to the memory cost of the digital device; computing the overall cycle budget resulting from the optimized scheduling intervals; and (b) reducing the block cycle budget for at least one selected block, the selection of the block being based at least in part upon the memory cost and an overall cycle budget reduction.
Another aspect of the invention includes a method of determining a cost-cycle budget curve for an essentially digital device that is represented by a digital representation describing the functionality of the digital device, the representation comprising data access instructions on basic groups of scalar signals, the data access instructions having scheduling intervals, the representation comprising a plurality of disjunct blocks, each block including part of the data access instructions, at least one block being executed at least a plurality of times that is indicated by an iteration count, the method comprising: generating a cost-cycle budget curve that compares the cost of a memory organization of the digital device versus the cycle budget, wherein the cost-cycle budget curve is incrementally generated.
Another aspect of the invention includes a method of determining an optimized memory organization of an essentially digital device represented by a representation describing the functionality of the digital device, the representation comprising data access instructions on basic groups of scalar signals, the data access instructions having scheduling intervals, the representation comprising a plurality of disjunct blocks, each block including part of the data access instructions, at least one block being executed at least a plurality of times, the optimized memory organization being such that execution of the functionality with the digital device is guaranteed to be within a predetermined overall cycle bu

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for determining an optimized memory organization of a... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for determining an optimized memory organization of a..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for determining an optimized memory organization of a... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2896141

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.