Method for deadlock-free configuration of dataflow...

Computer-aided design and analysis of circuits and semiconductor – Nanotechnology related integrated circuit design

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C716S030000, C710S010000, C710S053000, C710S054000, C710S055000, C710S056000, C710S057000, C713S001000, C713S100000

Reexamination Certificate

active

06571381

ABSTRACT:

BACKGROUND INFORMATION
1. Related Art
The related art on which this patent specification is based is described in Patent Application 196 54 846.2-53 (Method of automatic dynamic reloading of dataflow processors (DFPs) and modules having a two- or multidimensional programmable cell matrix (FPGAs, DPGAs, etc.) and in Patent Application 196 54 593. 5-53 (Run-time reconfiguration method for programmable modules). A method of configuring and reconfiguring DFPs, as well as FPGAs, DPGAs and similar modules according to the related art in which a separately configured central higher-order microcontroller-like module assumes the task of distribution of configuration data to a plurality of lower-order, mostly passive control units is described in these documents.
2. Disadvantages of the Related Art
By using a central global unit which controls the reconfiguration of parts (e.g. cells (CELs)) of one or more modules, bottlenecks occur when a plurality of different reconfiguration requests are to be handled at the same time. The advantages of the parallelism of the above-described modules are considerably limited by such a central unit, since it represents the typical bottleneck and substantially slows down the processing of data.
Furthermore, assigning the event source to the configuration to be loaded represents a problem because absolute addresses of the configuration memory are used. The reconfiguration unit must therefore contain a type of memory management system which, like in an operating system, also documents which memory area is used by which configuration.
Management of resources (e.g. CELs) represents an additional problem. It must be ensured that each CEL is assigned exactly once to each algorithm started by a reconfiguration request and, specifically, to the one that also uses the remaining surrounding CELs; otherwise deadlocks may occur.
In order to elucidate the problem of reconfiguration again, the following example is given: a matrix of CELs is reconfigured and in the RESET state. Each CEL is capable of indicating whether it is in a reconfigurable state. All CELs in the matrix are ready to be configured; thus they are in a reconfigurable state. A first configuration routine (KR
1
) is loaded; the matrix is not fully utilized. The configured CELs clear the indication that they are in a configurable state. A second configuration routine (KR
2
) independent of the first one is loaded in a group of not yet configured CELs. A third configuration cannot be loaded, since this requires CELs of the first and/or second configuration routine (KR
3
); however these are not in a reconfigurable state as they are being used.
KR
3
must be stopped until the required CELs are released, i.e., KR
1
and KR
2
are terminated.
During the execution of KR
1
and KR
2
, a load request for a fourth configuration routine (KR
4
) and a fifth configuration routine (KR
5
) arrives, which cannot all be loaded immediately, because they use CELs that are being used by KR
1
and KR
2
. KR
3
and KR
4
partially use the same CELs; KR
5
uses none of the CELs of KR
3
and KR
4
.
In order to properly reload KR
3
-KR
5
, the following requirements must be met:
1. KR
3
-KR
5
should be loaded in the order of the load requests if possible.
2. As many KRs as possible that are independent of one another, i.e., have no common CELs, should be loaded in order to achieve maximum parallelism.
3. The KRs should not block one another, i.e., KR
3
is partially loaded but cannot be loaded any further since other CELs are blocked by the partially loaded KR
4
; while KR
4
also cannot be loaded further since again required CELs are blocked a by KR
3
. This results in a typical deadlock situation.
4. The compiler which generated the KRs cannot recognize and cancel the interaction over time of the KRs so that no conflict situation arises.
The ratio between the cost of a circuit to be implemented and an optimum result should be as good as possible, i.e., the object of the invention is to provide a flexible, parallel, deadlock-free configuration that can be executed using moderate time and computing resources at a low cost. In this context the following basic problems must be solved:
if only KR
3
were to be loaded, the process would be deadlock free but not optimum since KR
5
could also be loaded.
if KR
3
is loaded but KR
4
is not, and KR
5
[is], KR
4
must be pre-marked so that it has the highest priority in a subsequent loading sequence, which means high overhead.
Deadlock-free operation is ensured by the following procedure:
Improvements Through and Object of the Invention
The basic object of the present invention is a unit, hereinafter referred to as configuration table (CT), which has a hierarchical structure and may occur several times at each level, the number of CTs from the lowest hierarchical level to the highest diminishing so that exactly one CT is present at the highest level. Each CT configures and controls independently from others and in parallel a plurality of configurable elements (CELs). CTs of the higher hierarchical levels can buffer configuration routines for lower-level CTs. If more than one lower-level CT requires the same configuration routine, it is buffered in a higher-level CT and retrieved by the individual CTs, the higher-level CT retrieving the respective configuration routine only once from a global common configuration memory whereby a cache effect is achieved. In addition to configurable modules, the present invention can be used as a cache procedure for instruction and data cache in microprocessors, DFP or the like having a plurality of arithmetic units. Some of the units to be described below may be omitted depending on the application (e.g., FILMO) however, basically nothing is changed in the hierarchical structure. Therefore this application is considered a subset and is not described in detail. One considerable advantage of the method described over conventional cache procedures is that data and/or codes are cached selectively, i.e., using methods adapted accurately to the algorithm.
The present invention also allows large cell structures to be reconfigured in a completely deadlock-free manner.
DESCRIPTION OF THE INVENTION
Instead of integrating, as previously, a central and global unit in one module, with this unit processing all the configuration requests, there is a plurality of hierarchically (tree structure) arranged active units which can assume this task.
A request from the lowest level (the leaves in the hierarchy) is forwarded to the next higher level only if the request could not be processed. These steps are repeated for all the levels present until the highest level is reached.
The highest level is connected to an internal or external higher-level configuration memory which contains all the configuration data required by this program run.
Due to the tree structure of the configuration units a kind of caching of the configuration data is achieved. Accesses to configurations mainly occur locally. In the most unfavorable case, a configuration must be loaded from the higher-level configuration memory if the corresponding data is not available in any of the hierarchically arranged CTs.
Deadlocks are avoided in that a fixed time sequence of the configurations to be loaded is introduced and the configurations are combined to form a list. The status information of the CELs is saved prior to loading and thus remains unchanged during the processing of the entire list of configurations.
Basic Principle of the CT
A configuration table (CT) is an active unit which responds to sync signals, known as triggers. The triggers are generated by a two- or multi-dimensional matrix of electronic components usually for arithmetic or logical units, address generators, arithmetic units, and the like, hereinafter referred to as configurable elements (CEL). The trigger that occurs triggers a certain action within the CT. The task of the CT is to assume the control of a plurality of CELs and to determine their arithmetic and/or logical operations. In particular, CELs must be configured an

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method for deadlock-free configuration of dataflow... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method for deadlock-free configuration of dataflow..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for deadlock-free configuration of dataflow... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3064612

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.