Computer-aided design and analysis of circuits and semiconductor – Nanotechnology related integrated circuit design
Reexamination Certificate
2000-05-17
2003-02-04
Smith, Matthew (Department: 2825)
Computer-aided design and analysis of circuits and semiconductor
Nanotechnology related integrated circuit design
36, 36, 36
Reexamination Certificate
active
06516453
ABSTRACT:
BACKGROUND OF THE PRESENT INVENTION
1. Technical Field
The present invention relates to electronic design automation and more particularly to timing analysis during automatic scheduling of operations in the high-level synthesis of digital systems.
2. Description of the Prior Art
High-level synthesis (HLS) automates certain subtasks of a digital system design in an electronic design automation (EDA) system. A system architect begins by designing and validating an overall algorithm to be implemented, e.g., using C, C++, a specialized language, or a capture system. The resulting architectural specification is partitioned into boards, chips, and blocks. Each block is a single process having its own control flow. There are usually tens to hundreds of such blocks in a modem large-scale chip design. Typical blocks represent whole filters, queues, pipeline stages, etc. Once a chip has been partitioned into its constituent blocks, any needed communication protocols have to be constructed. Such protocols depend on cycle-by-cycle communication between blocks.
So-called “scheduling” and “allocation” are applied one block at a time. Scheduling assigns operations such as additions and multiplications to states of a finite-state machine (FSM). Such FSM describes a control flow in an algorithm performed by a block being synthesized. Some operations are locked into particular states, and represent communication with other blocks. These input/output operations cannot be moved, or rescheduled, from one state to another, because to do so would probably upset the block-to-block communication protocol.
However, some other operations can be moved from one state to another. Moving operations from states that have many operations to states that have few allows hardware resources to be more evenly shared among operations. Timing problems can sometimes be resolved by moving certain operations from states in which operation delays cause timing problems into states in which such problems don't exist.
Allocation maps the operations of a scheduled FSM to particular hardware resources. For example, three addition operations can be scheduled to only require a single adder. An appropriate adder is constructed, and the operations are assigned to the adder. But complications can arise when more than one hardware resource of a given bit-width and function is needed. And so which resource to use for each operation must be decided. Considerations include multiplexing cost, the creation of false timing paths, register assignment, and even using a large resources for small operations. Hardware resources can be used for multiple functions. Calculating a minimum set of resources for an entire process is difficult but rewarding. Sometimes alternative implementations will be possible. It is often possible to choose implementations that meet the overall timing constraints and minimize the gate count. Resource allocation also includes mapping resources (abstract functions) to gate level implementations.
Allocation includes calculating a register set and assigning data to registers for use in later states. For example, temporary variables are used to store intermediate results in a larger calculation. But the contents of such temporary variables could share a common register in different states. The contents are only needed in one state each. So it is possible to save on hardware by assigning the data that needs to be stored to such storage elements. But register and storage allocations can be complicated if data values can form mutually exclusive sets or can share storage. Data values often drive functional resources, and in turn are often produced by functional resources. A good assignment of data to storage will result in reduced multiplexing costs and delays. The allocation is also made more complex if any register and functional hardware interact.
Technology-independent, or Boolean, optimization follows scheduling and allocation. The circuit design comprises generic AND and OR gates connected in a netlist. Technology-independent optimization minimizes the number of literals in the netlist. An abstraction of Boolean gates lends itself to a highly mathematical treatment based on Boolean arithmetic. For example, the Boolean identity AB+AC=A(B+C) can be used to reduce the corresponding gate network.
Technology mapping follows Boolean optimization, the abstract Boolean gates of the circuit are mapped to standard cells from a technology library. Standard library cells include simple AND, OR, or NOT functions, and much more complex functions. For example, full adders, and-or-invert gates, and multiplexers. Technology-library gates are available in a variety of drive strengths, delays, input loadings, etc. Technology mapping is made more complex by the fact that there are many ways to map an individual Boolean gate, and each way having its own unique advantages.
Technology mapping can sometimes be avoided by constructing custom gate layouts for the gates of a circuit, instead of selecting cells from a library of preconstructed and precharacterized cells. But this method is not commonly associated with automatic synthesis.
The layout tasks of cell placement and net routing follow technology mapping, the physical position of each cell on the chip is established (placement), and the nets necessary to interconnect the cells are laid out (routing). In application service provider
104
, the design intellectual property is downloaded to the user for placing and routing.
The usual input for an HLS system is a VHDL or Verilog process. Single conceptual units are represented with single threads of control, well-defined input and output operations, and a well-defined sequences of operations that define behavior. The output of the typical HLS system includes three interlinked parts and its input behavior. First, a finite-statemachine (FSM) comprising a finite set of states “S”, an alphabet of input symbols “I”, an alphabet of output symbols “O”, and a transition function “F” mapping (S×I)->(S×0). This is the so-called Mealy representation of an FSM. Another common representation, the Moore representation, is logically equivalent. Second, a resource graph, comprising a collection of hardware resources, which are abstract representations of registers and combinational logic elements, and interconnections between the resources. Third, a mapping of the process's operations and data values to the states and alphabets of the FSM and to the resources. This mapping can be thought of either as two mappings. E.g., operations and data to the FSM, and operations and data values to resources. Or as a single ternary mapping whose tuples are of the form operation, symbol, resource, or value, state, resource. Each three-tuple of the ternary mapping can be thought of as describing a linkage between an operation or data value, a state or transition of the FSM, and a register or combinational resource. In other words, what, when, and where.
Event control statements like “@ (posedge clock)” in Verilog directly map to corresponding states in the FSM. The control flow between event control statements maps directly to state transitions, or “arcs”. Statements like “c=a+b” that occur along a control flow between two event control statements can be mapped onto corresponding FSM arcs, e.g., as operation annotations. HLS systems effectively take Verilog fragments and a technology library as input, and extract a skeleton FSM of states and arcs. Then it extracts the operations and links them to the FSM. The HLS system constructs a resource set, e.g., registers to contain the values, adders, comparators, etc. The HLS system can then construct tuples that describe the what-when-where linkage between the operations, the FSM, and the resources. For example, the Verilog fragment:
input [7:0] a, b;
output [7:0] c;
always begin: process reg [7:0] x, y;
@ (posedge clock);// s
1
if (x>y) begin
c=a+b
end else begin
@ (posedge clock); // s
2
end
end
Get2Chip
Glenn Michael A.
Liu Andrea
Smith Matthew
Wong Kirk D.
LandOfFree
Method for timing analysis during automatic scheduling of... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Method for timing analysis during automatic scheduling of..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method for timing analysis during automatic scheduling of... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3174676