Timing closure
Timing closure in VLSI design and electronics engineering is the iterative design process of ensuring that all signals satisfy the timing requirements of the logic gates in a clocked synchronous circuit, such as the timing constraints and the clock period, relative to the system clock. The goal is to guarantee correct data transfer and reliable operation at the target clock frequency.
A synchronous circuit is composed of two types of primitive elements: combinational logic gates, which compute logic functions without memory, and sequential elements, which store data and are triggered by clock signals. During timing closure, the circuit is adjusted through layout improvement and netlist restructuring to reduce path delays, so that each signal arrives at its sequential element before the clock edge that captures it.
As integrated circuit designs become increasingly complicated, with billions of transistors and highly interconnected logic, ensuring that all critical timing paths satisfy their constraints has become more difficult. Failing to meet these timing requirements can cause functional faults, unpredictable behavior, or system-level failures.
For this reason, timing closure is not a simple final validation step, but rather an iterative and comprehensive optimization process. It involves continuous improvement of both the logical structure of the design and its physical implementation, such as restructuring the logic and refining placement and routing, in order to reliably meet all timing constraints across the entire chip.
Overview
In simple cases, the path delay between elements can be computed by hand, but this becomes impractical once a design has more than a dozen or so elements. For example, the delay along a path from the output of a D flip-flop, through combinational logic gates, and into the input of the next D flip-flop must fit within the period between the synchronizing clock pulses to the two flip-flops. When the delay through the elements is greater than the clock period, the circuit will not function. Therefore, modifying the circuit to remove the timing failure is an important part of the logic design engineer's task. The critical path is the longest path between two sequential elements in a design. It defines the maximum delay over all register-to-register paths, and it must not be greater than the clock cycle time.
Timing constraints
In the process of IC design, the IC layout should satisfy geometric constraints and timing constraints. Geometric constraints refer to the physical design rules imposed by the manufacturing process, such as correct cell alignment and minimum wire spacing. Timing constraints refer to the timing requirements that all signal paths should satisfy. Before the clock edge at which a flip-flop captures its input, the data signal must remain stable for a period called the setup time. After that clock edge, the signal must remain stable in the storage element for a further period called the hold time. There are two types of timing constraints:
Setup constraints:
These constraints specify how long before the flip-flop's clock edge the data input signal must be stable. Data launched at one clock edge must therefore propagate through the logic path and reach the next flip-flop at least one setup time before the next clock edge. If the path delay is too long, the setup constraint is violated and incorrect data may be latched.
Hold constraints:
These constraints specify how long after the flip-flop's clock edge the data input signal must stay stable. If the path delay is too short, new data can reach the flip-flop before the hold time has elapsed. Violating a hold constraint can result in metastability or other unwanted behavior.
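Hold time constraint:
$$t_{\text{clk-Q}} + t_{\text{comb}} \geq t_{\text{hold}}$$
Setup time constraint:
$$t_{\text{clk-Q}} + t_{\text{comb}} + t_{\text{setup}} \leq T_{\text{clk}}$$
Where:
- $t_{\text{comb}}$ = combinational logic delay
- $T_{\text{clk}}$ = clock period
- $t_{\text{setup}}$ = setup time
- $t_{\text{hold}}$ = hold time
- $t_{\text{clk-Q}}$ = clock-to-Q delay of the flip-flop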
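The two constraints can be checked numerically for a single register-to-register path. The following is a minimal sketch, assuming an ideal clock with zero skew and the usual forms of the constraints (clock-to-Q delay plus combinational delay plus setup time must not exceed the clock period; clock-to-Q delay plus combinational delay must be at least the hold time). The names `PathTiming`, `setup_slack`, and `hold_slack` and all delay values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PathTiming:
    """Delays for one register-to-register path, all in nanoseconds."""
    t_clk_q: float  # clock-to-Q delay of the launching flip-flop
    t_comb: float   # combinational logic delay between the flip-flops
    t_setup: float  # setup time of the capturing flip-flop
    t_hold: float   # hold time of the capturing flip-flop


def setup_slack(p: PathTiming, t_clk: float) -> float:
    """Margin left in the setup constraint; negative means a violation."""
    return t_clk - (p.t_clk_q + p.t_comb + p.t_setup)


def hold_slack(p: PathTiming) -> float:
    """Margin left in the hold constraint; negative means a violation."""
    return (p.t_clk_q + p.t_comb) - p.t_hold


# Hypothetical delays, assumed for this example only.
path = PathTiming(t_clk_q=0.5, t_comb=3.0, t_setup=0.25, t_hold=0.25)

print(setup_slack(path, t_clk=4.0))  # 0.25  -> setup met at a 4.0 ns period
print(hold_slack(path))              # 3.25  -> hold met
print(setup_slack(path, t_clk=3.5))  # -0.25 -> setup violated at 3.5 ns
```

In this sketch, shortening the clock period from 4.0 ns to 3.5 ns turns the setup margin negative, which is the situation timing closure has to repair, for example by reducing the combinational delay.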
Timing closure iterative process
Because FPGAs have flexible logic and routing resources, signal delays vary with how a design is implemented, and if signals arrive too late the design fails timing. Designers begin by defining accurate and realistic timing constraints, in the SDC format, that reflect the system's performance goals. These constraints may include the clock period, input/output delays, multi-cycle paths, and setup/hold requirements. It is critical to analyze whether they are achievable given the logic architecture and path delays within the design. These constraints guide all downstream timing analysis and optimization.
Problems in timing closure and static timing analysis
Three main sources of delay are considered in a clocked synchronous circuit:
- Gate delay is the time it takes for a change at a gate's input to propagate to its output. It is usually measured as the time between a transition at the input and the resulting transition at the output.
- Wire delay, also known as interconnect delay, is the time a data signal takes to propagate through the metal wires between circuit elements. It is caused mostly by the resistance and capacitance of the wire.
- Clock skew is the difference in arrival time of the same clock signal at different parts of a synchronous circuit. As the clock propagates from its source, such as an oscillator or clock generator, along many different paths through the circuit, it experiences different propagation delays, which produces the skew. The clock skew between two points i and j on a chip is $t_{\text{skew}}(i, j) = t_i - t_j$, where $t_i$ and $t_j$ are the clock arrival times at i and j, and i and j can be any two points, such as two flip-flops. Ideally, all clock signals would reach their destinations simultaneously; however, due to variations in routing, load, and physical placement, this is rarely achieved.
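Clock skew can be computed directly from clock arrival times. The sketch below assumes that the skew between two points is the difference of their clock arrival times and that global skew is the largest such difference over all points, which is a common definition. The flip-flop names and arrival times are hypothetical.

```python
from itertools import combinations
from typing import Dict


def skew(arrival: Dict[str, float], i: str, j: str) -> float:
    """Clock skew between points i and j: arrival time at i minus arrival time at j."""
    return arrival[i] - arrival[j]


def global_skew(arrival: Dict[str, float]) -> float:
    """Largest arrival-time difference between any two points."""
    return max(arrival.values()) - min(arrival.values())


# Hypothetical clock arrival times at three flip-flops, in nanoseconds.
arrival = {"ff_a": 1.0, "ff_b": 1.25, "ff_c": 0.75}

for i, j in combinations(arrival, 2):
    print(i, j, skew(arrival, i, j))

print(global_skew(arrival))  # 0.5
```

Note that the pairwise skew is signed: swapping i and j flips its sign, whereas the global skew in this sketch is always non-negative.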
After logic synthesis and constraint analysis, the design undergoes static timing analysis (STA), a fundamental, iterative step in validating whether the circuit meets its defined timing constraints on the FPGA. STA tools evaluate all timing paths in the design without requiring simulation, which makes them well suited to scalable and exhaustive analysis. In STA, the combinational circuit can be represented as a weighted directed acyclic graph, in which the weights represent delays such as gate delay and wire delay.
During this process, the STA engine computes:
- Path delays: Total delay from one register to another through combinational logic.
- Slack: The difference between required arrival time and actual arrival time.
- Critical paths: The longest paths with the smallest slack.
- Violations: Paths with negative slack, meaning they fail to meet timing.
$$\text{slack} = RAT - AAT$$
Where:
- RAT = required arrival time
- AAT = actual arrival time
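The quantities an STA engine computes can be illustrated on a small directed acyclic graph. This is a simplified sketch, not how any particular STA tool is implemented: it assumes node weights are gate delays, edge weights are wire delays, slack is RAT minus AAT, and a single required time applies at every endpoint. The function name `sta`, the netlist, and all delay values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sta(node_delay, wire_delay, required):
    """Return (AAT, RAT, slack) per node for a combinational DAG.

    node_delay: {node: gate delay}
    wire_delay: {(driver, load): wire delay}
    required:   required arrival time at every endpoint (assumed uniform)
    """
    preds = {v: [] for v in node_delay}
    succs = {v: [] for v in node_delay}
    for (u, v) in wire_delay:
        preds[v].append(u)
        succs[u].append(v)
    order = list(TopologicalSorter(preds).static_order())

    # Forward pass: latest arrival at each node's output (longest path).
    aat = {}
    for v in order:
        arrive = max((aat[u] + wire_delay[(u, v)] for u in preds[v]), default=0.0)
        aat[v] = arrive + node_delay[v]

    # Backward pass: latest time each output may switch without a violation.
    rat = {}
    for u in reversed(order):
        rat[u] = min(
            (rat[v] - node_delay[v] - wire_delay[(u, v)] for v in succs[u]),
            default=required,
        )

    slack = {v: rat[v] - aat[v] for v in order}
    return aat, rat, slack


# Hypothetical netlist: two gates in parallel feeding a third.
node_delay = {"in": 0.0, "g1": 1.0, "g2": 2.0, "g3": 1.0, "out": 0.0}
wire_delay = {
    ("in", "g1"): 0.5, ("in", "g2"): 0.5,
    ("g1", "g3"): 0.25, ("g2", "g3"): 0.25,
    ("g3", "out"): 0.5,
}

aat, rat, slack = sta(node_delay, wire_delay, required=5.0)
print(aat["out"])           # 4.25 -> longest path delay
print(slack["g1"])          # 1.75 -> g1 is off the critical path
print(min(slack.values()))  # 0.75 -> slack along in -> g2 -> g3 -> out
```

In this example the critical path runs through `g2`, since every node on it shares the smallest slack. Lowering `required` to 4.0 would make that slack negative, which is reported as a violation.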
Physical design
Once the STA reports are generated, engineers examine them, using timing optimization techniques or design automation tools, to identify the critical or failing paths that need attention. They also optimize the physical layout by adjusting placement and routing. This loop repeats until all timing constraints are met.
After logic synthesis and initial timing optimization, the design is mapped to a physical layout. The key steps of placement, clock tree synthesis, and routing alter the physical design, so timing behavior can change significantly at this stage. These steps are used to reduce path delays and improve the timing of the circuit.
1. Placement
The EDA tool assigns a physical location on the silicon die to each standard cell. It can reduce path delays by placing interconnected cells close to each other, which shortens the wires between them.
2. Clock tree synthesis (CTS)
A balanced clock distribution network is built to deliver the clock signal to all sequential elements evenly and synchronously. CTS minimizes clock skew and controls clock latency while satisfying the maximum transition and maximum capacitance limits, so that the clock network meets its design constraints. Clock skew affects both setup and hold timing, and it is usually divided into local clock skew and global clock skew.
There are three common types of CTS:
2.1. Single-point CTS
A single-point clock tree starts from a single clock source and delivers the clock signal to all sequential elements in a tree structure. This method is easy to implement and is appropriate for low-frequency or multi-clock designs. However, it is less suitable for high-frequency or large-scale designs, because path asymmetry can lead to larger clock skew.
2.2. Clock mesh
A clock mesh distributes the clock signal through a grid-like structure, providing better clock balance and lower skew, which suits high-frequency designs. However, a clock mesh incurs higher power and area overhead and increases design complexity.
2.3. Multi-source CTS
A multi-source clock tree combines the advantages of single-point trees and clock meshes. The design is partitioned into multiple regions, each with its own local clock source. This clock tree achieves low skew while reducing power and area consumption, making it well suited to large-scale designs.